[MTCGA]: new failures in builds [6104243] needs to be handled

2021-07-27 Thread dpavlov . tasks
Hi Igniters,

 I've detected a new issue on TeamCity that needs to be handled. You are more
than welcome to help.

 If your changes could have led to this failure: we're grateful that you
volunteered to contribute to this project, but things change and you may no
longer be able to finalize your contribution.
 Could you respond to this email and indicate whether you wish to continue and
fix the test failures, or step down so that a committer may revert your commit?

 *New test failure in master-nightly 
IgniteWalReaderTest.testIteratorWithCurrentKernelContext 
https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-4745133272315373633=%3Cdefault%3E=testDetails
 Changes that may have led to the failure were made by
 - maxim muzafarov  
https://ci.ignite.apache.org/viewModification.html?modId=931564
 - pavel pereslegin  
https://ci.ignite.apache.org/viewModification.html?modId=931553

 - Here's a reminder of what contributors agreed to do
https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute 
 - Should you have any questions please contact dev@ignite.apache.org 

Best Regards,
Apache Ignite TeamCity Bot 
https://github.com/apache/ignite-teamcity-bot
Notification generated at 02:44:28 28-07-2021 


[MTCGA]: new failures in builds [6104242, 6104211] needs to be handled

2021-07-27 Thread dpavlov . tasks
Hi Igniters,

 I've detected a new issue on TeamCity that needs to be handled. You are more
than welcome to help.

 If your changes could have led to this failure: we're grateful that you
volunteered to contribute to this project, but things change and you may no
longer be able to finalize your contribution.
 Could you respond to this email and indicate whether you wish to continue and
fix the test failures, or step down so that a committer may revert your commit?

 *New test failure in master-nightly 
IgniteWalReaderTest.testIteratorWithCurrentKernelContext 
https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-5971651183868421794=%3Cdefault%3E=testDetails
 Changes that may have led to the failure were made by
 - maxim muzafarov  
https://ci.ignite.apache.org/viewModification.html?modId=931564
 - pavel pereslegin  
https://ci.ignite.apache.org/viewModification.html?modId=931553

 *New test failure in master 
IgniteWalReaderTest.testIteratorWithCurrentKernelContext 
https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=6086936352958536960=%3Cdefault%3E=testDetails
 Changes that may have led to the failure were made by
 - maxim muzafarov  
https://ci.ignite.apache.org/viewModification.html?modId=931564
 - pavel pereslegin  
https://ci.ignite.apache.org/viewModification.html?modId=931553

 - Here's a reminder of what contributors agreed to do
https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute 
 - Should you have any questions please contact dev@ignite.apache.org 

Best Regards,
Apache Ignite TeamCity Bot 
https://github.com/apache/ignite-teamcity-bot
Notification generated at 01:44:28 28-07-2021 


Re: Review Requested -- IGNITE-15077

2021-07-27 Thread Atri Sharma
Hi Ilya,


> Frankly speaking, I do not see the value of having an extra layer of
> indirection around *local* Quartz-based scheduler in Ignite. Can you
> elaborate?

I didn't quite understand that. Are you referring to the
IgniteCombinedSchedulerProcessor?
>
> Our guidelines also recommend having issue description to document the whys
> and hows, and not just issue title.

Sure, I will update the issue with more details.

-- 
Regards,

Atri
Apache Concerted


Re: Review Requested -- IGNITE-15077

2021-07-27 Thread Ilya Kasnacheev
Hello!

Frankly speaking, I do not see the value of having an extra layer of
indirection around *local* Quartz-based scheduler in Ignite. Can you
elaborate?

Our guidelines also recommend having issue description to document the whys
and hows, and not just issue title.

Regards,
-- 
Ilya Kasnacheev


On Tue, Jul 27, 2021 at 18:42, Atri Sharma wrote:

> Hello,
>
> I have raised a PR for the said JIRA:
>
> https://github.com/apache/ignite/pull/9277
>
> TC is green with the PR:
>
>
> https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_BuildApacheIgnite/6105253
>
> Please help in reviewing.
>
> Regards,
>
> Atri
>


Re: Text Queries Support

2021-07-27 Thread Atri Sharma
Andrey,

> Per-partition Lucene index looks simple to implement, but it may require
> per-partition SQL to make full-text search expressions work correctly
> within the SQL query.
I think that as long as we follow the map - reduce process that we
already do for other queries, we should be fine.

> Per-partition SQL index may kill the performance. We already tried to do
> that in Ignite 2. However, QueryParallelism feature helps to speed up some
> data-intensive queries,
> but hits the performance in simple cases, and at some point (e.g. segments
> > number of CPU) the performance rapidly degrades with the increasing
> number of segments.

Yeah, that is always the case, but a global index will be a nightmare
in terms of concurrency, and pessimistic concurrency control will
kill the benefits anyway, coupled with the metadata requirements.
What were the specific issues with the per-partition index?
>
> AFAIK, Lucene widely used bitmap indices that are easy to merge.
> Maybe, the map-reduce technique underneath FTS expressions and some hacks
> will add a minimal overhead.

Lucene uses many types of indices but the aspect here is that per
partition Lucene indices can return docIDs and we can merge them in
reduce phase. So we are abstracted out from specifics of the internal
index being used to serve the query.
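
A minimal sketch of that reduce step, assuming each partition returns
(docID, score) pairs and that docIDs are globally unique. The function name
and the tuple layout are hypothetical and are not an Ignite or Lucene API:

```python
import heapq
from typing import Iterable, List, Tuple

Hit = Tuple[str, float]  # (globally unique docID, relevance score) - assumed layout


def merge_partition_hits(partials: Iterable[List[Hit]], k: int) -> List[Hit]:
    """Merge per-partition hit lists into one top-k list, dropping duplicate docIDs."""
    best = {}  # docID -> highest score seen (duplicates may come from backup copies)
    for hits in partials:
        for doc_id, score in hits:
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    # Highest score first; ties are broken by docID so the order is deterministic.
    return heapq.nsmallest(k, best.items(), key=lambda h: (-h[1], h[0]))
```

The reducer only sees docIDs and scores, so it does not depend on which
index structure each partition used internally.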

>
> > As illustrated by Ilya, we can use Ignite's WAL records to rebuild
> > Lucene indices. The important thing here is to not treat Lucene
> > indices as source of truth.
> To use WAL we either should relay Lucene files to our Page memory or be
> aware of Lucene files structure.
> The first looks tricky, as we should guarantee a contiguous address space
> in Page memory for reflecting Lucene file. Maybe separate managed memory
> segment with its own rules?

Why not use Lucene's MMapDirectory and map it to our storage classes?

>
> >> Transactions.
> >> * Will we support transactions?
> > Lucene has no concept of transactions.
> Yes, but we have.
> Lucene index may be non-transactional, but users never expect to see
> uncommitted data.
> How does this connect with transactional SQL?
We could have the Lucene writes done as a part of transactions and ack
back only when it succeeds/fails. WDYT?
>
> On Tue, Jul 27, 2021 at 1:36 PM Atri Sharma  wrote:
>
> > Sorry, I planned on creating a Wiki page for this, but it makes more
> > sense to be replying here.
> >
> > > * How Lucene index can be split among the nodes?
> >
> > We can have partition level indices on each node.
> >
> > > * If we'll have a single index for all partitions on the particular node,
> > > then how index records will be aware of partitioning?
> >
> > Index records don't need to be aware of partitioning -- each Lucene
> > index is independent.
> >
> > > This is important to filter out backup records from the results to avoid
> > > duplicates.
> >
> > We can merge documents from different nodes and remove duplicates as
> > long as docIDs are globally unique.
> >
> > > * How results from several nodes can be merged on the Reduce stage?
> >
> > As long as documents have a globally unique docID, Lucene has merge
> > functions that can merge results from multiple partial results.
> >
> > > * Does Lucene supports smth like JOIN operation or others that may
> > require
> > > data from another partition or index?
> >
> > As illustrated by Ilya, Block-Join works for us.
> >
> > > If so, then it likes to multistep query with merging results on
> > > intermediate stages and requires detailed investigation and design.
> > > It is ok if Ignite will have some limitations here, but we would like to
> > > know about them at the early stage.
> >
> > > * How effectively map Lucene files to the page memory? Is it even
> > possible?
> >
> > Lucene has Directory implementations which allow storing Lucene
> > indices on different kinds of file structures. It has an
> > MMapDirectory that we could use?
> >
> > > Otherwise, how to deal with potential OOM on large queries and memory
> > > capacity planning?
> >
> > We can use Lucene's MMapped directory.
> >
> > >
> > > Persistence.
> > > * How and what consistency guarantees could we have/expect?
> >
> > Lucene does not have WAL logs but is append only
> >
> > > Seems, we may not be able to write physical records for Lucene index to
> > our
> > > WAL. What can we do with this?
> >
> > As illustrated by Ilya, we can use Ignite's WAL records to rebuild
> > Lucene indices. The important thing here is to not treat Lucene
> > indices as source of truth.
> > >
> > > Transactions.
> > > * Will we support transactions?
> > Lucene has no concept of transactions.
> >
> > > * Should Lucene be aware of Transaction and track mvcc (or whatever)
> > > versions for the records?
> > No
> > > * What will be consistency guarantees?
> > We can acknowledge writes back only after Lucene index is updated.
> > >
> > > UX
> > > * How to add FullText search queries syntax into Calcite?
> > Postgres's FTS functions are a good 

Review Requested -- IGNITE-15077

2021-07-27 Thread Atri Sharma
Hello,

I have raised a PR for the said JIRA:

https://github.com/apache/ignite/pull/9277

TC is green with the PR:

https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_BuildApacheIgnite/6105253

Please help in reviewing.

Regards,

Atri


[ANNOUNCE] Apache IGNITE python thin client (pyignite) 0.5.1 released

2021-07-27 Thread Ivan Daschinsky
The Apache Ignite Community is pleased to announce the release of
Apache IGNITE python thin client (pyignite) 0.5.1.

This new release of python thin client contains bugfixes and a few new
features
Namely:
1. Event listeners to connection events and query events
2. Optional client's side handshake timeout
3. Logging of connection and queries events

For the full list of changes, you can refer to the RELEASE_NOTES.
https://ignite.apache.org/releases/pyignite/0.5.1/release_notes.html

You can install this version using pip
>> pip install pyignite==0.5.1

Alternatively, you can download sources and binary packages (wheels) from
here:
https://dist.apache.org/repos/dist/release/ignite/pyignite/0.5.1/

Please let us know if you have any problems
https://ignite.apache.org/community/resources.html#ask

Regards,
Ivan Daschinsky on behalf of the Apache Ignite community.


Re: [DISCUSSION] Documentation feedback button

2021-07-27 Thread Nikita Safonov
Hello Igniters,

I would like to proceed with implementing the feedback service for the
Ignite documentation website.
If nobody objects, let's start the work.
The details are described here:
https://issues.apache.org/jira/browse/IGNITE-15198

With best regards,
Nikita

On Wed, Mar 17, 2021 at 15:50, Mauricio Stekl wrote:

> Hi,
> I agree with Nikita, it would be a very simple way of getting feedback to
> improve the documentation.
> This tool in particular is quite easy to integrate into the online docs
> template.
>
> Best,
> Mauricio
>
>
>
>
>
> On Tue, Mar 16, 2021 at 4:46 PM Denis Magda  wrote:
>
> > Nikita, thanks for starting the conversation. I'll just cast my vote for
> > the bugyard.io for its ability to select and comment on a specific
> > problematic point on a documentation page (a paragraph, sentence,
> picture,
> > code snippet, etc.). It makes it trivial to share feedback and then it's
> > easy to process it on the other side. Those who'd like to experience it,
> > click the "Documentation Feedback" button on the GridGain docs:
> > https://www.gridgain.com/docs/latest/getting-started/what-is-gridgain
> >
> > -
> > Denis
> >
> >
> > On Tue, Mar 16, 2021 at 3:14 PM Никита Сафонов <
> vlasovpavel2...@gmail.com>
> > wrote:
> >
> > > Hello Igniters,
> > >
> > > I would like to propose an enhancement that can significantly improve
> the
> > > quality of our documentation.
> > >
> > > I suggest adding a feedback button (let’s call it a «Documentation
> > > feedback» button) to the web site documentation pages so everyone could
> > > leave a comment to what is already published.
> > >
> > > The feedback may refer to the Ignite documentation in general or to a
> > > particular section, paragraph, or even term or value.
> > >
> > > I do believe that we would harvest dozens and hundreds of ideas,
> > > suggestions, and observations.
> > >
> > > I would also suggest using bugyard.io as a solid service for gathering
> > > feedback (I’m quite familiar with it, it is easy to implement, use, and
> > > maintain).
> > >
> > > So folks, what do you think of this?
> > >
> > > With best regards,
> > > Nikita
> > >
> >
>


Re: [ANNOUNCE] Apache Ignite spring-data-all-ext extensions 1.0.0 released

2021-07-27 Thread Denis Magda
Nikita, thanks for releasing the artifacts.

Could you please update the following documentation section as well?
https://ignite.apache.org/docs/latest/extensions-and-integrations/spring/spring-data#maven-configuration

This is what worked for me:

1. Ignite Spring Data dependency


<dependency>
   <groupId>org.apache.ignite</groupId>
   <artifactId>ignite-spring-data-2.2-ext</artifactId>
   <version>1.0.0</version>
</dependency>


2. Also, I had to provide this dependency to solve various runtime-related
issues (didn't need to do this with previous version of the integration):


<dependency>
   <groupId>org.springframework.data</groupId>
   <artifactId>spring-data-commons</artifactId>
   <version>2.2.3.RELEASE</version>
</dependency>
*(this version is hardcoded, needs to be generic in the docs)*


--
Denis



On Sat, Jul 24, 2021 at 7:46 PM Nikita Amelchev 
wrote:

> Hi,
>
> Artifacts were uploaded to the maven repo [1, 2, 3, 4].
>
> The spring-tx module was migrated in 2.11, which has not been released yet.
> The extension release is independent of the Ignite release.
>
> [1]
> https://mvnrepository.com/artifact/org.apache.ignite/ignite-spring-data-ext
> [2]
>
> https://mvnrepository.com/artifact/org.apache.ignite/ignite-spring-data-2.0-ext
> [3]
>
> https://mvnrepository.com/artifact/org.apache.ignite/ignite-spring-data-2.2-ext
> [4]
>
> https://mvnrepository.com/artifact/org.apache.ignite/ignite-spring-data-commons
>
>
> On Sat, Jul 24, 2021, 04:48 18624049226 <18624049...@163.com> wrote:
>
> > Igniters,
> >
> > It seems that these artifacts have not been uploaded to Maven Repo? Will
> > it be uploaded with ignite 2.11?
> >
> > This release does not include spring-tx-ext, but the documentation of
> > this artifact has been merged into the ignite-2.11 branch, What's wrong?
> >
> > On 2021/7/7 at 1:49 PM, Nikita Amelchev wrote:
> > > The Apache Ignite Community is pleased to announce the release of
> > > Apache Ignite Spring Data extensions 1.0.0.
> > >
> > > The following integrations were migrated to the Apache Ignite
> > > Extensions repository:
> > >
> > > - Spring Data 2.2 extension.
> > > - Spring Data 2.0 extension.
> > > - Spring Data extension.
> > >
> > > Release notes:
> > >
> >
> https://ignite.apache.org/releases/ext/spring-data-all-1.0.0/release_notes.html
> > >
> > > The sources package is available here:
> > >
> >
> https://downloads.apache.org/ignite/ignite-extensions/ignite-spring-data-all-ext/1.0.0/
> > >
> > > Please let us know if you have any problems
> > > https://ignite.apache.org/community/resources.html#ask
> > >
> >
>


[RESULT] [VOTE] Release pyignite 0.5.1-rc0

2021-07-27 Thread Ivan Daschinsky
Hello, Igniters!

Release pyignite-0.5.1-rc0 has been accepted.

The votes received:
3 "+1" binding votes
1 "+1" votes

There are no "+0" or "-1" votes.

Here are the votes received:
- Igor Sapego (binding) +1
- Pavel Tupitsyn (binding) +1
- Nickolay Izhikov (binding) +1
- Ivan Daschinsky +1
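
For reference, a small sketch of how the tally above can be checked. The
acceptance rule coded here (at least three binding "+1" votes and more binding
"+1" than "-1") is an assumption based on the usual ASF release voting
guidelines; the helper names are hypothetical:

```python
from collections import Counter

# Votes as listed above: (voter, binding?, vote)
votes = [
    ("Igor Sapego", True, "+1"),
    ("Pavel Tupitsyn", True, "+1"),
    ("Nickolay Izhikov", True, "+1"),
    ("Ivan Daschinsky", False, "+1"),
]


def tally(votes):
    """Count votes per (vote, binding) pair."""
    return Counter((vote, binding) for _, binding, vote in votes)


def accepted(votes, min_binding=3):
    """Assumed rule: enough binding +1 votes, and more binding +1 than -1."""
    counts = tally(votes)
    plus, minus = counts[("+1", True)], counts[("-1", True)]
    return plus >= min_binding and plus > minus
```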


Re: Text Queries Support

2021-07-27 Thread Andrey Mashenkov
Atri,

> We can have partition level indices on each node.
Per-partition Lucene index looks simple to implement, but it may require
per-partition SQL to make full-text search expressions work correctly
within the SQL query.
Per-partition SQL index may kill the performance. We already tried to do
that in Ignite 2. The QueryParallelism feature helps to speed up some
data-intensive queries, but hurts performance in simple cases, and at some
point (e.g. when the number of segments exceeds the number of CPUs) the
performance rapidly degrades as the number of segments increases.

AFAIK, Lucene widely uses bitmap indices that are easy to merge.
Maybe the map-reduce technique underneath FTS expressions and some hacks
will add only a minimal overhead.


> We can use Lucene's MMapped directory.
> Lucene does not have WAL logs but is append only.
Yes.
Also, a per-partition index will require at least one file (or more?). The OS
usually has tight limits on the number of open file descriptors.
'Append only' sounds good, that might be helpful.
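
To make the file descriptor concern concrete, here is a rough sketch that
reads the per-process limit and estimates how many per-partition indexes could
be open at once. It is Unix-only, and both the `reserved` allowance and the
files-per-index figure are assumptions that would have to be measured:

```python
import resource


def open_file_limit() -> int:
    """Return the soft limit on open file descriptors for this process (Unix only)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def max_open_indexes(fd_limit: int, files_per_index: int, reserved: int = 256) -> int:
    """Rough bound on how many per-partition indexes can be open at once.

    `reserved` is an assumed allowance for sockets, WAL segments, and other files.
    """
    if files_per_index < 1:
        raise ValueError("files_per_index must be at least 1")
    return max(0, (fd_limit - reserved) // files_per_index)
```

For example, `max_open_indexes(open_file_limit(), files_per_index=3)` gives an
estimate for the current process.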


> As illustrated by Ilya, we can use Ignite's WAL records to rebuild
> Lucene indices. The important thing here is to not treat Lucene
> indices as source of truth.
To use WAL we either should relay Lucene files to our Page memory or be
aware of Lucene files structure.
The first looks tricky, as we should guarantee a contiguous address space
in Page memory for reflecting Lucene file. Maybe separate managed memory
segment with its own rules?
The second looks like overkill.

>> Transactions.
>> * Will we support transactions?
> Lucene has no concept of transactions.
Yes, but we do.
The Lucene index may be non-transactional, but users never expect to see
uncommitted data.
How does this connect with transactional SQL?



On Tue, Jul 27, 2021 at 1:36 PM Atri Sharma  wrote:

> Sorry, I planned on creating a Wiki page for this, but it makes more
> sense to be replying here.
>
> > * How Lucene index can be split among the nodes?
>
> We can have partition level indices on each node.
>
> > * If we'll have a single index for all partitions on the particular node,
> > then how index records will be aware of partitioning?
>
> Index records don't need to be aware of partitioning -- each Lucene
> index is independent.
>
> > This is important to filter out backup records from the results to avoid
> > duplicates.
>
> We can merge documents from different nodes and remove duplicates as
> long as docIDs are globally unique.
>
> > * How results from several nodes can be merged on the Reduce stage?
>
> As long as documents have a globally unique docID, Lucene has merge
> functions that can merge results from multiple partial results.
>
> > * Does Lucene supports smth like JOIN operation or others that may
> require
> > data from another partition or index?
>
> As illustrated by Ilya, Block-Join works for us.
>
> > If so, then it likes to multistep query with merging results on
> > intermediate stages and requires detailed investigation and design.
> > It is ok if Ignite will have some limitations here, but we would like to
> > know about them at the early stage.
>
> > * How effectively map Lucene files to the page memory? Is it even
> possible?
>
> Lucene has Directory implementations which allow storing Lucene
> indices on different kinds of file structures. It has an
> MMapDirectory that we could use?
>
> > Otherwise, how to deal with potential OOM on large queries and memory
> > capacity planning?
>
> We can use Lucene's MMapped directory.
>
> >
> > Persistence.
> > * How and what consistency guarantees could we have/expect?
>
> Lucene does not have WAL logs but is append only
>
> > Seems, we may not be able to write physical records for Lucene index to
> our
> > WAL. What can we do with this?
>
> As illustrated by Ilya, we can use Ignite's WAL records to rebuild
> Lucene indices. The important thing here is to not treat Lucene
> indices as source of truth.
> >
> > Transactions.
> > * Will we support transactions?
> Lucene has no concept of transactions.
>
> > * Should Lucene be aware of Transaction and track mvcc (or whatever)
> > versions for the records?
> No
> > * What will be consistency guarantees?
> We can acknowledge writes back only after Lucene index is updated.
> >
> > UX
> > * How to add FullText search queries syntax into Calcite?
> Postgres's FTS functions are a good reference.
> > * AFAIK, the Lucene index has many properties for tuning. How will the
> user
> > configure the index?
> Most of those properties can be cluster level and exposed as a new sub
> config for ignite.
> > * How and where to store the settings? What are cluster-wide and what a
> > local to the particular node?
> All can be cluster level.
> > * Will be all the settings immutable? Can be they changed on-fly? after
> > node/grid restart?
> They should be applied post restart.
>
> > * Any limitations on query syntax?
> It depends on how we model our queries for text search.
>
> >
> > SQL
> > * Will we support FullText search in SQL?
> We 

Re: Text Queries Support

2021-07-27 Thread Atri Sharma
Sorry, I planned on creating a Wiki page for this, but it makes more
sense to be replying here.

> * How Lucene index can be split among the nodes?

We can have partition level indices on each node.

> * If we'll have a single index for all partitions on the particular node,
> then how index records will be aware of partitioning?

Index records don't need to be aware of partitioning -- each Lucene
index is independent.

> This is important to filter out backup records from the results to avoid
> duplicates.

We can merge documents from different nodes and remove duplicates as
long as docIDs are globally unique.

> * How results from several nodes can be merged on the Reduce stage?

As long as documents have a globally unique docID, Lucene has merge
functions that can merge results from multiple partial results.

> * Does Lucene supports smth like JOIN operation or others that may require
> data from another partition or index?

As illustrated by Ilya, Block-Join works for us.

> If so, then it likes to multistep query with merging results on
> intermediate stages and requires detailed investigation and design.
> It is ok if Ignite will have some limitations here, but we would like to
> know about them at the early stage.

> * How effectively map Lucene files to the page memory? Is it even possible?

Lucene has Directory implementations which allow storing Lucene
indices on different kinds of file structures. It has an
MMapDirectory that we could use?

> Otherwise, how to deal with potential OOM on large queries and memory
> capacity planning?

We can use Lucene's MMapped directory.

>
> Persistence.
> * How and what consistency guarantees could we have/expect?

Lucene does not have WAL logs but is append only

> Seems, we may not be able to write physical records for Lucene index to our
> WAL. What can we do with this?

As illustrated by Ilya, we can use Ignite's WAL records to rebuild
Lucene indices. The important thing here is to not treat Lucene
indices as source of truth.
>
> Transactions.
> * Will we support transactions?
Lucene has no concept of transactions.

> * Should Lucene be aware of Transaction and track mvcc (or whatever)
> versions for the records?
No
> * What will be consistency guarantees?
We can acknowledge writes back only after Lucene index is updated.
>
> UX
> * How to add FullText search queries syntax into Calcite?
Postgres's FTS functions are a good reference.
> * AFAIK, the Lucene index has many properties for tuning. How will the user
> configure the index?
Most of those properties can be cluster level and exposed as a new sub
config for ignite.
> * How and where to store the settings? What are cluster-wide and what a
> local to the particular node?
All can be cluster level.
> * Will be all the settings immutable? Can be they changed on-fly? after
> node/grid restart?
They should be applied post restart.

> * Any limitations on query syntax?
It depends on how we model our queries for text search.

>
> SQL
> * Will we support FullText search in SQL?
We need custom functions for it. See Postgres's FTS functions.
> * How to integrate Lucene index into Calcite? What is the cost model?
There cannot be any cost model since there are no paths for a text
query. If we see a text query, we have to use Lucene index or return
an error. In this way, we need to model text search as a set of UDFs

> Splitting rules? Traits?
Please see my reply above.
>
>
> With all of this, you can go with the IEP (or even some short summary) and
> further POC and implementation.
> That's a big deal, so let's discuss what could be done here.
>
> On Fri, Jul 23, 2021 at 12:58 PM Atri Sharma  wrote:
>
> > I am actually happy to drive the feature for Ignite 3. FTS is very
> > important for me and I think Ignite users will benefit from it
> > greatly.
> >
> > If it makes sense to be focusing on Ignite 3 for this capability, I am
> > eager to contribute there and lead the development.
> >
> > Please share your thoughts.
> >
> > On Fri, Jul 23, 2021 at 3:21 PM Andrey Mashenkov
> >  wrote:
> > >
> > > Hi Atri,
> > >
> > > All the Jira tickets we have on the Full-text search (FTS) thing are
> > > targeted to Ignite 2.
> > >
> > > AFAIK, we want, but we have NOT committed to FTS support in Ignite 3,
> > yet.
> > > By the way, we are getting requests for this thing from the user side,
> > and
> > > definitely,
> > > FTS would be a valuable feature for Ignite.
> > >
> > > It will be great if the one wants to drive it, any help will be
> > appreciated.
> > >
> > >
> > > On Fri, Jul 23, 2021 at 12:12 PM Atri Sharma  wrote:
> > >
> > > > Hello,
> > > >
> > > > An update, please. I am working through persistence of Lucene index
> > using
> > > > Ignite Dictionary, and will be asking some questions soon.
> > > >
> > > > I had one doubt - - where does this change go? Ignite 3?
> > > >
> > > > Also, I know we want to build native support for text searches in
> > Ignite 3.
> > > > Is the work I am proposing here part of 

Re: Text Queries Support

2021-07-27 Thread Ilya Kasnacheev
Hello!

Let me try to answer the questions below, since I did not see anybody do
that and thus not everybody may be on the same page.

Regards,

On Fri, Jul 23, 2021 at 13:56, Andrey Mashenkov wrote:

> Atri,
>
> As for now, the potential capabilities are not clear to me.
> At first glance, I see the next topics that must be covered at first:
>
> General questions
> * How Lucene index can be split among the nodes?
>
In the same fashion as SQL indexes - each node might only hold index for
its primary partitions.


> * If we'll have a single index for all partitions on the particular node,
> then how index records will be aware of partitioning?
>
I'm not sure; how does our SQL deal with it? If there is a scenario where
some keys are no longer primary, we can perhaps filter them out and in the
meantime exclude them from the index.


> This is important to filter out backup records from the results to avoid
> duplicates.
> * How results from several nodes can be merged on the Reduce stage?
>
It is actually the primary use case for Lucene/Solr, usually they are
merged by relevance/score.


> * Does Lucene supports smth like JOIN operation or others that may require
> data from another partition or index?
> If so, then it likes to multistep query with merging results on
> intermediate stages and requires detailed investigation and design.
> It is ok if Ignite will have some limitations here, but we would like to
> know about them at the early stage.
>
Lucene has block-join, which allows it to store related data close together.
Lucene also has a regular join, but I don't see any use case for it since we
can do a SQL join as well.



> * How effectively map Lucene files to the page memory? Is it even possible?
> Otherwise, how to deal with potential OOM on large queries and memory
> capacity planning?
>
I think it's pretty good here; it's a must for information retrieval
since there's usually a lot of it.


>
> Persistence.
> * How and what consistency guarantees could we have/expect?
> Seems, we may not be able to write physical records for Lucene index to our
> WAL. What can we do with this?
>
I think we should be able to do it in the same fashion as we do with SQL
indexes: during WAL recovery, also update the Lucene index. On clean
shutdown, assume that it is okay. If the Lucene index is removed, then do a
full rebuild, like we do with index.bin.


>
> Transactions.
> * Will we support transactions?
> * Should Lucene be aware of Transaction and track mvcc (or whatever)
> versions for the records?
> * What will be consistency guarantees?
>
I think the answer here is NO. Text search is not expected to be
transactionally up-to-date. It is expected to be eventually full. So it is
OK if it takes a split-second for entries to become searchable.

The traditional way to update text indexes is batching.
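
A minimal sketch of such batching, assuming updates arrive as (key, text)
pairs. The `flush` callback stands in for whatever actually writes a batch to
the text index and is hypothetical:

```python
from typing import Callable, List, Tuple

Update = Tuple[str, str]  # (key, text) - assumed shape of an index update


class BatchingIndexer:
    """Buffers updates and applies them to the text index in batches."""

    def __init__(self, flush: Callable[[List[Update]], None], batch_size: int = 100):
        self._flush = flush  # hypothetical callback that writes one batch to the index
        self._batch_size = batch_size
        self._pending: List[Update] = []

    def add(self, key: str, text: str) -> None:
        """Queue one update; write the batch once it is full."""
        self._pending.append((key, text))
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Write all pending updates, e.g. on a timer or at shutdown."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._flush(batch)
```

Entries become searchable only after their batch is flushed, which is the
short delay mentioned above.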


>
> UX
> * How to add FullText search queries syntax into Calcite?
> * AFAIK, the Lucene index has many properties for tuning. How will the user
> configure the index?
> * How and where to store the settings? What are cluster-wide and what a
> local to the particular node?
> * Will be all the settings immutable? Can be they changed on-fly? after
> node/grid restart?
> * Any limitations on query syntax?
>
Solr and Elasticsearch spent a lot of time on this, and the field is huge
here. They have really extensive query languages. On the bright side, most
of the "settings" are dynamic and cached, so if you need a different
filtering of your data, all you need is to request it once. The ones that
aren't usually concern how data is prepared before being put into the index
(stemming, tokenizing, etc.); changing them will require an index rebuild. I
don't see why settings would not be shared across the cluster.


> SQL
> * Will we support FullText search in SQL?
> * How to integrate Lucene index into Calcite? What is the cost model?
> Splitting rules? Traits?
> * What about consistency with DDL operations, e.g. column rename?
> Ignite indices will operate column ID, so rename operation will not affect
> the index.
>

Regards,

-- 



>
>
> With all of this, you can go with the IEP (or even some short summary) and
> further POC and implementation.
> That's a big deal, so let's discuss what could be done here.
>
> On Fri, Jul 23, 2021 at 12:58 PM Atri Sharma  wrote:
>
> > I am actually happy to drive the feature for Ignite 3. FTS is very
> > important for me and I think Ignite users will benefit from it
> > greatly.
> >
> > If it makes sense to be focusing on Ignite 3 for this capability, I am
> > eager to contribute there and lead the development.
> >
> > Please share your thoughts.
> >
> > On Fri, Jul 23, 2021 at 3:21 PM Andrey Mashenkov
> >  wrote:
> > >
> > > Hi Atri,
> > >
> > > All the Jira tickets we have on the Full-text search (FTS) thing are
> > > targeted to Ignite 2.
> > >
> > > AFAIK, we want, but we have NOT committed to FTS support in Ignite 3,
> > yet.
> > > By the way, we are getting requests for this thing from the user side,
> > and
> > > definitely,
> > > 

Re: [VOTE] Release pyignite 0.5.1-rc0

2021-07-27 Thread Nikolay Izhikov
+1 (binding)

> On July 26, 2021, at 21:11, Pavel Tupitsyn wrote:
> 
> +1
> 
> On Mon, Jul 26, 2021 at 11:36 AM Igor Sapego  wrote:
> 
>> +1 from me
>> 
>> Best Regards,
>> Igor
>> 
>> 
>> On Fri, Jul 23, 2021 at 3:32 PM Ivan Daschinsky 
>> wrote:
>> 
>>> +1 From me
>>> 1. Checked binary packages, c module and examples on windows 10 amd64 for
>>> pythons 3.6, 3.7, 3.8, 3.9
>>> 2. Checked binary packages, c module and examples on ubuntu 20.04 amd64
>> for
>>> pythons 3.6, 3.7, 3.8, 3.9
>>> 3. Checked source installation and building binary packages on ubuntu
>> 20.04
>>> amd 64 for pythons 3.6, 3.7, 3.8, 3.9
>>> 4. Checked documentation on
>>> https://apache-ignite-binary-protocol-client.readthedocs.io/en/0.5.1.rc0
>>> 5. Checked sha512 checksums and gpg signatures (signed by Igor Sapego
>> (CODE
>>> SIGNING KEY)  5C10 A072 2D94 7727 923C  98B5 AF35
>> DBD9
>>> 58FE 8DC5)
>>> key is inside https://downloads.apache.org/ignite/KEYS)
>>> 
>>> On Fri, Jul 23, 2021 at 13:52, Ivan Daschinsky wrote:
>>> 
 The voting finishes at 07/27/2021 12:00 UTC
 
 On Fri, Jul 23, 2021 at 13:49, Ivan Daschinsky wrote:
 
> Dear Igniters!
> 
> Release candidate binaries for subj are uploaded and ready for vote
> You can find them here:
> https://dist.apache.org/repos/dist/dev/ignite/pyignite/0.5.1-rc0
> 
> If you follow the link above, you will find source packages (*.tar.gz
>>> and
> *.zip)
> and binary packages (wheels) for windows (amd64) and linux (x86_64)
> for pythons 36, 37, 38, 39. Also, there are sha512 and gpg signatures.
> Code signing keys can be found here --
> https://downloads.apache.org/ignite/KEYS
> Here you can find instructions how to verify packages
> https://www.apache.org/info/verification.html
> 
> You can install binary package for specific version of python using
>> pip
> For example do this on linux for python 3.8
>>> pip install pyignite-0.5.1-cp38-cp38-manylinux1_x86_64.whl
> 
> You can build and install package from source using this command:
>>> pip install pyignite-0.5.1.tar.gz
> You can build wheel on your platform using this command:
>>> pip wheel --no-deps pyignite-0.5.1.tar.gz
> 
> For building C module, you should have python headers and C compiler
> installed.
> (i.e. for ubuntu sudo apt install build-essential python3-dev)
> In Mac OS X xcode-tools and python from homebrew are the best option.
> 
> In order to check whether C module works, use following:
>>> from pyignite import _cutils
>>> print(_cutils.hashcode('test'))
>>> 3556498
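
For comparison, a pure-Python sketch of a Java-style String.hashCode. It
assumes the C module's hashcode follows the same scheme for strings, which the
expected value 3556498 for 'test' is consistent with; the function name is
hypothetical:

```python
import struct


def java_string_hashcode(text: str) -> int:
    """Java-style String.hashCode: h = 31*h + UTF-16 code unit, as a signed 32-bit int."""
    h = 0
    for (unit,) in struct.iter_unpack(">H", text.encode("utf-16-be")):
        h = (31 * h + unit) & 0xFFFFFFFF  # keep only the low 32 bits
    # Reinterpret the unsigned 32-bit value as signed, as Java does.
    return h - 0x100000000 if h >= 0x80000000 else h
```

If the two values differ for the same input, the C extension is probably not
the code path being used.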
> 
> You can find documentation here:
> 
>>> https://apache-ignite-binary-protocol-client.readthedocs.io/en/0.5.1.rc0
> 
> You can find examples here (to check them, you should start ignite
> locally):
> 
> 
>>> 
>> https://apache-ignite-binary-protocol-client.readthedocs.io/en/0.5.1.rc0/examples.html
> Also, examples can be found in source archive in examples subfolder.
> docker-compose.yml is supplied in order to start ignite quickly. (Use
> `docker-compose up -d` to start 3 nodes cluster and `docker-compose
> down` to shut down it)
> 
> Release notes:
> 
> 
>>> 
>> https://gitbox.apache.org/repos/asf?p=ignite-python-thin-client.git;a=blob;f=RELEASE_NOTES.txt;h=c6cbd419684cd4a97485707471bac84957b42891;hb=b48dd5dec37064b458031358c394789d15a756fc
> 
> Git release tag was created:
> 
> 
>>> 
>> https://gitbox.apache.org/repos/asf?p=ignite-python-thin-client.git;a=tag;h=refs/tags/0.5.1.rc0
> 
> The vote is formal, see voting guidelines
> https://www.apache.org/foundation/voting.html
> 
> +1 - to accept pyignite-0.5.1-rc0
> 0 - don't care either way
> -1 - DO NOT accept pyignite-0.5.1-rc0
> 
> 
 
 --
 Sincerely yours, Ivan Daschinskiy
 
>>> 
>>