[jira] [Created] (IGNITE-10937) Support data page scan for JDBC/ODBC

2019-01-14 Thread Sergi Vladykin (JIRA)
Sergi Vladykin created IGNITE-10937:
---

 Summary: Support data page scan for JDBC/ODBC
 Key: IGNITE-10937
 URL: https://issues.apache.org/jira/browse/IGNITE-10937
 Project: Ignite
  Issue Type: Improvement
Reporter: Sergi Vladykin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-10798) Data page scan for ScanQuery, SqlQuery and SqlFieldsQuery

2018-12-24 Thread Sergi Vladykin (JIRA)
Sergi Vladykin created IGNITE-10798:
---

 Summary: Data page scan for ScanQuery, SqlQuery and SqlFieldsQuery
 Key: IGNITE-10798
 URL: https://issues.apache.org/jira/browse/IGNITE-10798
 Project: Ignite
  Issue Type: Improvement
Reporter: Sergi Vladykin
Assignee: Sergi Vladykin








[jira] [Created] (IGNITE-10431) Make tests independent of memory size

2018-11-27 Thread Sergi Vladykin (JIRA)
Sergi Vladykin created IGNITE-10431:
---

 Summary: Make tests independent of memory size
 Key: IGNITE-10431
 URL: https://issues.apache.org/jira/browse/IGNITE-10431
 Project: Ignite
  Issue Type: Bug
Reporter: Sergi Vladykin


The following tests were added to the Page Compression suite and they fail because 
the page size there is increased to 8k.
{code:java}
org.apache.ignite.testsuites.IgnitePdsCompressionTestSuite2…g.apache.ignite.internal.processors.cache.persistence.db.wal
 (4)
 IgniteWALTailIsReachedDuringIterationOverArchiveTest.testStandAloneIterator    
 IgniteWalFormatFileFailoverTest.testFailureHandlerTriggered    
 IgniteWalFormatFileFailoverTest.testFailureHandlerTriggeredFsync   
 IgniteWalIteratorExceptionDuringReadTest.test  
org.apache.ignite.testsuites.IgnitePdsCompressionTestSuite2…e.ignite.internal.processors.cache.persistence.db.wal.reader
 (9)
 IgniteWalReaderTest.testCheckBoundsIterator    
 IgniteWalReaderTest.testFillWalAndReadRecords  
 IgniteWalReaderTest.testFillWalForExactSegmentsCount   
 IgniteWalReaderTest.testFillWalWithDifferentTypes  
 IgniteWalReaderTest.testPutAllTxIntoTwoNodes   
 IgniteWalReaderTest.testRemoveOperationPresentedForDataEntry   
 IgniteWalReaderTest.testRemoveOperationPresentedForDataEntryForAtomic  
 IgniteWalReaderTest.testTxFillWalAndExtractDataRecords 
 IgniteWalReaderTest.testTxRecordsReadWoBinaryMeta  
org.apache.ignite.testsuites.IgnitePdsCompressionTestSuite2…ache.ignite.internal.processors.cache.persistence.wal.reader
 (2)
 StandaloneWalRecordsIteratorTest.testCorrectClosingFileDescriptors 
 StandaloneWalRecordsIteratorTest.testStrictBounds 
{code}
 





Re: [IMPORTANT] Future of Binary Objects

2018-11-22 Thread Sergi Vladykin
If we are developing a product for users, we are already guessing what is right
and what is wrong for them. So let's avoid these sophistic statements.

In the end it is always our responsibility to provide a balanced set of
trade-offs between
usability, performance and safety.

Let me repeat: I'm not against any possible type conversions, but I'm
strongly against binary-incompatible ones.
If we always store List.of(1) as 1 and make them binary-interchangeable,
I'm OK with that.
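To make the "binary interchangeable" idea concrete, here is a toy encoder of my own (not Ignite's actual binary format) in which a single-element list is deliberately encoded exactly like its element, so readers can treat the two representations as the same bytes:

```java
import java.util.List;

public class InterchangeableEncoding {
    // Toy rule (hypothetical, not Ignite's format): a single-element list
    // collapses to the encoding of its element, making List.of(x) and x
    // binary-interchangeable. Larger lists get a distinct list encoding.
    static String encode(Object v) {
        if (v instanceof List) {
            List<?> list = (List<?>) v;
            if (list.size() == 1)
                return encode(list.get(0)); // collapse List.of(x) -> x
            StringBuilder sb = new StringBuilder("L[");
            for (Object o : list)
                sb.append(encode(o)).append(';');
            return sb.append(']').toString();
        }
        return "I:" + v; // scalar encoding
    }

    public static void main(String[] args) {
        System.out.println(encode(1));             // I:1
        System.out.println(encode(List.of(1)));    // I:1 - identical encoding
        System.out.println(encode(List.of(1, 2))); // L[I:1;I:2;]
    }
}
```

Under such a rule a reader never has to know whether the writer stored a scalar or a one-element list, which is the compatibility property being asked for.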

Still, as a matter of good practice, I'd suggest looking at what Protobuf
allows and what it doesn't:
https://developers.google.com/protocol-buffers/docs/proto3#updating

Sergi

Thu, 22 Nov 2018 at 11:04, Vladimir Ozerov :

> Sergi,
>
> I think we should not guess for users what is right or wrong for them. It
> is up to the user to decide what is valid. For example, consider a user who
> operates on a list of Integers and, to optimize memory consumption, decides
> to save in the same field either a List<Integer> or a plain Integer when
> only a single element exists. Another example: a kind of data lake or
> data-cleansing application, which may receive the same field in different
> forms, e.g. age as an Integer or a String. Does it work for the user or
> not? We do not know. Will he need to migrate the whole data set? We do not
> know either.
>
> The only place in the product where we care is SQL. But in this case,
> instead of adding checks on the binary level, we should validate data on the
> cache level. In fact, Ignite already works this way: e.g. nullability checks
> are performed on the cache level rather than the binary level. All we need
> is to move all checks from the binary level to the cache level.
>
>
> On Thu, Nov 22, 2018 at 9:41 AM Sergi Vladykin 
> wrote:
>
> > It may be OK to extend compatible field types (like from Int to Long).
> >
> > In Protobuf for example this is allowed just because there is no
> difference
> > between Int and Long in binary format: they all are equally varlen
> encoded
> > and Longs just will occupy up to 9 bytes, while Ints up to 5.
> >
> > But for every other case, where binary representation is type dependent,
> I
> > would be against. This will either require to migrate the whole dataset
> to
> > a new model (which is always risky, since you may need to rollback to
> > previous version of your code) or it will require type checks/conversions
> > for each field access, which is a hard to reason complication and
> possible
> > performance penalty.
> >
> > Sergi
> >
> >
> >
> > Thu, 22 Nov 2018 at 09:23, Vladimir Ozerov :
> >
> > > Denis,
> > >
> > > Several examples:
> > > 1) DEFAULT values - in SQL you may avoid storing default value in the
> > table
> > > and store it in metadata instead. Not applicable for BinaryObject
> because
> > > the same binary object may be saved to two SQL tables with different
> > > defaults
> > > 2) DATE and other temporal types - in SQL you want to store it in
> special
> > > format to be able to extract date parts quickly (typically - 11 bytes).
> > But
> > > in Java and some other languages the best format is a plain long. This
> > > is why we use it in BinaryObject
> > > 3) String charset - in SQL you may choose different charsets for
> > different
> > > tables. E.g. UTF-8 for one, ASCII for another. In BinaryObject we store
> > > everything in UTF-8, and this is fine for most cases, well ... except
> > > for SQL :-)
> > >
> > > The key thing here is that you cannot define a format which will be
> good
> > > for both SQL, and native API. They are very different. This is why I
> > > propose to define additional interface on cache level defining how to
> > store
> > > values, which will be very different from binary objects.
> > >
> > > Vladimir.
> > >
> > > On Thu, Nov 22, 2018 at 3:32 AM Denis Magda  wrote:
> > >
> > > > Vladimir,
> > > >
> > > > Could you educate me a little bit, why the current format is bad for
> > SQL
> > > > and why another one is more suitable?
> > > >
> > > > Also, if we introduce the new format, why would we keep the binary
> > > > one? Is the new format just the next version of the binary one?
> > > >
> > > > 2.3) Remove restrictions on changing field type
> > > > > I do not know why we did that in the first place. This restriction
> > > > prevents
> > > > > type evolution and confuses users.
> > > >
> > > >

Re: [IMPORTANT] Future of Binary Objects

2018-11-21 Thread Sergi Vladykin
It may be OK to extend compatible field types (like from Int to Long).

In Protobuf, for example, this is allowed simply because there is no difference
between Int and Long in the binary format: both are varlen-encoded the same
way, and Longs just occupy up to 9 bytes while Ints occupy up to 5.

But for every other case, where the binary representation is type-dependent, I
would be against it. It would either require migrating the whole dataset to
a new model (which is always risky, since you may need to roll back to a
previous version of your code), or it would require type checks/conversions
on each field access, which is a hard-to-reason-about complication and a
possible performance penalty.
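The interchangeability argument can be sketched with a small toy (my own, not Protobuf's actual code) that computes how many bytes a non-negative value occupies under base-128 varint encoding, where Int and Long share one wire format and only the magnitude of the value determines the size:

```java
public class VarintSize {
    // Number of bytes a non-negative value takes in base-128 varint
    // encoding: 7 payload bits per byte, as used by Protobuf's wire format.
    static int varintLength(long v) {
        int n = 1;
        while ((v >>>= 7) != 0) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // The same small value costs the same number of bytes whether the
        // field is declared as Int or Long.
        System.out.println(varintLength(127));               // 1
        System.out.println(varintLength(128));               // 2
        System.out.println(varintLength(Integer.MAX_VALUE)); // 5
        System.out.println(varintLength(Long.MAX_VALUE));    // 9
    }
}
```

Note that a long with its sign bit set (all 64 bits significant) would take 10 bytes under this scheme; Protobuf's sint types use ZigZag encoding to keep small negative numbers short.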

Sergi



Thu, 22 Nov 2018 at 09:23, Vladimir Ozerov :

> Denis,
>
> Several examples:
> 1) DEFAULT values - in SQL you may avoid storing default value in the table
> and store it in metadata instead. Not applicable for BinaryObject because
> the same binary object may be saved to two SQL tables with different
> defaults
> 2) DATE and other temporal types - in SQL you want to store it in special
> format to be able to extract date parts quickly (typically - 11 bytes). But
> in Java and some other languages the best format is a plain long. This is
> why we use it in BinaryObject
> 3) String charset - in SQL you may choose different charsets for different
> tables. E.g. UTF-8 for one, ASCII for another. In BinaryObject we store
> everything in UTF-8, and this is fine for most cases, well ... except for
> SQL :-)
>
> The key thing here is that you cannot define a format which will be good
> for both SQL, and native API. They are very different. This is why I
> propose to define additional interface on cache level defining how to store
> values, which will be very different from binary objects.
>
> Vladimir.
>
> On Thu, Nov 22, 2018 at 3:32 AM Denis Magda  wrote:
>
> > Vladimir,
> >
> > Could you educate me a little bit, why the current format is bad for SQL
> > and why another one is more suitable?
> >
> > Also, if we introduce the new format, why would we keep the binary one?
> > Is the new format just the next version of the binary one?
> >
> > 2.3) Remove restrictions on changing field type
> > > I do not know why we did that in the first place. This restriction
> > prevents
> > > type evolution and confuses users.
> >
> >
> > That is a hot requirement shared by those who use Ignite SQL in
> production.
> > +1.
> >
> > --
> > Denis
> >
> > On Mon, Nov 19, 2018 at 11:05 PM Vladimir Ozerov 
> > wrote:
> >
> > > Igniters,
> > >
> > > It is very likely that Apache Ignite 3.0 will be released next year. So
> > we
> > > need to start thinking about major product improvements. I'd like to
> > start
> > > with binary objects.
> > >
> > > Currently they are one of the main limiting factors for the product.
> They
> > > are fat - 30+ bytes overhead on average, high TCO of Apache Ignite
> > > compared to other vendors. They are slow - not suitable for SQL at
> all.
> > >
> > > I would like to ask all of you who worked with binary objects to share
> > your
> > > feedback and ideas, so that we understand how they should look in
> AI
> > > 3.0. This is a brainstorm - let's accumulate ideas first and minimize
> > > criticism. Then we will work on the ideas in separate topics.
> > >
> > > 1) Historical background
> > >
> > > BO were implemented around 2014 (Apache Ignite 1.5) when we started
> > working
> > > on .NET and CPP clients. During design we had several ideas in mind:
> > > - ability to read object fields in O(1) without deserialization
> > > - interoperability between Java, .NET and CPP.
> > >
> > > Since then a number of other concepts were mixed into the cocktail:
> > > - Affinity key fields
> > > - Strict typing for existing fields (aka metadata)
> > > - Binary Object as storage format
> > >
> > > 2) My proposals
> > >
> > > 2.1) Introduce "Data Row Format" interface
> > > Binary Objects are terrible candidates for storage. Too fat, too slow.
> > > Efficient storage typically has <10 bytes of overhead per row (no
> > > metadata, no length, no hash code, etc.), allows super-fast field
> > > access, supports different string formats (ASCII, UTF-8, etc.) and
> > > different temporal types (date, time, timestamp, timestamp with
> > > timezone, etc.), and stores these types as efficiently as possible.
> > >
> > > What we need is to introduce an interface which will convert a pair of
> > > key-value objects into a row. This row will be used to store data and
> to
> > > get fields from it. Care about memory consumption, need SQL and strict
> > > schema - use one format. Need flexibility and prefer key-value access -
> > use
> > > another format which will store binary objects unchanged (current
> > > behavior).
> > >
> > > interface DataRowFormat {
> > > DataRow create(Object key, Object value); // primitives or binary
> > > objects
> > > DataRowMetadata metadata();
> > > }
> > >
> > > 2.2) Remove affinity field from metadata
> > > 

Re: [IMPORTANT] Future of Binary Objects

2018-11-20 Thread Sergi Vladykin
I really like the Protobuf format. It is probably not what we need for O(1)
field access,
but for compact data representation we can derive a lot from it.

Also, IMO, restricting field type changes is an absolutely sane idea.
The correct way to evolve a schema in the common case is to add new fields
and gradually deprecate the old ones; if you can skip default/null fields in
the binary format, this approach
will not introduce any noticeable performance/size overhead.
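As a hypothetical illustration of that last point, here is a toy row layout (mine, not Ignite's or Protobuf's actual format) in which a presence bitmap lets always-null deprecated fields cost one bit each instead of a full value slot:

```java
public class NullSkippingRow {
    // Hypothetical row encoding: a 4-byte presence bitmap (1 bit per
    // declared field) followed by the 8-byte values of the non-null
    // fields only. Deprecated, always-null fields then cost one bitmap
    // bit each instead of a full 8-byte slot.
    static int encodedSize(Long[] fields) {
        int size = 4; // presence bitmap
        for (Long f : fields)
            if (f != null)
                size += 8; // only non-null fields are materialized
        return size;
    }

    public static void main(String[] args) {
        // Old schema: two populated fields.
        System.out.println(encodedSize(new Long[] {1L, 2L})); // 20
        // Evolved schema: the same two fields plus six deprecated/null
        // ones - the encoded row does not grow at all here.
        System.out.println(encodedSize(
            new Long[] {1L, 2L, null, null, null, null, null, null})); // 20
    }
}
```

This is the sense in which add-and-deprecate evolution stays cheap when null/default fields can be skipped on the wire.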

Sergi

Tue, 20 Nov 2018 at 11:12, Vyacheslav Daradur :

> I think one possible way to reduce overhead and TCO is the SQL schema
> approach.
>
> It assumes that metadata is stored separately from the serialized
> data to reduce size.
> In this case, most of the advantages of Binary Objects, like access in
> O(1) and access without deserialization, can be preserved.
> On Tue, Nov 20, 2018 at 10:56 AM Vladimir Ozerov 
> wrote:
> >
> > Hi Alexey,
> >
> > Binary Objects only.
> >
> > On Tue, Nov 20, 2018 at 10:50 AM Alexey Zinoviev  >
> > wrote:
> >
> > > Do we discuss here Core features only or the roadmap for all
> components?
> > >
> > > Tue, 20 Nov 2018 at 10:05, Vladimir Ozerov :
> > >
> > > > Igniters,
> > > >
> > > > It is very likely that Apache Ignite 3.0 will be released next year.
> So
> > > we
> > > > need to start thinking about major product improvements. I'd like to
> > > start
> > > > with binary objects.
> > > >
> > > > Currently they are one of the main limiting factors for the product.
> They
> > > > are fat - 30+ bytes overhead on average, high TCO of Apache Ignite
> > > > compared to other vendors. They are slow - not suitable for SQL at
> all.
> > > >
> > > > I would like to ask all of you who worked with binary objects to
> share
> > > your
> > > > feedback and ideas, so that we understand how they should look
> in AI
> > > > 3.0. This is a brainstorm - let's accumulate ideas first and
> > > > minimize criticism. Then we will work on the ideas in separate topics.
> > > >
> > > > 1) Historical background
> > > >
> > > > BO were implemented around 2014 (Apache Ignite 1.5) when we started
> > > working
> > > > on .NET and CPP clients. During design we had several ideas in mind:
> > > > - ability to read object fields in O(1) without deserialization
> > > > - interoperability between Java, .NET and CPP.
> > > >
> > > > Since then a number of other concepts were mixed into the cocktail:
> > > > - Affinity key fields
> > > > - Strict typing for existing fields (aka metadata)
> > > > - Binary Object as storage format
> > > >
> > > > 2) My proposals
> > > >
> > > > 2.1) Introduce "Data Row Format" interface
> > > > Binary Objects are terrible candidates for storage. Too fat, too
> slow.
> > > > Efficient storage typically has <10 bytes of overhead per row (no
> > > > metadata, no length, no hash code, etc.), allows super-fast field
> > > > access, supports different string formats (ASCII, UTF-8, etc.) and
> > > > different temporal types (date, time, timestamp, timestamp with
> > > > timezone, etc.), and stores these types as efficiently as possible.
> > > >
> > > > What we need is to introduce an interface which will convert a pair
> of
> > > > key-value objects into a row. This row will be used to store data
> and to
> > > > get fields from it. Care about memory consumption, need SQL and
> strict
> > > > schema - use one format. Need flexibility and prefer key-value
> access -
> > > use
> > > > another format which will store binary objects unchanged (current
> > > > behavior).
> > > >
> > > > interface DataRowFormat {
> > > > DataRow create(Object key, Object value); // primitives or binary
> > > > objects
> > > > DataRowMetadata metadata();
> > > > }
> > > >
> > > > 2.2) Remove affinity field from metadata
> > > > Affinity rules are governed by cache, not type. We should remove
> > > > "affinityFieldName" from metadata.
> > > >
> > > > 2.3) Remove restrictions on changing field type
> > > > I do not know why we did that in the first place. This restriction
> > > prevents
> > > > type evolution and confuses users.
> > > >
> > > > 2.4) Use bitmaps for "null" and default values and for fixed-length
> > > fields,
> > > > put fixed-length fields before variable-length.
> > > > Motivation: to save space.
> > > >
> > > > What else? Please share your ideas.
> > > >
> > > > Vladimir.
> > > >
> > >
>
>
>
> --
> Best Regards, Vyacheslav D.
>


[jira] [Created] (IGNITE-10338) Add Disk page compression test suites to TC

2018-11-19 Thread Sergi Vladykin (JIRA)
Sergi Vladykin created IGNITE-10338:
---

 Summary: Add Disk page compression test suites to TC
 Key: IGNITE-10338
 URL: https://issues.apache.org/jira/browse/IGNITE-10338
 Project: Ignite
  Issue Type: Test
Reporter: Sergi Vladykin
Assignee: Peter Ivanov


There are 2 suites in the module ignite-compress:

org.apache.ignite.testsuites.IgnitePdsCompressionTestSuite

org.apache.ignite.testsuites.IgnitePdsCompressionTestSuite2

 

They should be executed the same way as the Direct-IO suites: only on Linux agents.





Re: Disk page compression for Ignite persistent store

2018-11-19 Thread Sergi Vladykin
Denis,

See inline.


Mon, 19 Nov 2018 at 20:17, Denis Magda :

> Hi Sergi,
>
> Didn't know you were cooking this dish in the background ) Excellent.  Just
> to be sure, that's part of this IEP, right?
>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-20%3A+Data+Compression+in+Ignite#IEP-20:DataCompressioninIgnite-Withoutin-memorycompression


Correct.


>
>
> Since we can release only full file system blocks which are typically 4k
> > size, user must configure page size to be at least multiple FS blocks,
> e.g.
> > 8k or 16k. It also means that max compression ratio here is fsBlockSize /
> > pageSize = 4k / 16k = 0.25
>
>
> How do we handle the case where the page size is not a multiple of 4k? What
> is the optimal page size if the user wants to get the best compression?
> Probably, we can adjust the default page size automatically if it's a clean
> deployment.
>
>
We already force the page size to be between 1k and 16k and a power of 2,
so there are only two options greater than 4k: 8k or 16k. The page just has
to be large enough.

Obviously, the greater the page size, the better the compression, but very
large pages may hurt performance badly. Thus 8k with a ratio of 0.5, or 16k
with a ratio of 0.25, should be fine for most cases.
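A minimal sketch of this arithmetic, assuming the typical 4k file system block mentioned earlier (the validation mirrors the stated 1k-16k power-of-2 constraint; it is an illustration, not Ignite's actual validation code):

```java
public class CompressionRatio {
    static final int FS_BLOCK = 4 * 1024; // typical file system block size

    // Best achievable on-disk fraction for a page: a compressed page
    // cannot shrink below one FS block, so the ratio floor is
    // fsBlockSize / pageSize.
    static double bestRatio(int pageSize) {
        if (pageSize < 1024 || pageSize > 16 * 1024
            || Integer.bitCount(pageSize) != 1)
            throw new IllegalArgumentException(
                "page size must be a power of 2 in [1k, 16k]");
        if (pageSize <= FS_BLOCK)
            return 1.0; // nothing to release: page fits in one block
        return (double) FS_BLOCK / pageSize;
    }

    public static void main(String[] args) {
        System.out.println(bestRatio(8 * 1024));  // 0.5
        System.out.println(bestRatio(16 * 1024)); // 0.25
        System.out.println(bestRatio(4 * 1024));  // 1.0 - compression useless
    }
}
```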



> There will be 2 new properties on CacheConfiguration
> > (setDiskPageCompression and setDiskPageCompressionLevel) to setup disk
> page
> > compression.
>
>
> How about setting it at DataRegionConfiguration level as well so that it's
> applied for all the caches/tables from there?
>
>
Does not seem to make much sense until we can tweak the page size for different
data regions independently (currently we can't). I would start with that
first.

Sergi


--
> Denis
>
> On Mon, Nov 19, 2018 at 2:01 AM Sergi Vladykin 
> wrote:
>
> > Folks,
> >
> > I've implemented page compression for persistent store and going to merge
> > it to master.
> >
> > https://github.com/apache/ignite/pull/5200
> >
> > Some design notes:
> >
> > It employs "hole punching" approach, it means that the pages are kept
> > uncompressed in memory,
> > but when they get written to disk, they will be compressed and all the
> > extra file system blocks for the page will be released. Thus the storage
> > files become sparse.
> >
> > Right now we will support 4 compression methods: ZSTD, LZ4, SNAPPY and
> > SKIP_GARBAGE. All of them are self-explaining except SKIP_GARBAGE, which
> > basically just takes only meaningful data from half-filled pages but does
> > not apply any compression. It is easy to add more if needed.
> >
> > Since we can release only full file system blocks which are typically 4k
> > size, user must configure page size to be at least multiple FS blocks,
> e.g.
> > 8k or 16k. It also means that max compression ratio here is fsBlockSize /
> > pageSize = 4k / 16k = 0.25
> >
> > It is possible to enable compression for existing databases if they were
> > configured for large enough page size. In this case pages will be written
> > to disk in compressed form when updated, and the database will become
> > compressed gradually.
> >
> > There will be 2 new properties on CacheConfiguration
> > (setDiskPageCompression and setDiskPageCompressionLevel) to setup disk
> page
> > compression.
> >
> > Compression dictionaries are not supported at the time, but may in the
> > future. IMO it should be added as a separate feature if needed.
> >
> > The only supported platform for now is Linux. Since all popular file
> > systems support sparse files, it must be  relatively easy to support more
> > platforms.
> >
> > Please take a look and provide your thoughts and suggestions.
> >
> > Thanks!
> >
> > Sergi
> >
>


[jira] [Created] (IGNITE-10332) Add Ignite.NET configuration for disk page compression

2018-11-19 Thread Sergi Vladykin (JIRA)
Sergi Vladykin created IGNITE-10332:
---

 Summary: Add Ignite.NET configuration for disk page compression 
 Key: IGNITE-10332
 URL: https://issues.apache.org/jira/browse/IGNITE-10332
 Project: Ignite
  Issue Type: New Feature
  Components: cache
Reporter: Sergi Vladykin
Assignee: Vladimir Ozerov








[jira] [Created] (IGNITE-10331) Document Disk page compression

2018-11-19 Thread Sergi Vladykin (JIRA)
Sergi Vladykin created IGNITE-10331:
---

 Summary: Document Disk page compression
 Key: IGNITE-10331
 URL: https://issues.apache.org/jira/browse/IGNITE-10331
 Project: Ignite
  Issue Type: New Feature
  Components: documentation
Reporter: Sergi Vladykin
Assignee: Sergey Kozlov


There is an email thread titled "Disk page compression for Ignite persistent 
store"





[jira] [Created] (IGNITE-10330) Implement disk page compression

2018-11-19 Thread Sergi Vladykin (JIRA)
Sergi Vladykin created IGNITE-10330:
---

 Summary: Implement disk page compression
 Key: IGNITE-10330
 URL: https://issues.apache.org/jira/browse/IGNITE-10330
 Project: Ignite
  Issue Type: Improvement
  Components: cache
Reporter: Sergi Vladykin
Assignee: Sergi Vladykin








Re: Disk page compression for Ignite persistent store

2018-11-19 Thread Sergi Vladykin
Ilya,

Zstd itself has default compression level 3. I just used that number to be
consistent with the library defaults.

I will check if there is a significant difference in performance.

Sergi

Mon, 19 Nov 2018 at 14:59, Ilya Kasnacheev :

> Hello!
>
> You have zstd default level of 3. In my tests, zstd usually performed much
> better with compression level 2. Please consider.
>
> I admire your effort!
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> > Mon, 19 Nov 2018 at 14:02, Sergi Vladykin :
>
> > Right now the functionality has nothing to do with WAL, but your idea
> > definitely makes sense and worth being implemented as a next step.
> >
> > Sergi
> >
> > > Mon, 19 Nov 2018 at 13:58, Andrey Mashenkov <
> andrey.mashen...@gmail.com
> > >:
> >
> > > Hi Sergi,
> > >
> > > It is not clear to me whether your changes will affect the PageSnapshot
> > > WAL record.
> > > Is it possible to add compression support for PageSnapshot WAL record
> as
> > > well, to reduce WAL size?
> > >
> > > Thanks.
> > >
> > > On Mon, Nov 19, 2018 at 1:01 PM Sergi Vladykin <
> sergi.vlady...@gmail.com
> > >
> > > wrote:
> > >
> > > > Folks,
> > > >
> > > > I've implemented page compression for persistent store and going to
> > merge
> > > > it to master.
> > > >
> > > > https://github.com/apache/ignite/pull/5200
> > > >
> > > > Some design notes:
> > > >
> > > > It employs "hole punching" approach, it means that the pages are kept
> > > > uncompressed in memory,
> > > > but when they get written to disk, they will be compressed and all
> the
> > > > extra file system blocks for the page will be released. Thus the
> > storage
> > > > files become sparse.
> > > >
> > > > Right now we will support 4 compression methods: ZSTD, LZ4, SNAPPY
> and
> > > > SKIP_GARBAGE. All of them are self-explaining except SKIP_GARBAGE,
> > which
> > > > basically just takes only meaningful data from half-filled pages but
> > does
> > > > not apply any compression. It is easy to add more if needed.
> > > >
> > > > Since we can release only full file system blocks which are typically
> > 4k
> > > > size, user must configure page size to be at least multiple FS
> blocks,
> > > e.g.
> > > > 8k or 16k. It also means that max compression ratio here is
> > fsBlockSize /
> > > > pageSize = 4k / 16k = 0.25
> > > >
> > > > It is possible to enable compression for existing databases if they
> > were
> > > > configured for large enough page size. In this case pages will be
> > written
> > > > to disk in compressed form when updated, and the database will become
> > > > compressed gradually.
> > > >
> > > > There will be 2 new properties on CacheConfiguration
> > > > (setDiskPageCompression and setDiskPageCompressionLevel) to setup
> disk
> > > page
> > > > compression.
> > > >
> > > > Compression dictionaries are not supported at the time, but may in
> the
> > > > future. IMO it should be added as a separate feature if needed.
> > > >
> > > > The only supported platform for now is Linux. Since all popular file
> > > > systems support sparse files, it must be  relatively easy to support
> > more
> > > > platforms.
> > > >
> > > > Please take a look and provide your thoughts and suggestions.
> > > >
> > > > Thanks!
> > > >
> > > > Sergi
> > > >
> > >
> > >
> > > --
> > > Best regards,
> > > Andrey V. Mashenkov
> > >
> >
>


Re: Disk page compression for Ignite persistent store

2018-11-19 Thread Sergi Vladykin
Right now the functionality has nothing to do with the WAL, but your idea
definitely makes sense and is worth implementing as a next step.

Sergi

Mon, 19 Nov 2018 at 13:58, Andrey Mashenkov :

> Hi Sergi,
>
> It is not clear to me whether your changes will affect the PageSnapshot WAL record.
> Is it possible to add compression support for PageSnapshot WAL record as
> well, to reduce WAL size?
>
> Thanks.
>
> On Mon, Nov 19, 2018 at 1:01 PM Sergi Vladykin 
> wrote:
>
> > Folks,
> >
> > I've implemented page compression for persistent store and going to merge
> > it to master.
> >
> > https://github.com/apache/ignite/pull/5200
> >
> > Some design notes:
> >
> > It employs "hole punching" approach, it means that the pages are kept
> > uncompressed in memory,
> > but when they get written to disk, they will be compressed and all the
> > extra file system blocks for the page will be released. Thus the storage
> > files become sparse.
> >
> > Right now we will support 4 compression methods: ZSTD, LZ4, SNAPPY and
> > SKIP_GARBAGE. All of them are self-explaining except SKIP_GARBAGE, which
> > basically just takes only meaningful data from half-filled pages but does
> > not apply any compression. It is easy to add more if needed.
> >
> > Since we can release only full file system blocks which are typically 4k
> > size, user must configure page size to be at least multiple FS blocks,
> e.g.
> > 8k or 16k. It also means that max compression ratio here is fsBlockSize /
> > pageSize = 4k / 16k = 0.25
> >
> > It is possible to enable compression for existing databases if they were
> > configured for large enough page size. In this case pages will be written
> > to disk in compressed form when updated, and the database will become
> > compressed gradually.
> >
> > There will be 2 new properties on CacheConfiguration
> > (setDiskPageCompression and setDiskPageCompressionLevel) to setup disk
> page
> > compression.
> >
> > Compression dictionaries are not supported at the time, but may in the
> > future. IMO it should be added as a separate feature if needed.
> >
> > The only supported platform for now is Linux. Since all popular file
> > systems support sparse files, it must be  relatively easy to support more
> > platforms.
> >
> > Please take a look and provide your thoughts and suggestions.
> >
> > Thanks!
> >
> > Sergi
> >
>
>
> --
> Best regards,
> Andrey V. Mashenkov
>


Disk page compression for Ignite persistent store

2018-11-19 Thread Sergi Vladykin
Folks,

I've implemented page compression for persistent store and going to merge
it to master.

https://github.com/apache/ignite/pull/5200

Some design notes:

It employs a "hole punching" approach: the pages are kept uncompressed in
memory, but when they are written to disk they are compressed, and all the
extra file system blocks for the page are released. Thus the storage
files become sparse.

Right now we will support 4 compression methods: ZSTD, LZ4, SNAPPY and
SKIP_GARBAGE. All of them are self-explanatory except SKIP_GARBAGE, which
basically just takes the meaningful data from half-filled pages but does
not apply any compression. It is easy to add more if needed.

Since we can release only full file system blocks, which are typically 4k in
size, the user must configure the page size to be a multiple of the FS block
size, e.g. 8k or 16k. It also means that the max compression ratio here is
fsBlockSize / pageSize = 4k / 16k = 0.25.

It is possible to enable compression for existing databases if they were
configured with a large enough page size. In this case, pages will be written
to disk in compressed form when updated, and the database will become
compressed gradually.

There will be 2 new properties on CacheConfiguration
(setDiskPageCompression and setDiskPageCompressionLevel) to set up disk page
compression.
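A configuration sketch of the wiring described above; the names here (`DiskPageCompression`, `setDiskPageCompression(Level)`, `DataStorageConfiguration.setPageSize`) reflect my reading of this proposal and the PR, and should be checked against the merged API:

```java
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.DiskPageCompression;
import org.apache.ignite.configuration.IgniteConfiguration;

// Sketch: enable ZSTD disk page compression for one cache. The page
// size must be a multiple of the FS block size (4k), so 8k or 16k.
public class CompressionConfigExample {
    public static IgniteConfiguration configure() {
        DataStorageConfiguration storage = new DataStorageConfiguration();
        storage.setPageSize(8 * 1024); // 2 FS blocks -> best-case ratio 0.5

        CacheConfiguration<Long, String> cache =
            new CacheConfiguration<>("myCache");
        cache.setDiskPageCompression(DiskPageCompression.ZSTD);
        cache.setDiskPageCompressionLevel(3); // zstd library default

        return new IgniteConfiguration()
            .setDataStorageConfiguration(storage)
            .setCacheConfiguration(cache);
    }
}
```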

Compression dictionaries are not supported at this time, but may be in the
future. IMO this should be added as a separate feature if needed.

The only supported platform for now is Linux. Since all popular file
systems support sparse files, it should be relatively easy to support more
platforms.

Please take a look and provide your thoughts and suggestions.

Thanks!

Sergi


Re: Time to remove automated messages from the devlist?

2018-11-16 Thread Sergi Vladykin
I would also like to separate out all the automated stuff.

Sergi

Fri, 16 Nov 2018 at 13:58, Ivan Pavlukhin :

> Oleg,
>
> I agree with Dmitriy. I found your summary quite interesting.
> Fri, 16 Nov 2018 at 13:12, Dmitriy Pavlov :
> >
> > Oleg,
> >
> > excellent research! It allows me to avoid bothering community developers
> > once again.
> >
> > Thank you for your efforts and for contributing to this discussion.
> >
> > Sincerely,
> > Dmitriy Pavlov
> >
> > Thu, 15 Nov 2018 at 23:14, Denis Magda :
> >
> > > Let's move git notifications to a separate list. As for JIRA, not sure
> it
> > > bothers me, it took me several minutes to set up all the filters to
> spread
> > > the messages out across specific folders. Otherwise, some of us might
> > > ignore subscribing to jira-list and would miss notifications when their
> > > input is needed.
> > >
> > > --
> > > Denis
> > >
> > > On Thu, Nov 15, 2018 at 12:03 PM Vladimir Ozerov  >
> > > wrote:
> > >
> > > > Dmitry,
> > > >
> > > > I am not referring to some "authoritative ASF member" as a guide for
> us.
> > > We
> > > > are on our own. What I meant is that at some point in time we were
> > > pointed
> > > > to the idea that tons of automated messages have nothing to do with
> > > > a healthy community, which seems pretty reasonable to me.
> > > >
> > > > On Thu, Nov 15, 2018 at 10:15 PM Dmitriy Pavlov 
> > > > wrote:
> > > >
> > > > > What incubator mentor do you refer to? Incubator member are asf
> members
> > > > as
> > > > > well.
> > > > >
> > > > > I was involved in at least 3 discussions on the list that started
> > > > > from a JIRA issue being created.
> > > > >
> > > > > If others were not involved, that does not convince me it is not
> > > > > useful to keep forwarding.
> > > > >
> > > > > Thu, Nov 15, 2018 at 21:23, Vladimir Ozerov  >:
> > > > >
> > > > > > Dmitry,
> > > > > >
> > > > > > What Apache member do you refer to?
> > > > > >
> > > > > > Thu, Nov 15, 2018 at 21:10, Dmitriy Pavlov  >:
> > > > > >
> > > > > > > How do you know what to watch if new tickets are not forwarded?
> > > > > > >
> > > > > > > Again, PRs are OK to remove since they duplicate JIRA, but JIRA
> > > > > > > removal does not make any sense to me.
> > > > > > >
> > > > > > > The ComDev folks instead suggest forwarding all comments and all
> > > > > > > activity from GitHub to the list. So if an Apache member confirms
> > > > > > > it is not useful to let dev-list watchers see new issues on the
> > > > > > > list, we can continue the discussion. Openness is needed not for
> > > > > > > veterans but for all community members and users who are
> > > > > > > subscribed to the list.
> > > > > > >
> > > > > > > Thu, Nov 15, 2018 at 21:00, Pavel Tupitsyn <
> ptupit...@apache.org>:
> > > > > > >
> > > > > > > > Personal emails for _watched_ JIRA tickets are very useful.
> > > > > > > > Emails to everyone are not.
> > > > > > > >
> > > > > > > > +1 for separate mailing list for all automated emails.
> > > > > > > > I don't think we can avoid automated emails completely, but
> > > > > > > > the dev list should be human-only.
> > > > > > > > So a separate list is the only way.
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Nov 15, 2018 at 8:11 PM Vladimir Ozerov <
> > > > > voze...@gridgain.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Completely agree with Denis. Tons of generated messages and
> > > > > > > > > community health are not related. Currently we obviously have
> > > > > > > > > too many tickets and too little communication. This is bad.
> > > > > > > > > But whether we accumulate generated stuff here or in some
> > > > > > > > > other place is not important at all, provided that we can
> > > > > > > > > point dev-list readers to the JIRA channel. And as for the
> > > > > > > > > generated stuff, this was one of the very serious concerns of
> > > > > > > > > our mentors during the incubation phase - too many tickets,
> > > > > > > > > too little real communication. Splitting the message flows
> > > > > > > > > will help us understand where we are.
> > > > > > > > >
> > > > > > > > > And another very interesting thing is how PMCs treat all
> > > > > > > > > these messages - they ignore them. When I came with that
> > > > > > > > > problem, one PMC proposed a solution - "just filter them like
> > > > > > > > > I do". Then I, another PMC, answered - "I do not know how to
> > > > > > > > > filter them". Finally, a third PMC, who also filters these
> > > > > > > > > messages, helped me create a proper filter in GMail.
> > > > > > > > >
> > > > > > > > > Isn't it demonstrative enough that so many PMCs, who are
> > > > > > > > > expected to understand the project very well and follow a lot
> > > > > > > > > of activities, find

Re: Using materialised views for queries with joins

2018-09-18 Thread Sergi Vladykin
Sven,

Supporting materialized views sounds like a huge project. I would not
expect it to appear in Ignite soon.

As far as I can see, your problem is data collocation. If you cannot
store the original data in replicated caches, then these views will be huge
as well and thus must be partitioned with some collocation. So it does not
actually change the picture much.

I would suggest going for manual denormalization: store the same value under
different affinity keys.

Sergi
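To illustrate the suggestion, here is a minimal sketch of manual denormalization. Plain maps stand in for Ignite caches, and all names are illustrative; in real code each key class would carry a different @AffinityKeyMapped field so each copy is collocated with its own side of the join:

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal sketch of manual denormalization: the same value is stored under
 *  keys with different affinity, so a lookup collocated with either side
 *  stays local. Plain maps stand in for Ignite caches; all names here are
 *  illustrative, not part of any Ignite API. */
public class DenormalizationSketch {
    final Map<String, String> itemsByProduct = new HashMap<>(); // collocated by product
    final Map<String, String> itemsByBasket = new HashMap<>();  // collocated by basket

    void putItem(String productKey, String basketKey, String item) {
        itemsByProduct.put(productKey, item); // one logical value, two placements
        itemsByBasket.put(basketKey, item);
    }

    public static void main(String[] args) {
        DenormalizationSketch s = new DenormalizationSketch();
        s.putItem("product-42", "basket-7", "widget x3");
        // Both "joins" are now local lookups on their own affinity key:
        System.out.println(s.itemsByProduct.get("product-42")); // widget x3
        System.out.println(s.itemsByBasket.get("basket-7"));    // widget x3
    }
}
```

The cost of this approach is write amplification and the burden of keeping both copies consistent, which is the usual trade-off of denormalization.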

Mon, Sep 17, 2018 at 14:22, Sven Beauprez :

> Hi Dmitry,
>
> Yes we can use those solutions in some cases, but not always.
>
> Replication is indeed the simplest, but sadly replicated caches carry too
> much overhead. We often have a minimum of 12 nodes, and all data must then
> stay in sync 12 times over. We do use it for small caches that don't need a
> lot of updates.
>
> We use colocation all over the place. Colocation based on affinity keys is
> not possible, though, for distinct data sets with only some very specific
> relationships to _some other_ dataset, well known beforehand.
> (E.g. -not our exact use case, which is more complex- items in a shopping
> basket joined with items from the product inventory: both are in different
> caches managed on other nodes, and it is not possible to denormalize such
> that the shopping basket knows the number of available items.)
>
>
> Regards,
>
> Sven
>
>
>
>
> SVEN BEAUPREZ
>
> L e a d   A r c h i t e c t
>
>
>
> De Kleetlaan 5, B-1831 Diegem
>
> www.theglue.com 
>
> On 17/09/2018, 10:37, "Dmitriy Setrakyan"  wrote:
>
> Hi Sven,
>
> I will let others comment on the materialized view suggestion, but I
> have
> another question.
>
> *As we all know, joins are a nightmare in a distributed system*
>
>
> Have you considered collocation or replication? If a table is
> replicated,
> then it will be present on all the nodes and all joins will be fast.
> If two
> partitioned tables are colocated based on some affinity key, then
> joins on
> that affinity key will be fast as well.
>
> Both, colocation and replication are supported by Ignite. Will any of
> these
> approaches work for you?
>
> D.
>
> On Mon, Sep 17, 2018 at 11:04 AM Sven Beauprez <
> sven.beaup...@theglue.com>
> wrote:
>
> > All,
> >
> >
> >
> > We are in a situation where we have to query data over several
> caches. As
> > we all know, joins are a nightmare in a distributed system and I
> know there
> > are other means like denormalisation, but it is not sufficient
> anymore in
> > some cases we have and we need the joins.
> >
> >
> >
> > We mainly work in an OLTP context, where queries are known in advance
> > (i.e. at dev time). Inspired by the following blog post from several
> > years ago, I was wondering if the concept of “materialized views” could
> > make it into Apache Ignite.
> >
> > (
> >
> https://www.confluent.io/blog/turning-the-database-inside-out-with-apache-samza/
> > )
> >
> >
> >
> > It would work as follows:
> >
> >- A query must register itself in Ignite at startup time (eg. via
> >configuration) or during run time (eg. API call)
> >- The registered query is parsed and a new “view” cache is created
> >which will ‘cache’ the resultset of the query (could take a
> while, but
> >intermediate status can be “warming up” and “hot” when ready)
> >- All caches involved in the joins are now monitored for CUD
> >operations and relevant data is stored in the new “view” cache so
> the view
> >gets updated in real time
> >- All operations must be ACID compliant
> >- The view is queried via a very trivial select statement
> >
> >
> >
> > Would that be feasible as a new feature?
> >
> >
> >
> >
> >
> > Regards,
> >
> >
> >
> > Sven
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > SVEN BEAUPREZ
> >
> >
> >
> > L e a d   A r c h i t e c t
> >
> >
> >
> > De Kleetlaan 5, B-1831 Diegem
> >
> > www.theglue.com
> >
>
>
>


Re: Custom string encoding

2017-07-01 Thread Sergi Vladykin
In SQL indexes we may store partial strings and assume they are UTF-8;
I don't think this can be abstracted away. But maybe this is not a big
deal if the indexes keep using UTF-8.

Sergi
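A small illustration (my own, not Ignite code) of why the UTF-8 assumption matters for index prefixes: comparing unsigned UTF-8 bytes lexicographically yields the same order as comparing code points, so a byte prefix is enough for ordering. A pluggable encoder would have to preserve exactly this property to stay index-safe:

```java
import java.nio.charset.StandardCharsets;

public class Utf8PrefixDemo {
    /** Lexicographic comparison of unsigned UTF-8 bytes matches code point
     *  order, which is why an index can safely keep only a byte prefix of
     *  each string and still sort correctly. */
    static int compareUtf8(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int c = (x[i] & 0xFF) - (y[i] & 0xFF); // unsigned byte compare
            if (c != 0) return c;
        }
        return x.length - y.length; // shared prefix: shorter string sorts first
    }

    public static void main(String[] args) {
        System.out.println(compareUtf8("apple", "banana") < 0); // true
        System.out.println(compareUtf8("яблоко", "банан") > 0); // true: 'я' > 'б' in code points
    }
}
```

An encoder that remaps frequent symbols to shorter codes would generally break this byte-order property, which is the compatibility concern raised above.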

2017-07-01 10:13 GMT+03:00 Dmitriy Setrakyan :

> Val, do you know how we compare strings in SQL queries? Will we be able to
> use this encoder?
>
> Additionally, I think that the encoder is a bit too abstract. Why not go
> even further and allow users create their own ASCII table for encoding?
>
> D.
>
> On Fri, Jun 30, 2017 at 6:49 PM, Valentin Kulichenko <
> valentin.kuliche...@gmail.com> wrote:
>
> > Andrey,
> >
> > Can you elaborate more on this? What is your concern?
> >
> > -Val
> >
> > On Fri, Jun 30, 2017 at 6:17 PM Andrey Mashenkov <
> > andrey.mashen...@gmail.com>
> > wrote:
> >
> > > Val,
> > >
> > > Looks like it makes sense.
> > >
> > > This will not affect FullText index, as Lucene has own format for
> storing
> > > data.
> > >
> > > But would it be compatible with H2 indexing? I doubt it.
> > >
> > > On Jul 1, 2017 at 2:27, "Valentin Kulichenko" <
> > > valentin.kuliche...@gmail.com> wrote:
> > >
> > > > Folks,
> > > >
> > > > Currently binary marshaller always encodes strings in UTF-8. However,
> > > > sometimes it can be useful to customize this. For example, if data
> > > contains
> > > > a lot of Cyrillic, Chinese or other symbols, but not so many Latin
> > > symbols,
> > > > memory is used very inefficiently. In this case it would be great to
> > > encode
> > > > most frequently used symbols in one byte instead of two or three.
> > > >
> > > > I propose to introduce BinaryStringEncoder interface that will
> convert
> > > > strings to byte arrays and back, and make it pluggable via
> > > > BinaryConfiguration. This will allow users to plug in any encoding
> > > > algorithms based on their requirements.
> > > >
> > > > Thoughts?
> > > >
> > > > https://issues.apache.org/jira/browse/IGNITE-5655
> > > >
> > > > -Val
> > > >
> > >
> >
>


Re: Transparent Data Encryption (TDE) in Apache Ignite

2017-06-26 Thread Sergi Vladykin
No, we don't have plans for it.

Sergi

2017-06-26 14:20 GMT+03:00 Vyacheslav Daradur <daradu...@gmail.com>:

> Sergi, thanks for the answer.
>
> >> see TDE is just an option for PCI DSS compliancy but not a requirement.
> Requirement: "Protect stored cardholder data"
> Encryption is required.
> TDE is one of the ways to implement it at the database level.
>
> Sure, an implementation at the application level solves it.
>
> I meant something else.
> I thought maybe this feature is on the Ignite roadmap?
>
>
> 2017-06-26 13:53 GMT+03:00 Sergi Vladykin <sergi.vlady...@gmail.com>:
>
> > I think no one is interested in this stuff right now.
> >
> > Also, as far as I can see, TDE is just an option for PCI DSS compliance,
> > not a requirement.
> >
> > Your system should pass PCI DSS if you do the required encryption at
> > the application level and properly manage the encryption keys.
> >
> > Sergi
> >
> > 2017-06-26 11:40 GMT+03:00 Vyacheslav Daradur <daradu...@gmail.com>:
> >
> > > Guys, any thoughts?
> > >
> > > 2017-06-20 11:02 GMT+03:00 Vyacheslav Daradur <daradu...@gmail.com>:
> > >
> > > > Hi Igniters.
> > > >
> > > > I have some use cases where I need fast storage with TDE support.
> > > > It is required for PCI DSS certification.
> > > >
> > > > As far as I know AI doesn't support it.
> > > >
> > > > I looked at other storages.
> > > > Many storages support it or are engaged in development this feature.
> > > >
> > > > Cassandra community are working on TDE support.[1]
> > > >
> > > > Oracle support it.[2] Moreover it supports indexing and querying on
> > > > encrypted data.
> > > >
> > > > I think it would be very useful for AI to support TDE.
> > > >
> > > > What do you think? Maybe development is already under way?
> > > >
> > > > [1] https://issues.apache.org/jira/browse/CASSANDRA-9945
> > > > [2] https://docs.oracle.com/cd/B19306_01/network.102/b14268/
> > > > asotrans.htm#ASOAG600
> > > >
> > > > --
> > > > Best Regards, Vyacheslav D.
> > > >
> > >
> > >
> > >
> > > --
> > > Best Regards, Vyacheslav D.
> > >
> >
>
>
>
> --
> Best Regards, Vyacheslav D.
>


Re: Transparent Data Encryption (TDE) in Apache Ignite

2017-06-26 Thread Sergi Vladykin
I think no one is interested in this stuff right now.

Also, as far as I can see, TDE is just an option for PCI DSS compliance, not
a requirement.

Your system should pass PCI DSS if you do the required encryption at the
application level and properly manage the encryption keys.

Sergi
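As a concrete sketch of the application-level approach, the JDK's own AES-GCM cipher is enough to encrypt a single sensitive field before putting it into the cache. This is an illustration only; real code would take the key from a key-management system and store the IV alongside the ciphertext:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

/** Sketch of application-level field encryption with AES-GCM from the JDK.
 *  Only the sensitive field is encrypted before it is put into the cache;
 *  key management (rotation, storage in a KMS) is deliberately out of scope. */
public class FieldEncryptionDemo {
    static byte[] encrypt(SecretKey key, byte[] iv, String plain) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        return c.doFinal(plain.getBytes(StandardCharsets.UTF_8));
    }

    static String decrypt(SecretKey key, byte[] iv, byte[] ct) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        return new String(c.doFinal(ct), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(256); // in practice the key comes from a key manager, not the app
        SecretKey key = kg.generateKey();
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv); // fresh IV per encrypted value

        byte[] stored = encrypt(key, iv, "4111-1111-1111-1111"); // goes into the cache
        System.out.println(decrypt(key, iv, stored)); // round-trips the card number
    }
}
```

With GCM the ciphertext is also authenticated, so tampering with the stored bytes is detected on decryption.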

2017-06-26 11:40 GMT+03:00 Vyacheslav Daradur :

> Guys, any thoughts?
>
> 2017-06-20 11:02 GMT+03:00 Vyacheslav Daradur :
>
> > Hi Igniters.
> >
> > I have some use cases where I need fast storage with TDE support.
> > It is required for PCI DSS certification.
> >
> > As far as I know AI doesn't support it.
> >
> > I looked at other storages.
> > Many storages support it or are engaged in development this feature.
> >
> > Cassandra community are working on TDE support.[1]
> >
> > Oracle support it.[2] Moreover it supports indexing and querying on
> > encrypted data.
> >
> > I think it would be very useful for AI to support TDE.
> >
> > What do you think? Maybe development is already under way?
> >
> > [1] https://issues.apache.org/jira/browse/CASSANDRA-9945
> > [2] https://docs.oracle.com/cd/B19306_01/network.102/b14268/
> > asotrans.htm#ASOAG600
> >
> > --
> > Best Regards, Vyacheslav D.
> >
>
>
>
> --
> Best Regards, Vyacheslav D.
>


Re: Data compression in Ignite 2.0

2017-06-09 Thread Sergi Vladykin
+1 to Vladimir. Field encryption is the user's responsibility. I see no reason
to introduce additional complexity into Ignite.

Sergi

2017-06-09 11:11 GMT+03:00 Антон Чураев :

> It seems that Dmitry is referring to transparent data encryption. It is used
> throughout the whole database industry.
>
> 2017-06-09 10:50 GMT+03:00 Vladimir Ozerov :
>
> > Dima,
> >
> > Encryption of certain fields is as bad as compression. First, it is a
> > huge change, which makes an already complex binary protocol even more
> > complex. Second, it has to be ported to the CPP and .NET platforms, as
> > well as to JDBC and ODBC.
> > Last, but the most important - this is not our headache to encrypt
> > sensitive data. This is user responsibility. Nobody in a sane mind will
> > store passwords in plain form. Instead, user should encrypt it on his
> own,
> > choosing proper encryption parameters - algorithms, key lengths, salts,
> > etc.. How are you going to expose this in API or configuration?
> >
> > We should not implement data encryption on binary level, this is out of
> > question. Encryption should be implemented on application level (user
> > efforts), transport layer (SSL - we already have it), and possibly on
> > disk-level (there are tools for this already).
> >
> >
> > On Fri, Jun 9, 2017 at 9:06 AM, Vyacheslav Daradur 
> > wrote:
> >
> > > >> which is much less useful.
> > > I note that in some cases the gain is more than a factor of two in the
> > > size of an object.
> > >
> > > >> Would it be possible to change your implementation to handle the
> > > encryption instead?
> > > Yes, of course, there's not much difference between compression and
> > > encryption, including in my implementation of per-field compression.
> > >
> > > 2017-06-09 8:55 GMT+03:00 Dmitriy Setrakyan :
> > >
> > > > Vyacheslav,
> > > >
> > > > When this feature started out as data compression in Ignite, it
> sounded
> > > > very useful. Now it is unfolding as a per-field compression, which is
> > > much
> > > > less useful. In fact, it is questionable whether it is useful at all.
> > The
> > > > fact that this feature is implemented does not make it mandatory for
> > the
> > > > community to accept it.
> > > >
> > > > However, as I mentioned before, per-field encryption is very useful,
> as
> > > it
> > > > would allow users automatically encrypt certain sensitive fields,
> like
> > > > passwords, credit card numbers, etc. There is not much conceptual
> > > > difference between compressing a field vs encrypting a field. Would
> it
> > be
> > > > possible to change your implementation to handle the encryption
> > instead?
> > > >
> > > > D.
> > > >
> > > > On Thu, Jun 8, 2017 at 10:42 PM, Vyacheslav Daradur <
> > daradu...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Guys, I want to be clear:
> > > > > * "Per-field compression" design is the result of a research of the
> > > > binary
> > > > > infrastructure of Ignite and some other its places (querying,
> > indexing,
> > > > > etc.)
> > > > > * Full-compression of object will be more effective, but in this
> case
> > > > there
> > > > > is no capability with querying and indexing (or there is large
> > overhead
> > > > by
> > > > > way of decompressing of full object (or caches pages) on demand)
> > > > > * "Per-field compression" is a one of ways to implement the
> > compression
> > > > > feature
> > > > >
> > > > > I'm new to Ignite, so I can be mistaken in some things.
> > > > > For the last 3-4 months I've tried to start a discussion about a
> > > > > design, but nobody answered anything (except Dmitry and Valentin,
> > > > > who were interested in how it works).
> > > > > But I understand that this is a community and nobody is obliged to
> > > > > anybody.
> > > > >
> > > > > There are strong Ignite experts.
> > > > > If they can help me and community with a design of the compression
> > > > feature
> > > > > it will be great.
> > > > > At the moment I have a desire and time to be engaged in development
> > of
> > > > > compression feature in Ignite.
> > > > > Let's use this opportunity :)
> > > > >
> > > > > 2017-06-09 5:36 GMT+03:00 Dmitriy Setrakyan  >:
> > > > >
> > > > > > Igniters,
> > > > > >
> > > > > > I have never seen a single Ignite user asking about compressing a
> > > > single
> > > > > > field. However, we have had requests to secure certain fields,
> e.g.
> > > > > > passwords.
> > > > > >
> > > > > > I personally do not think per-field compression is needed, unless
> > we
> > > > can
> > > > > > point out some concrete real life use cases.
> > > > > >
> > > > > > D.
> > > > > >
> > > > > > On Thu, Jun 8, 2017 at 3:42 AM, Vyacheslav Daradur <
> > > > daradu...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Anton,
> > > > > > >
> > > > > > > >> I thought that if there will storing compressed data in the
> > > > memory,
> > > > > > data
> > > > > > > >> will transmit over wire in 

Re: [DISCUSS] Webinar for Ignite Persistent Store walk-through

2017-06-09 Thread Sergi Vladykin
+1

Sergi

2017-06-08 23:03 GMT+03:00 Dmitriy Setrakyan :

> +1 (I will attend)
>
> On Thu, Jun 8, 2017 at 1:02 PM, Konstantin Boudnik  wrote:
>
> > That'd be great! Thank you!
> > --
> >   Take care,
> > Konstantin (Cos) Boudnik
> > 2CAC 8312 4870 D885 8616  6115 220F 6980 1F27 E622
> >
> > Disclaimer: Opinions expressed in this email are those of the author,
> > and do not necessarily represent the views of any company the author
> > might be affiliated with at the moment of writing.
> >
> >
> > On Thu, Jun 8, 2017 at 12:54 PM, Denis Magda  wrote:
> > > Igniters,
> > >
> > > What’d you think if we arrange an internal webinar for our community to
> > walk through the features, capabilities and implementation details of the
> > Ignite Persistent Store [1]? That should help us understanding the
> donation
> > better.
> > >
> > > Please reply if you will be happy to attend.
> > >
> > > [1] https://apacheignite.readme.io/docs/distributed-persistent-store <
> > https://apacheignite.readme.io/docs/distributed-persistent-store>
> > >
> > > —
> > > Denis
> >
>


Re: Key type name and value type name for CREATE TABLE

2017-06-08 Thread Sergi Vladykin
I don't think we should restrict any existing API usage without a good
reason. Let's just add some API for type-name resolution.

Sergi

2017-06-08 11:13 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:

> Denis,
>
> It is impossible to have simple type names and cache names in common case.
> E.g.:
> Schema 1: CREATE TABLE Person (...)
> Schema 2: CREATE TABLE Person (...)
>
> There definitely will be a number of limitations when working with SQL and
> non-SQL caches, we just do not see them all at the moment. For this reason,
> we'd better to treat IgniteCache.put() on SQL cache as an invalid use case
> with undefined behavior (though, technically it works in 2.1).
>
> On Thu, Jun 8, 2017 at 6:09 AM, Denis Magda <dma...@apache.org> wrote:
>
> > Vova,
> >
> >
> > > On Jun 7, 2017, at 1:20 AM, Vladimir Ozerov <voze...@gridgain.com>
> > wrote:
> > >
> > > Valya,
> > >
> > > It doesn't affect builder invoked from DML engine, as it know how to
> find
> > > these names. As per users who want to call IgniteCache.put() on a cache
> > > created through SQL - sorry, they will have hard times resolving actual
> > > type name.
> >
> > If this limitation is to be addressed in the future release then I don’t
> > have any concerns. Is it so?
> >
> > Ideally, regardless how a cache was created and its SQL schema was
> defined
> > (DDL, Spring XML and Java config), the user should be able to works with
> it
> > using all the APIs available w/o limitations.
> >
> > —
> > Denis
> >
> > > This is OK for the first release, provided that we mask type
> > > names for a reason - to avoid subtle exceptions when working with DDL
> and
> > > DML.
> > >
> > > On Wed, Jun 7, 2017 at 5:50 AM, Valentin Kulichenko <
> > > valentin.kuliche...@gmail.com> wrote:
> > >
> > >> Vova,
> > >>
> > >> If you add unique suffix losing human-readable type names, how will
> the
> > >> builder approach work? Maybe it makes sense to add an API call that
> > returns
> > >> current type name for a table?
> > >>
> > >> -Val
> > >>
> > >> On Tue, Jun 6, 2017 at 7:43 PM Dmitriy Setrakyan <
> dsetrak...@apache.org
> > >
> > >> wrote:
> > >>
> > >>> Vova,
> > >>>
> > >>> I am not sure I like the key type name the way it is. Can we add some
> > >>> separator between the table name and key name, like "_". To me
> > >> "PERSON_KEY"
> > >>> reads a lot better than "PERSONKey".
> > >>>
> > >>> D.
> > >>>
> > >>> On Tue, Jun 6, 2017 at 4:00 AM, Sergi Vladykin <
> > sergi.vlady...@gmail.com
> > >>>
> > >>> wrote:
> > >>>
> > >>>> Unique suffix is a good idea.
> > >>>>
> > >>>> Sergi
> > >>>>
> > >>>> 2017-06-06 13:51 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
> > >>>>
> > >>>>> Igniters,
> > >>>>>
> > >>>>> In the very first implementation of CREATE TABLE we applied the
> > >>> following
> > >>>>> rule to key and value type names:
> > >>>>> keyTypeName == tableName + "Key"
> > >>>>> valTypeName == tableName
> > >>>>>
> > >>>>> E.g.:
> > >>>>> CREATE TABLE Person ...
> > >>>>> keyTypeName == PERSONKey
> > >>>>> valTypeName == PERSON
> > >>>>>
> > >>>>> After that user could potentially create objects with these type
> > >> names
> > >>>>> manually and add them to cache through native Ignite API:
> > >>>>>
> > >>>>> BinaryObject key =
> > >>> IgniteBinary.builder("PERSONKey").addField().build();
> > >>>>> BinaryObject val = IgniteBinary.builder("PERSON")
> > >> .addField().build();
> > >>>>> IgniteCache.put(key, val);
> > >>>>>
> > >>>>> This approach has two problems:
> > >>>>> 1) Type names are not unique between different tables. it means
> that
> > >> if
> > >>>> two
> > >>>>> tables with the same name are created in different schemas, we got
> a
> > >>>>> conflict.
> > >>>>> 2) Type names are bound to binary metadata, so if user decides to
> do
> > >>> the
> > >>>>> following, he will receive and error about incompatible metadata:
> > >>>>> CREATE TABLE Person (id INT PRIMARY KEY);
> > >>>>> DROP TABLE Person;
> > >>>>> CREATE TABLE Person(in BIGINT PRIMARY KEY); // Fail because old
> meta
> > >>>> still
> > >>>>> has type "Integer".
> > >>>>>
> > >>>>> In order to avoid that I am going to add unique suffix or so (say,
> > >>> UUID)
> > >>>> to
> > >>>>> type names. This way there will be no human-readable type names any
> > >>> more,
> > >>>>> but there will be no conflicts either. In future releases we will
> > >> relax
> > >>>>> this somehow.
> > >>>>>
> > >>>>> Thoughts?
> > >>>>>
> > >>>>> Vladimir.
> > >>>>>
> > >>>>
> > >>>
> > >>
> >
> >
>


Re: Key type name and value type name for CREATE TABLE

2017-06-07 Thread Sergi Vladykin
Vova,

Maybe it makes sense to have an increasing version instead of a UUID?

Like sql_myschema_Person_1_Key.

So the user will be able to pass just the table name `Person` to our API and
receive the correct latest type name for that table.

What do you think?

Sergi
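A rough sketch of how such versioned names could be generated. The sql_<schema>_<table>_<version> pattern and all class names here are illustrative only, not an actual Ignite implementation:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative generator for versioned SQL type names following the
 *  proposed sql_<schema>_<table>_<version>[_Key] pattern. */
public class TypeNameGenerator {
    // schema.table -> latest version counter.
    private final Map<String, AtomicInteger> versions = new ConcurrentHashMap<>();

    /** Allocates the next value type name for a CREATE TABLE. */
    String nextValTypeName(String schema, String table) {
        int v = versions.computeIfAbsent(schema + '.' + table,
                k -> new AtomicInteger()).incrementAndGet();
        return "sql_" + schema + '_' + table + '_' + v;
    }

    /** Derives the matching key type name. */
    String keyTypeName(String valTypeName) {
        return valTypeName + "_Key";
    }

    public static void main(String[] args) {
        TypeNameGenerator g = new TypeNameGenerator();
        System.out.println(g.nextValTypeName("myschema", "Person")); // sql_myschema_Person_1
        System.out.println(g.nextValTypeName("myschema", "Person")); // sql_myschema_Person_2
    }
}
```

Because the version only increments on re-creation of the same schema-qualified table, DROP/CREATE cycles never reuse a name, which avoids the stale binary metadata problem while keeping names resolvable from the table name alone.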

2017-06-07 11:20 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:

> Valya,
>
> It doesn't affect builder invoked from DML engine, as it know how to find
> these names. As per users who want to call IgniteCache.put() on a cache
> created through SQL - sorry, they will have hard times resolving actual
> type name. This is OK for the first release, provided that we mask type
> names for a reason - to avoid subtle exceptions when working with DDL and
> DML.
>
> On Wed, Jun 7, 2017 at 5:50 AM, Valentin Kulichenko <
> valentin.kuliche...@gmail.com> wrote:
>
> > Vova,
> >
> > If you add unique suffix losing human-readable type names, how will the
> > builder approach work? Maybe it makes sense to add an API call that
> returns
> > current type name for a table?
> >
> > -Val
> >
> > On Tue, Jun 6, 2017 at 7:43 PM Dmitriy Setrakyan <dsetrak...@apache.org>
> > wrote:
> >
> > > Vova,
> > >
> > > I am not sure I like the key type name the way it is. Can we add some
> > > separator between the table name and key name, like "_". To me
> > "PERSON_KEY"
> > > reads a lot better than "PERSONKey".
> > >
> > > D.
> > >
> > > On Tue, Jun 6, 2017 at 4:00 AM, Sergi Vladykin <
> sergi.vlady...@gmail.com
> > >
> > > wrote:
> > >
> > > > Unique suffix is a good idea.
> > > >
> > > > Sergi
> > > >
> > > > 2017-06-06 13:51 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
> > > >
> > > > > Igniters,
> > > > >
> > > > > In the very first implementation of CREATE TABLE we applied the
> > > following
> > > > > rule to key and value type names:
> > > > > keyTypeName == tableName + "Key"
> > > > > valTypeName == tableName
> > > > >
> > > > > E.g.:
> > > > > CREATE TABLE Person ...
> > > > > keyTypeName == PERSONKey
> > > > > valTypeName == PERSON
> > > > >
> > > > > After that user could potentially create objects with these type
> > names
> > > > > manually and add them to cache through native Ignite API:
> > > > >
> > > > > BinaryObject key =
> > > IgniteBinary.builder("PERSONKey").addField().build();
> > > > > BinaryObject val = IgniteBinary.builder("PERSON")
> > .addField().build();
> > > > > IgniteCache.put(key, val);
> > > > >
> > > > > This approach has two problems:
> > > > > 1) Type names are not unique between different tables. it means
> that
> > if
> > > > two
> > > > > tables with the same name are created in different schemas, we got
> a
> > > > > conflict.
> > > > > 2) Type names are bound to binary metadata, so if user decides to
> do
> > > the
> > > > > following, he will receive and error about incompatible metadata:
> > > > > CREATE TABLE Person (id INT PRIMARY KEY);
> > > > > DROP TABLE Person;
> > > > > CREATE TABLE Person(in BIGINT PRIMARY KEY); // Fail because old
> meta
> > > > still
> > > > > has type "Integer".
> > > > >
> > > > > In order to avoid that I am going to add unique suffix or so (say,
> > > UUID)
> > > > to
> > > > > type names. This way there will be no human-readable type names any
> > > more,
> > > > > but there will be no conflicts either. In future releases we will
> > relax
> > > > > this somehow.
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > Vladimir.
> > > > >
> > > >
> > >
> >
>


Re: Key type name and value type name for CREATE TABLE

2017-06-06 Thread Sergi Vladykin
A unique suffix is a good idea.

Sergi

2017-06-06 13:51 GMT+03:00 Vladimir Ozerov :

> Igniters,
>
> In the very first implementation of CREATE TABLE we applied the following
> rule to key and value type names:
> keyTypeName == tableName + "Key"
> valTypeName == tableName
>
> E.g.:
> CREATE TABLE Person ...
> keyTypeName == PERSONKey
> valTypeName == PERSON
>
> After that user could potentially create objects with these type names
> manually and add them to cache through native Ignite API:
>
> BinaryObject key = IgniteBinary.builder("PERSONKey").addField().build();
> BinaryObject val = IgniteBinary.builder("PERSON").addField().build();
> IgniteCache.put(key, val);
>
> This approach has two problems:
> 1) Type names are not unique between different tables. It means that if two
> tables with the same name are created in different schemas, we get a
> conflict.
> 2) Type names are bound to binary metadata, so if the user decides to do the
> following, he will receive an error about incompatible metadata:
> CREATE TABLE Person (id INT PRIMARY KEY);
> DROP TABLE Person;
> CREATE TABLE Person(id BIGINT PRIMARY KEY); // Fails because the old meta
> still has type "Integer".
>
> In order to avoid that I am going to add unique suffix or so (say, UUID) to
> type names. This way there will be no human-readable type names any more,
> but there will be no conflicts either. In future releases we will relax
> this somehow.
>
> Thoughts?
>
> Vladimir.
>


Re: Expose SqlQueryFields flags as SQL hints

2017-06-02 Thread Sergi Vladykin
To be clear, I don't mean adding some creative stuff under the Ignite mode;
instead, add generic hints to H2 (without Ignite) first, then plug the Ignite
hints in there so that they work and look consistent with the H2 hints.

I'm just trying to avoid the situation where we add our own hints under the
Ignite mode and then the H2 community adds their own hints in another format;
that would look freaky.

Sergi

2017-06-02 17:51 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:

> This was exactly what I meant. We need native H2 support here.
>
> On Fri, Jun 2, 2017 at 5:46 PM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > IMO the correct way is to implement generic hints for H2 first and plug
> > Ignite hints there.
> >
> > Sergi
> >
> > 2017-06-02 17:31 GMT+03:00 Alexey Kuznetsov <akuznet...@apache.org>:
> >
> > > Hints discussion on H2 user group:
> > > https://groups.google.com/d/topic/h2-database/dHwbBitzXlY/discussion
> > >
> > > On Fri, Jun 2, 2017 at 9:23 PM, Vladimir Ozerov <voze...@gridgain.com>
> > > wrote:
> > >
> > > > Well, looks like H2 doesn't support hints at the moment, and there is
> > no
> > > > way to add some custom params to SELECT (it is possible for CREATE
> > TABLE
> > > > and CREATE SCHEMA) only. Anyway, this could be nice usability
> > improvement
> > > > for us.
> > > >
> > > > On Fri, Jun 2, 2017 at 5:13 PM, Alexey Kuznetsov <
> > akuznet...@apache.org>
> > > > wrote:
> > > >
> > > > > I like this, but could you show some examples?
> > > > >
> > > > > On Fri, Jun 2, 2017 at 9:08 PM, Vladimir Ozerov <
> > voze...@gridgain.com>
> > > > > wrote:
> > > > >
> > > > > > Folks,
> > > > > >
> > > > > > We have quite a few flags on SqlFields and SqlQueryFields classes
> > > which
> > > > > are
> > > > > > used for fine tuning. Probably even more flags will appear soon.
> > > AFAIK
> > > > > > special compatibility mode for Apache Ignite was added to H2
> parser
> > > > > > recently. I think we can expose all these flags as SQL hints in
> the
> > > > query
> > > > > > itself, rather then setting them programmatically (or through
> > > JDBC/ODBC
> > > > > > connection strings).
> > > > > >
> > > > > > Thoughts?
> > > > > >
> > > > > > Vladimir.
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Alexey Kuznetsov
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Alexey Kuznetsov
> > >
> >
>


Re: Expose SqlQueryFields flags as SQL hints

2017-06-02 Thread Sergi Vladykin
IMO the correct way is to implement generic hints for H2 first and then
plug the Ignite hints in there.

Sergi

2017-06-02 17:31 GMT+03:00 Alexey Kuznetsov :

> Hints discussion on H2 user group:
> https://groups.google.com/d/topic/h2-database/dHwbBitzXlY/discussion
>
> On Fri, Jun 2, 2017 at 9:23 PM, Vladimir Ozerov 
> wrote:
>
> > Well, looks like H2 doesn't support hints at the moment, and there is no
> > way to add some custom params to SELECT (it is possible for CREATE TABLE
> > and CREATE SCHEMA) only. Anyway, this could be nice usability improvement
> > for us.
> >
> > On Fri, Jun 2, 2017 at 5:13 PM, Alexey Kuznetsov 
> > wrote:
> >
> > > I like this, but could you show some examples?
> > >
> > > On Fri, Jun 2, 2017 at 9:08 PM, Vladimir Ozerov 
> > > wrote:
> > >
> > > > Folks,
> > > >
> > > > We have quite a few flags on SqlFields and SqlQueryFields classes
> which
> > > are
> > > > used for fine tuning. Probably even more flags will appear soon.
> AFAIK
> > > > special compatibility mode for Apache Ignite was added to H2 parser
> > > > recently. I think we can expose all these flags as SQL hints in the
> > query
> > > > itself, rather than setting them programmatically (or through
> JDBC/ODBC
> > > > connection strings).
> > > >
> > > > Thoughts?
> > > >
> > > > Vladimir.
> > > >
> > >
> > >
> > >
> > > --
> > > Alexey Kuznetsov
> > >
> >
>
>
>
> --
> Alexey Kuznetsov
>


Re: Expose SqlQueryFields flags as SQL hints

2017-06-02 Thread Sergi Vladykin
I'd prefer to avoid inventing any brand new SQL syntax.

Sergi

2017-06-02 17:23 GMT+03:00 Vladimir Ozerov :

> Well, it looks like H2 doesn't support hints at the moment, and there is no
> way to add custom params to SELECT (it is possible only for CREATE TABLE
> and CREATE SCHEMA). Anyway, this could be a nice usability improvement
> for us.
>
> On Fri, Jun 2, 2017 at 5:13 PM, Alexey Kuznetsov 
> wrote:
>
> > I like this, but could you show some examples?
> >
> > On Fri, Jun 2, 2017 at 9:08 PM, Vladimir Ozerov 
> > wrote:
> >
> > > Folks,
> > >
> > > We have quite a few flags on SqlFields and SqlQueryFields classes which
> > are
> > > used for fine tuning. Probably even more flags will appear soon. AFAIK
> > > special compatibility mode for Apache Ignite was added to H2 parser
> > > recently. I think we can expose all these flags as SQL hints in the
> query
> > > itself, rather than setting them programmatically (or through JDBC/ODBC
> > > connection strings).
> > >
> > > Thoughts?
> > >
> > > Vladimir.
> > >
> >
> >
> >
> > --
> > Alexey Kuznetsov
> >
>
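To make the idea concrete (answering the "could you show some examples?" question above): one hypothetical shape for such hints is an Oracle-style `/*+ ... */` comment right after SELECT, carrying flag names. The syntax, the flag names, and the parser below are all assumptions for illustration, not anything agreed in this thread:

```java
import java.util.*;
import java.util.regex.*;

public class HintSketch {
    // Extracts comma-separated flags from a leading /*+ ... */ comment.
    // The /*+ ... */ syntax is borrowed from Oracle-style hints and is
    // purely hypothetical here -- no such syntax was agreed on in the thread.
    static Set<String> parseHints(String sql) {
        Matcher m = Pattern.compile("^\\s*SELECT\\s*/\\*\\+([^*]*)\\*/",
            Pattern.CASE_INSENSITIVE).matcher(sql);
        Set<String> hints = new HashSet<>();
        if (m.find())
            for (String h : m.group(1).trim().split("\\s*,\\s*"))
                if (!h.isEmpty())
                    hints.add(h.toUpperCase());
        return hints;
    }

    public static void main(String[] args) {
        String sql = "SELECT /*+ DISTRIBUTED_JOINS, ENFORCE_JOIN_ORDER */ * FROM Person";
        System.out.println(parseHints(sql)); // both hints recognized (set order unspecified)
    }
}
```

A real implementation would map each recognized hint onto the corresponding query flag before handing the cleaned statement to H2.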


Re: nested SQL sub-queries with non-collocated joins

2017-06-01 Thread Sergi Vladykin
If you don't see an exception then it must be supported. This is the whole
point of this exception, right?

Sergi

2017-06-01 22:50 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:

> On Thu, Jun 1, 2017 at 12:32 PM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > I guess it must work the following way:
> >
> > If distributed joins are enabled we can try to prove that the subquery is
> > collocated, if we can't then try to rewrite it, if we can't, then throw
> an
> > exception.
> >
> > Still this can not be done 100% correct, probably we have to have some
> flag
> > which allows to disable this subquery rewriting.
> >
>
> Sergi, but how do you explain to users what is supported and what is not?
>


Re: nested SQL sub-queries with non-collocated joins

2017-06-01 Thread Sergi Vladykin
I guess it must work the following way:

If distributed joins are enabled we can try to prove that the subquery is
collocated, if we can't then try to rewrite it, if we can't, then throw an
exception.

Still, this cannot be done 100% correctly; probably we need a flag
that allows disabling this subquery rewriting.

Sergi

2017-06-01 21:33 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:

> Sergi,
>
> I am OK with any improvement here, but we need to be able to clearly state
> to a user what is supported and what is not. If we cannot clearly describe
> it, I would rather not support it at all and throw an exception.
>
> Is this going to be possible with your solution?
>
> D.
>
> On Thu, Jun 1, 2017 at 2:51 AM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > The approach you are suggesting will be very complex for current
> > implementation. Also most probably very inefficient.
> >
> > Actually I was thinking about another but similar approach: in many cases
> > we can rewrite a subquery in WHERE clause into JOIN subquery.
> >
> > Like the following:
> >
> > SELECT x.* FROM x WHERE x.a = (SELECT MAX(y.a) FROM y WHERE y.b = x.b)
> >
> >  ===>
> >
> > SELECT x.* FROM x, (SELECT MAX(y.a), y.b FROM y GROUP BY y.b) z WHERE
> x.b =
> > z.b
> >
> > There are still problems here:
> >
> > 1. We will not be able to rewrite all the queries.
> > 2. We should not rewrite queries like this by default because this will
> > have a noticeable performance penalty for correctly collocated
> subqueries.
> > Probably we will need some flag for that.
> >
> > Sergi
> >
> > 2017-05-31 21:26 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
> >
> > > Igniters (specifically Sergi),
> > >
> > > It has come to my attention today that nested sub-select statements,
> when
> > > used in combination with non-collocated joins do not work properly in
> > > Ignite.
> > >
> > > So a query like this, where A, B, and C are all stored in Partitioned
> > > caches and are **not** collocated at all, will not work.
> > >
> > >
> > > > *select * from A, B where a.id <http://a.id> = b.a_id and
> b.somefield
> > in
> > > > (select somefield from C where c.zipcode = ?)*
> > >
> > >
> > > The main reason it is not supported right now is because, in the
> absence
> > of
> > > collocation, such query may create N^N complexity and it was decided
> that
> > > it is best not supporting it at all.
> > >
> > > However, I am not sure why N^N complexity is required. Why not support
> it
> > > as follows?
> > >
> > >1. execute the nested subquery and store the result in a temporary
> > >Replicated table.
> > >2. execute the original query and use the temporary Replicated table
> > >instead of the sub-query.
> > >
> > > Sergi, given that you are the author of the code, can you provide some
> > > insight here?
> > >
> > > Thanks,
> > > D.
> > >
> >
>


Re: Geo spatial index

2017-06-01 Thread Sergi Vladykin
No, feel free to create one.

Sergi

2017-06-01 21:00 GMT+03:00 Denis Magda <dma...@apache.org>:

> Good catch. I found the ticket that’s aim is to integrate the full-text
> search indexes with the virtual page memory architecture.
> https://issues.apache.org/jira/browse/IGNITE-5371 <
> https://issues.apache.org/jira/browse/IGNITE-5371>
>
> Sergi, do we have a similar one for the geo-spatial?
>
> —
> Denis
>
> > On Jun 1, 2017, at 2:12 AM, Alexey Goncharuk <alexey.goncha...@gmail.com>
> wrote:
> >
> > Same thing is true for the full-text indexes which are currently stored in
> > Lucene.
> >
> > 2017-05-24 21:56 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
> >
> >> Sergi,
> >>
> >> While we are figuring this out, what happens to the GeoSpatial
> >> functionality in the mean time? Is it going to work at all? If not,
> should
> >> we throw some sort of exception?
> >>
> >> D.
> >>
> >> On Wed, May 24, 2017 at 1:44 AM, Sergi Vladykin <
> sergi.vlady...@gmail.com>
> >> wrote:
> >>
> >>> Though this may require some changes in BPlusTree. Let me think.
> >>>
> >>> Sergi
> >>>
> >>> 2017-05-24 8:58 GMT+03:00 Sergi Vladykin <sergi.vlady...@gmail.com>:
> >>>
> >>>> It must not be too hard to implement kd-tree over b+tree [1].
> Depending
> >>> on
> >>>> level we have to compare either X or Y coordinate.
> >>>>
> >>>> I think we will even have a performance boost for spatial indexes
> after
> >>>> this change.
> >>>>
> >>>> [1] https://en.wikipedia.org/wiki/K-d_tree
> >>>>
> >>>> Sergi
> >>>>
> >>>> 2017-05-23 18:59 GMT+03:00 Denis Magda <dma...@apache.org>:
> >>>>
> >>>>> +1
> >>>>>
> >>>>> This looks natural considering that we switched to the new memory
> >>>>> architecture. Sergi, how difficult is to support this?
> >>>>>
> >>>>> —
> >>>>> Denis
> >>>>>
> >>>>>> On May 23, 2017, at 4:25 AM, Sergi Vladykin <
> >> sergi.vlady...@gmail.com
> >>>>
> >>>>> wrote:
> >>>>>>
> >>>>>> Guys,
> >>>>>>
> >>>>>> Looks like we have to move our geospatial indexes to the new
> >> approach
> >>>>> with
> >>>>>> BPlusTree. Right now it stores data in Java heap. This is especially
> >>>>>> important because we are going to have a persistence layer donated
> >> by
> >>>>>> GridGain and obviously geo spatial indexes will not work with it at
> >>> all.
> >>>>>>
> >>>>>> Sergi
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
>
>


Re: nested SQL sub-queries with non-collocated joins

2017-06-01 Thread Sergi Vladykin
The approach you are suggesting would be very complex for the current
implementation, and most probably very inefficient.

Actually I was thinking about another but similar approach: in many cases
we can rewrite a subquery in WHERE clause into JOIN subquery.

Like the following:

SELECT x.* FROM x WHERE x.a = (SELECT MAX(y.a) FROM y WHERE y.b = x.b)

 ===>

SELECT x.* FROM x, (SELECT MAX(y.a), y.b FROM y GROUP BY y.b) z WHERE x.b =
z.b

There are still problems here:

1. We will not be able to rewrite all the queries.
2. We should not rewrite queries like this by default because this will
have a noticeable performance penalty for correctly collocated subqueries.
Probably we will need some flag for that.

Sergi

2017-05-31 21:26 GMT+03:00 Dmitriy Setrakyan :

> Igniters (specifically Sergi),
>
> It has come to my attention today that nested sub-select statements, when
> used in combination with non-collocated joins do not work properly in
> Ignite.
>
> So a query like this, where A, B, and C are all stored in Partitioned
> caches and are **not** collocated at all, will not work.
>
>
> > *select * from A, B where a.id  = b.a_id and b.somefield in
> > (select somefield from C where c.zipcode = ?)*
>
>
> The main reason it is not supported right now is because, in the absence of
> collocation, such query may create N^N complexity and it was decided that
> it is best not supporting it at all.
>
> However, I am not sure why N^N complexity is required. Why not support it
> as follows?
>
>1. execute the nested subquery and store the result in a temporary
>Replicated table.
>2. execute the original query and use the temporary Replicated table
>instead of the sub-query.
>
> Sergi, given that you are the author of the code, can you provide some
> insight here?
>
> Thanks,
> D.
>
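The rewrite above can be sanity-checked in memory. The sketch below evaluates both plans over plain Java lists, the correlated form and the GROUP BY join form (with the join condition on the aggregated value spelled out explicitly), and shows they select the same rows. The `Row` shape and the data are illustrative only:

```java
import java.util.*;
import java.util.stream.*;

public class SubqueryRewrite {
    record Row(int a, int b) {}

    // Original plan: SELECT x.* FROM x WHERE x.a = (SELECT MAX(y.a) FROM y WHERE y.b = x.b)
    static List<Row> correlated(List<Row> x, List<Row> y) {
        return x.stream().filter(r ->
            y.stream().filter(s -> s.b() == r.b())
                .mapToInt(Row::a).max().orElse(Integer.MIN_VALUE) == r.a()
        ).collect(Collectors.toList());
    }

    // Rewritten plan: join x with z = (SELECT MAX(y.a) m, y.b FROM y GROUP BY y.b)
    // ON x.b = z.b AND x.a = z.m (the condition on the aggregate made explicit).
    static List<Row> rewritten(List<Row> x, List<Row> y) {
        Map<Integer, Integer> z = y.stream().collect(
            Collectors.toMap(Row::b, Row::a, Math::max)); // GROUP BY y.b, MAX(y.a)
        return x.stream()
            .filter(r -> z.containsKey(r.b()) && z.get(r.b()) == r.a())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Row> x = List.of(new Row(5, 1), new Row(3, 1), new Row(7, 2));
        List<Row> y = List.of(new Row(5, 1), new Row(2, 1), new Row(6, 2));
        System.out.println(correlated(x, y).equals(rewritten(x, y))); // true
    }
}
```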


Re: Summary of SQL changes in 2.1

2017-06-01 Thread Sergi Vladykin
I think it makes sense to reserve IGNITE schema for future use as well.

2017-06-01 0:26 GMT+03:00 Dmitriy Setrakyan :

> Vladmir,
>
> Thanks for the detailed email. My comments are inline...
>
> On Wed, May 31, 2017 at 11:21 AM, Vladimir Ozerov 
> wrote:
>
> > Folks,
> >
> > Let me summarize all recent changes to our SQL engine which are important
> > from user perspective. Please think of them and let me know if you have
> any
> > objection and thoughts on how to improve them.
> >
> > 1) Default "PUBLIC" schema added. It always exists and cannot be dropped.
> > Many caches can reside in this schema as opposed to earlier versions,
> where
> > every cache must be in separate schema.
> >
>
> Nice!
>
>
> > 2) Caches are still created in separate schemas by default. We should not
> > change this behavior, because it could break SQL queries of virtually all
> > users.
> >
>
> We should document, however, that this behavior will change in 3.0. Also,
> users should be able to specify that they wish to connect to the PUBLIC
> schema explicitly.
>
>
> > 3) "CREATE TABLE" creates a cache with special internal property
> > "sql=true". Such cache cannot be destroyed through "Ignite.destroyCache".
> > It can only be dropped through "DROP TABLE". The opposite also holds:
> > static and dynamic caches cannot be dropped through "DROP TABLE".
> >
>
> Agree.
>
>
> >
> > 4) "CREATE INDEX" and "DROP INDEX" can only be executed on "sql" caches.
> >
>
> Ouch! Many of current Ignite users wish to have this functionality enabled
> for API-based caches. Any chance to lift this limitation?
>
>
> >
> > 5) There will be two predefined templates for "CREATE CACHE" command -
> > "REPLICATED" and "PARTITIONED". They are always available on any node.
> >
> > 6) Additional parameters which could be passed to "CREATE TABLE":
> > 6.1) "cacheTemplate" - name of cache template
> > 6.2) "backups" - number of backups
> > 6.3) "atomicityMode" - either "TRANSACTIONAL" or "ATOMIC"
> > 6.4) "AFFINITY KEY" - if key field should be used for affinity.
> >
>
> What are the defaults?
>
>
> >
> > Example:
> > CREATE TABLE Employee (
> > pk_id BIGINT PRIMARY KEY,
> > name VARCHAR(255),
> > org_id BIGINT AFFINITY KEY
> > ) WITH "cacheTemplate=PARTITIONED, backups=1,
> atomicityMode=TRANSACTIONAL"
> >
> > 7) Connetion string of new JDBC driver starts with "jdbc:ignite:thin://",
> > and has only [host] as mandatory parameter.
> >
> > Example: "jdbc:ignite:thin://my_machine"
> >
>
> Why not have "thin" driver by default? Will users even notice?
>
>
> >
> > 8) New bean "SqlConfiguration" will be added to "IgniteConfiguration":
> >
> > class SqlConfiguration {
> > SqlListenerConfiguration listenerCfg; // Content of this class will
> be
> > copied from OdbcConfiguration;
> > long longQryWarnTimeout; // Moved from CacheConfiguration
> >
> > // Will hold more common SQL stuff such as metrics frequency,
> > predefined schemas, etc. in future.
> > }
> >
> > class SqlListenerConfiguration {
> > String host; // Optional, bind to all interfaces if ommitted;
> > int port; // Port
> > // Other stuff copied from OdbcConfiguration
> > }
> >
> > Example of configuration with explicitly enabled listener:
> > new IgniteConfiguration().setSqlConfiguration(new
> > SqlConfiguration().setListenerConfiuration(new
> > SqlListenerConfiguration()));
> >
>
> Seems that there is one-to-one dependency between SqlConfiguration and
> SqlListenerConfiguration. This looks a bit dirty. Why not just have
> SqlConfiguration with all the properties?
>
>
> >
> > 9) SQL listener *will not be enabled by default* as it consumes resources
> > and will normally be required only on a small set of nodes.
> >
>
> Again, seems to be very odd. I would like SqlConfiguration to be enabled by
> default, given that many users will now connect to Ignite using the JDBC or
> ODBC drivers.
>
>
> >
> > 10) OdbcConfiguration will be deprecated in favor of
> > SqlListenerConfiguration.
> >
>
> Again, let's just have one SqlConfiguration interface. I am OK with
> deprecating the OdbcConfiguration, assuming that it will still work.
>
>
>
> >
> > Please share your thoughts.
> >
>


Re: AffinityKeyMapper

2017-05-29 Thread Sergi Vladykin
Done.

Sergi

2017-05-26 21:26 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:

> On Fri, May 26, 2017 at 8:48 AM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > Guys,
> >
> > As I see we did not drop AffinityKeyMapper for 2.0.
> >
> > May be lets at least deprecate it?
> >
>
> I think we must. Any objections?
>


Re: Default SQL schema name

2017-05-29 Thread Sergi Vladykin
PUBLIC is already the default schema in H2; you cannot even drop it. Oracle
has a PUBLIC schema as well.

Sergi

2017-05-29 16:54 GMT+03:00 Vladimir Ozerov :

> Folks,
>
> I am going to introduce a predefined SQL schema which is always accessible on
> all Ignite nodes [1]. Now I am thinking about how to name it. Ideas are welcome.
>
> My 50 cents:
> 1) "public" - Postgres uses this name
> 2) "mydb" - MySQL uses this name
> 3) "ignite" - to be aligned with our product name
> 4) "default" - not the way to go, since "DEFAULT" is reserved SQL keyword.
>
> Personally I prefer "public".
>
> Any other thoughts?
>
> Vladimir.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-5320
>


AffinityKeyMapper

2017-05-26 Thread Sergi Vladykin
Guys,

As I see, we did not drop AffinityKeyMapper for 2.0.

Maybe let's at least deprecate it?

Sergi


H2PkHashIndex

2017-05-25 Thread Sergi Vladykin
Guys,

Can someone explain why this strange thingy exists if it does not replace
the usual PK tree index?

Sergi


Re: Geo spatial index

2017-05-24 Thread Sergi Vladykin
Though this may require some changes in BPlusTree. Let me think.

Sergi

2017-05-24 8:58 GMT+03:00 Sergi Vladykin <sergi.vlady...@gmail.com>:

> It must not be too hard to implement kd-tree over b+tree [1]. Depending on
> level we have to compare either X or Y coordinate.
>
> I think we will even have a performance boost for spatial indexes after
> this change.
>
> [1] https://en.wikipedia.org/wiki/K-d_tree
>
> Sergi
>
> 2017-05-23 18:59 GMT+03:00 Denis Magda <dma...@apache.org>:
>
>> +1
>>
>> This looks natural considering that we switched to the new memory
>> architecture. Sergi, how difficult is to support this?
>>
>> —
>> Denis
>>
>> > On May 23, 2017, at 4:25 AM, Sergi Vladykin <sergi.vlady...@gmail.com>
>> wrote:
>> >
>> > Guys,
>> >
>> > Looks like we have to move our geospatial indexes to the new approach
>> with
>> > BPlusTree. Right now it stores data in Java heap. This is especially
>> > important because we are going to have a persistence layer donated by
>> > GridGain and obviously geo spatial indexes will not work with it at all.
>> >
>> > Sergi
>>
>>
>


Re: Geo spatial index

2017-05-23 Thread Sergi Vladykin
It must not be too hard to implement a kd-tree over a b+tree [1]. Depending on
the level, we have to compare either the X or the Y coordinate.

I think we will even have a performance boost for spatial indexes after
this change.

[1] https://en.wikipedia.org/wiki/K-d_tree

Sergi

2017-05-23 18:59 GMT+03:00 Denis Magda <dma...@apache.org>:

> +1
>
> This looks natural considering that we switched to the new memory
> architecture. Sergi, how difficult is to support this?
>
> —
> Denis
>
> > On May 23, 2017, at 4:25 AM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
> >
> > Guys,
> >
> > Looks like we have to move our geospatial indexes to the new approach
> with
> > BPlusTree. Right now it stores data in Java heap. This is especially
> > important because we are going to have a persistence layer donated by
> > GridGain and obviously geo spatial indexes will not work with it at all.
> >
> > Sergi
>
>
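The "compare either X or Y depending on level" idea from this thread can be sketched as a plain binary kd-tree; embedding the same alternating comparison into Ignite's BPlusTree is the actual proposal and is not modeled here. All names below are illustrative:

```java
import java.util.*;

public class KdSketch {
    // A tiny 2-d kd-tree: at even depths we split on X, at odd depths on Y.
    static final class Node {
        final int x, y;
        Node left, right;
        Node(int x, int y) { this.x = x; this.y = y; }
    }

    Node root;

    void insert(int x, int y) { root = insert(root, x, y, 0); }

    private Node insert(Node n, int x, int y, int depth) {
        if (n == null) return new Node(x, y);
        // Alternate the discriminating coordinate by depth; equal keys go right.
        int cmp = (depth % 2 == 0) ? Integer.compare(x, n.x) : Integer.compare(y, n.y);
        if (cmp < 0) n.left = insert(n.left, x, y, depth + 1);
        else n.right = insert(n.right, x, y, depth + 1);
        return n;
    }

    // Collect all points inside the rectangle [x1..x2] x [y1..y2].
    List<int[]> range(int x1, int y1, int x2, int y2) {
        List<int[]> out = new ArrayList<>();
        range(root, x1, y1, x2, y2, 0, out);
        return out;
    }

    private void range(Node n, int x1, int y1, int x2, int y2, int depth, List<int[]> out) {
        if (n == null) return;
        if (n.x >= x1 && n.x <= x2 && n.y >= y1 && n.y <= y2)
            out.add(new int[]{n.x, n.y});
        int lo = (depth % 2 == 0) ? x1 : y1;
        int hi = (depth % 2 == 0) ? x2 : y2;
        int k  = (depth % 2 == 0) ? n.x : n.y;
        if (lo < k)  range(n.left,  x1, y1, x2, y2, depth + 1, out); // left holds keys < k
        if (hi >= k) range(n.right, x1, y1, x2, y2, depth + 1, out); // right holds keys >= k
    }

    public static void main(String[] args) {
        KdSketch t = new KdSketch();
        t.insert(2, 3); t.insert(5, 4); t.insert(9, 6); t.insert(4, 7);
        System.out.println(t.range(1, 1, 5, 5).size()); // 2: points (2,3) and (5,4)
    }
}
```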


Geo spatial index

2017-05-23 Thread Sergi Vladykin
Guys,

Looks like we have to move our geospatial indexes to the new approach with
BPlusTree. Right now it stores data in Java heap. This is especially
important because we are going to have a persistence layer donated by
GridGain and obviously geo spatial indexes will not work with it at all.

Sergi


Re: SqlFields query result does not expose fields metadata.

2017-05-23 Thread Sergi Vladykin
Done.

Sergi

2017-05-23 13:48 GMT+03:00 Andrey Mashenkov <andrey.mashen...@gmail.com>:

> IGNITE-5252 [1] is ready for review.
> Sergi, would you please take a look at it?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-5252
>
> On Sat, May 20, 2017 at 7:07 PM, Andrey Mashenkov <
> andrey.mashen...@gmail.com> wrote:
>
> > Dmitry,
> >
> > Here is a link ot ticket [1]
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-5252
> >
> > On Sat, May 20, 2017 at 1:00 AM, Dmitriy Setrakyan <
> dsetrak...@apache.org>
> > wrote:
> >
> >> I cannot find a ticket for it. Has it been filed?
> >>
> >> On Fri, May 19, 2017 at 12:38 AM, Vladimir Ozerov <voze...@gridgain.com
> >
> >> wrote:
> >>
> >> > Ah, got it. Then I am ok with the change as well.
> >> >
> >> > On Fri, May 19, 2017 at 9:24 AM, Sergi Vladykin <
> >> sergi.vlady...@gmail.com>
> >> > wrote:
> >> >
> >> > > Nope, the proposal was to have a FieldsQueryCursor interface with
> >> > > getFieldName(int column) method, may be + some other methods we will
> >> add
> >> > > later. This does not require any complex code modifications or
> >> exposing
> >> > > internal APIs.
> >> > >
> >> > > I'm not against new SQL API, it is a good idea, but it should not
> >> prevent
> >> > > us from making easy fixes in existing API when we need it.
> >> > >
> >> > > Sergi
> >> > >
> >> > > 2017-05-18 23:20 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
> >> > >
> >> > > > Proposal is about returning GridQueryFieldMetadata from
> QueryCursor,
> >> > > which
> >> > > > is internal interface. This interface is counterintuitive and is
> not
> >> > > ready
> >> > > > to be exposed to users. For example, it has method "typeName"
> which
> >> > > > actually returns table name. And has method "fieldTypeName" which
> >> > returns
> >> > > > something like "java.lang.Object". Add "type name" concept from
> our
> >> > > > BinaryConfiguration/QueryEntity, which have different semantics,
> >> and
> >> > you
> >> > > > end up with totally confused users on what "type name" means in
> >> Ignite.
> >> > > >
> >> > > > Let's do not expose strange things to users, and accurately create
> >> new
> >> > > > clean SQL API instead. There is no strong demand for this feature.
> >> > > >
> >> > > > On Thu, May 18, 2017 at 7:39 PM, Sergi Vladykin <
> >> > > sergi.vlady...@gmail.com>
> >> > > > wrote:
> >> > > >
> >> > > > > It should not require any internals movement, it must be an easy
> >> fix.
> >> > > > >
> >> > > > > Sergi
> >> > > > >
> >> > > > > 2017-05-18 15:36 GMT+03:00 Vladimir Ozerov <
> voze...@gridgain.com
> >> >:
> >> > > > >
> >> > > > > > With all the changes to internals we made, new API can be
> >> created
> >> > > very
> >> > > > > > quickly somewhere around AI 2.2 or AI 2.3. Currently the whole
> >> API
> >> > is
> >> > > > > > located in the wrong place, as it is bounded to cache. So the
> >> more
> >> > we
> >> > > > add
> >> > > > > > now, the more we will deprecate in several months. Remember,
> >> that
> >> > > this
> >> > > > > > feature will require not only new interface, but moving
> existing
> >> > > > > *internal*
> >> > > > > > metadata classes to public space. These classes were never
> >> designed
> >> > > to
> >> > > > be
> >> > > > > > exposed to users in the first place.
> >> > > > > >
> >> > > > > > This is why I am strongly against this change at the moment.
> No
> >> > need
> >> > > to
> >> > > > > > make already outdated and complex API even more complex
> without
> >> > > strong
> >> > >
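The easy fix proposed above, a FieldsQueryCursor with a getFieldName(int column) method, could look roughly like the sketch below. Only getFieldName comes from the proposal; getColumnsCount and the in-memory implementation are assumptions for illustration:

```java
import java.util.*;

public class CursorSketch {
    // Sketch of the proposed interface: the existing cursor plus column
    // metadata. getFieldName(int) follows the thread's proposal;
    // getColumnsCount() is an assumed convenience method.
    interface FieldsQueryCursor<T> extends Iterable<T> {
        String getFieldName(int idx);
        int getColumnsCount();
    }

    // Minimal in-memory implementation over precomputed rows.
    static FieldsQueryCursor<List<?>> cursor(List<String> fields, List<List<?>> rows) {
        return new FieldsQueryCursor<>() {
            @Override public Iterator<List<?>> iterator() { return rows.iterator(); }
            @Override public String getFieldName(int idx) { return fields.get(idx); }
            @Override public int getColumnsCount() { return fields.size(); }
        };
    }

    public static void main(String[] args) {
        FieldsQueryCursor<List<?>> cur =
            cursor(List.of("NAME", "AGE"), List.of(List.of("Ann", 32), List.of("Bob", 41)));
        // A caller can now resolve columns by name instead of by position.
        Map<String, Integer> idx = new HashMap<>();
        for (int i = 0; i < cur.getColumnsCount(); i++)
            idx.put(cur.getFieldName(i), i);
        for (List<?> row : cur)
            System.out.println(row.get(idx.get("NAME"))); // prints Ann then Bob
    }
}
```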

Re: Inefficient approach to executing remote SQL queries

2017-05-23 Thread Sergi Vladykin
Michael,

I see your point. I think it must not be too hard to start asynchronously
establishing connections to all the needed nodes.

I've created respective issue in Jira:
https://issues.apache.org/jira/browse/IGNITE-5277

Sergi

2017-05-23 11:56 GMT+03:00 Michael Griggs :

> Hi Val
>
> This is precisely my point: it's only a minor optimization until the point
> when establishing each connection takes 3-4 seconds, and we establish 32 of
> them in sequence.  At that point it becomes a serious issue: the customer
> cannot run SQL queries from their development machines without them timing
> out once out of every two or three runs.  These kind of problems undermine
> confidence in Ignite.
>
> Mike
>
>
> -Original Message-
> From: Valentin Kulichenko [mailto:valentin.kuliche...@gmail.com]
> Sent: 22 May 2017 19:15
> To: dev@ignite.apache.org
> Subject: Re: Inefficient approach to executing remote SQL queries
>
> Hi Mike,
>
> Generally, establishing connections in parallel could make sense, but note
> that in most cases this would be a minor optimization, because:
>
>- Under load connections are established once and then reused. If you
>observe disconnections during application lifetime under load, then
>probably this should be addressed first.
>- Actual communication is asynchronous, we use NIO for this. If
>connection already exists, sendGeneric() basically just puts a message
> into
>a queue.
>
> -Val
>
> On Mon, May 22, 2017 at 7:04 PM, Michael Griggs <
> michael.gri...@gridgain.com
> > wrote:
>
> > Hi Igniters,
> >
> >
> >
> > Whilst diagnosing a problem with a slow query, I became aware of a
> > potential issue in the Ignite codebase.  When executing a SQL query
> > that is to run remotely, the IgniteH2Indexing#send() method is called,
> > with a Collection as one of its parameters.  This
> > collection is iterated sequentially, and ctx.io().sendGeneric() is
> > called synchronously for each node.  This is inefficient if
> >
> >
> >
> > a)   This is the first execution of a query, and thus TCP connections
> > have to be established
> >
> > b)  The cost of establishing a TCP connection is high
> >
> >
> >
> > And optionally
> >
> >
> >
> > c)   There are a large number of nodes in the cluster
> >
> >
> >
> > In my current situation, developers want to run test queries from
> > their code running locally, but connected via VPN to their UAT server
> > environment.
> > The
> > cost of opening a TCP connection is in the multiple seconds, as you
> > can see from this Ignite log file snippet:
> >
> > 2017-05-22 18:29:48,908 INFO [TcpCommunicationSpi] - Established
> > outgoing communication connection [locAddr=/7.1.14.242:56924,
> > rmtAddr=/10.132.80.3:47100]
> >
> > 2017-05-22 18:29:52,294 INFO [TcpCommunicationSpi] - Established
> > outgoing communication connection [locAddr=/7.1.14.242:56923,
> > rmtAddr=/10.132.80.30:47102]
> >
> > 2017-05-22 18:29:58,659 INFO [TcpCommunicationSpi] - Established
> > outgoing communication connection [locAddr=/7.1.14.242:56971,
> > rmtAddr=/10.132.80.23:47101]
> >
> > 2017-05-22 18:30:03,183 INFO [TcpCommunicationSpi] - Established
> > outgoing communication connection [locAddr=/7.1.14.242:56972,
> > rmtAddr=/10.132.80.21:47100]
> >
> > 2017-05-22 18:30:06,039 INFO [TcpCommunicationSpi] - Established
> > outgoing communication connection [locAddr=/7.1.14.242:56973,
> > rmtAddr=/10.132.80.21:47103]
> >
> > 2017-05-22 18:30:10,828 INFO [TcpCommunicationSpi] - Established
> > outgoing communication connection [locAddr=/7.1.14.242:57020,
> > rmtAddr=/10.132.80.20:47100]
> >
> > 2017-05-22 18:30:13,060 INFO [TcpCommunicationSpi] - Established
> > outgoing communication connection [locAddr=/7.1.14.242:57021,
> > rmtAddr=/10.132.80.29:47103]
> >
> > 2017-05-22 18:30:22,144 INFO [TcpCommunicationSpi] - Established
> > outgoing communication connection [locAddr=/7.1.14.242:57022,
> > rmtAddr=/10.132.80.22:47103]
> >
> > 2017-05-22 18:30:26,513 INFO [TcpCommunicationSpi] - Established
> > outgoing communication connection [locAddr=/7.1.14.242:57024,
> > rmtAddr=/10.132.80.20:47101]
> >
> > 2017-05-22 18:30:28,526 INFO [TcpCommunicationSpi] - Established
> > outgoing communication connection [locAddr=/7.1.14.242:57025,
> > rmtAddr=/10.132.80.30:47103]
> >
> >
> >
> > Comparing the same code that is executed inside of the UAT environment
> > (so not using the VPN):
> >
> > 2017-05-22 18:22:18,102 INFO [TcpCommunicationSpi] - Established
> > outgoing communication connection [locAddr=/10.175.11.38:53288,
> > rmtAddr=/10.175.11.58:47100]
> >
> > 2017-05-22 18:22:18,105 INFO [TcpCommunicationSpi] - Established
> > outgoing communication connection [locAddr=/10.175.11.38:45890,
> > rmtAddr=/10.175.11.54:47101]
> >
> > 2017-05-22 18:22:18,108 INFO [TcpCommunicationSpi] - Established
> > outgoing communication connection [locAddr=/127.0.0.1:47582,
> > rmtAddr=/127.0.0.1:47100]
> >
> > 2017-05-22 18:22:18,111 INFO 

[jira] [Created] (IGNITE-5277) Asynchronously establish connection to all the needed nodes in IgniteH2Indexing.send()

2017-05-23 Thread Sergi Vladykin (JIRA)
Sergi Vladykin created IGNITE-5277:
--

 Summary: Asynchronously establish connection to all the needed 
nodes in IgniteH2Indexing.send()
 Key: IGNITE-5277
 URL: https://issues.apache.org/jira/browse/IGNITE-5277
 Project: Ignite
  Issue Type: Improvement
Reporter: Sergi Vladykin


it's only a minor optimization until the point when establishing each 
connection takes 3-4 seconds, and we establish 32 of them in sequence.  At that 
point it becomes a serious issue: the customer cannot run SQL queries from 
their development machines



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
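A minimal sketch of the improvement described in this ticket: start all connection attempts concurrently and then wait for the whole set, so total latency is roughly one connection setup instead of 32 in sequence. The thread pool, node names, and connect() body are placeholders; the real code would invoke ctx.io().sendGeneric() per node, which this sketch does not call.

```java
import java.util.*;
import java.util.concurrent.*;

public class ParallelConnect {
    // Placeholder for per-node connection setup; the sleep stands in for
    // TCP handshake latency (seconds over a VPN in the reported case).
    static String connect(String node) {
        try { Thread.sleep(50); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return "connected:" + node;
    }

    // Kick off all connections at once and wait for all of them, instead of
    // paying the per-node latency once per node in sequence.
    static List<String> connectAll(List<String> nodes) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(nodes.size());
        try {
            List<CompletableFuture<String>> futs = new ArrayList<>();
            for (String n : nodes)
                futs.add(CompletableFuture.supplyAsync(() -> connect(n), pool));
            CompletableFuture.allOf(futs.toArray(new CompletableFuture[0])).join();
            List<String> res = new ArrayList<>();
            for (CompletableFuture<String> f : futs)
                res.add(f.get());
            return res;
        }
        finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(connectAll(List.of("n1", "n2", "n3")));
        // [connected:n1, connected:n2, connected:n3]
    }
}
```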


Re: [VOTE] Accept Contribution of Ignite Persistent Store

2017-05-23 Thread Sergi Vladykin
+1

Sergi

2017-05-23 10:20 GMT+03:00 Valentin Kulichenko <
valentin.kuliche...@gmail.com>:

> +1
>
> On Tue, May 23, 2017 at 8:42 AM, Semyon Boikov 
> wrote:
>
> > +1
> >
> > On Tue, May 23, 2017 at 12:55 AM, Denis Magda  wrote:
> >
> > > Igniters,
> > >
> > > This branch (https://github.com/apache/ignite/tree/ignite-5267) adds a
> > > distributed and transactional Persistent Store to Apache Ignite
> project.
> > > The store seamlessly integrates with Apache Ignite 2.0 page memory
> > > architecture. One of the main advantages of the store is that Apache
> > Ignite
> > > becomes fully operational from disk (SSD or Flash) without any need to
> > > preload the data in memory. Plus, with full SQL support already
> available
> > > in Apache Ignite, this feature will allow Apache Ignite serve as a
> > > distributed SQL database, both in memory or on disk, while continuing
> to
> > > support all the existing functionality on the current API.
> > > More information here:
> > > - Persistent Store Overview: https://cwiki.apache.org/
> > > confluence/display/IGNITE/Persistent+Store+Overview
> > > - Persistent Store Internal Design: https://cwiki.apache.org/
> > > confluence/display/IGNITE/Persistent+Store+Internal+Design
> > > The Persistent Store was developed by GridGain outside of Apache
> > community
> > > because it was requested by one of GridGain’s customers. Presently,
> > > GridGain looks forward to donating the Persistent Store to ASF and
> given
> > > the size of the contribution, it is prudent to follow Apache's IP
> > clearance
> > > process.
> > > The SGA has been submitted and acknowledged by ASF Secretary. The IP
> > > clearance form can be found here: http://incubator.apache.org/
> > > ip-clearance/persistent-distributed-store-ignite.html
> > > This vote is to discover if the Apache Ignite PMC and community are in
> > > favour of accepting this contribution.
> > > This vote will be open for at least 72 hours:
> > > [ ] +1, accept contribution of the Persistent Store into the project
> > > [ ] 0, no opinion
> > > [ ] -1, reject contribution because...
> > >
> > > Regards,
> > > Denis
> > >
> > >
> >
>


Re: SqlFields query result does not expose fields metadata.

2017-05-19 Thread Sergi Vladykin
Nope, the proposal was to have a FieldsQueryCursor interface with a
getFieldName(int column) method, maybe plus some other methods we will add
later. This does not require any complex code modifications or exposing
internal APIs.

I'm not against new SQL API, it is a good idea, but it should not prevent
us from making easy fixes in existing API when we need it.

Sergi

2017-05-18 23:20 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:

> Proposal is about returning GridQueryFieldMetadata from QueryCursor, which
> is internal interface. This interface is counterintuitive and is not ready
> to be exposed to users. For example, it has method "typeName" which
> actually returns table name. And has method "fieldTypeName" which returns
> something like "java.lang.Object". Add "type name" concept from our
> BinaryConfiguration/QueryEntity, which have different semantics, and you
> end up with totally confused users on what "type name" means in Ignite.
>
> Let's do not expose strange things to users, and accurately create new
> clean SQL API instead. There is no strong demand for this feature.
>
> On Thu, May 18, 2017 at 7:39 PM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > It should not require any internals movement, it must be an easy fix.
> >
> > Sergi
> >
> > 2017-05-18 15:36 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
> >
> > > With all the changes to internals we made, new API can be created very
> > > quickly somewhere around AI 2.2 or AI 2.3. Currently the whole API is
> > > located in the wrong place, as it is bounded to cache. So the more we
> add
> > > now, the more we will deprecate in several months. Remember, that this
> > > feature will require not only new interface, but moving existing
> > *internal*
> > > metadata classes to public space. These classes were never designed to
> be
> > > exposed to users in the first place.
> > >
> > > This is why I am strongly against this change at the moment. No need to
> > > make already outdated and complex API even more complex without strong
> > > demand from users.
> > >
> > > On Thu, May 18, 2017 at 3:29 PM, Pavel Tupitsyn <ptupit...@apache.org>
> > > wrote:
> > >
> > > > I agree that this change makes sense.
> > > > With complex queries it may be non-trivial to get the right column by
> > > index
> > > > from results.
> > > > With metadata user no longer needs to care about result column order,
> > and
> > > > refactorings are easier.
> > > >
> > > > Pavel
> > > >
> > > > On Thu, May 18, 2017 at 2:36 PM, Sergi Vladykin <
> > > sergi.vlady...@gmail.com>
> > > > wrote:
> > > >
> > > > > I believe we will not see this new SQL API soon. It is not even in
> > > design
> > > > > stage.
> > > > >
> > > > > The change proposed by Andrey is very simple and our users will
> > benefit
> > > > > from it right away.
> > > > >
> > > > > I see no reasons to disallow this change.
> > > > >
> > > > > Sergi
> > > > >
> > > > > 2017-05-18 12:35 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
> > > > >
> > > > > > Result set metadata is exposed to JDBC and ODBC drivers because
> it
> > is
> > > > > > required by JDBC specification and lots of external applications
> use
> > > it.
> > > > I
> > > > > do
> > > > > > not see big demand for this feature in native SQL, where user
> > > normally
> > > > > > knows the model. Another point is that with changes introduced in
> > > > recent
> > > > > > versions (DML, DDL, shared schemas), we need brand new native SQL
> > > API,
> > > > as
> > > > > > current IgniteCache.query() cannot conveniently reflect current
> and
> > > > > planned
> > > > > > Ignite capabilities.
> > > > > >
> > > > > > For this reason I do not think we should do proposed change.
> > Instead,
> > > > we
> > > > > > should add metadata retrieval to new SQL API.
> > > > > >
> > > > > > Vladimir.
> > > > > >
> > > > > > On Thu, May 18, 2017 at 12:19 PM, Andrey Mashenkov <
> > > > > > andrey.mashen...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Igniters,
> > > > > > >
> > > > > > > When user run Sql query via JDBC, he can get fields metadata
> > (field
> > > > > > names,
> > > > > > > its types and etc.) from ResultSet.
> > > > > > > With IgniteCache.query method he gets some QueryCursor
> > > > implementation,
> > > > > > but
> > > > > > > QueryCursor interface doesn't have any methods for this.
> > > > > > >
> > > > > > > For now, the only way to get metadata is try to cast result to
> > > > internal
> > > > > > > QueryCursorImpl class.
> > > > > > >
> > > > > > > I think it should break nothing if we overload
> > > > > > > IgniteCache.query(SqlFieldsQuery q) return type to a new
> > > > > > FieldsQueryCursor
> > > > > > > interface.
> > > > > > > FieldsQueryCursor will inherit from QueryCursor and provide
> > > > > > additional
> > > > > > > methods.
> > > > > > >
> > > > > > > Thoughts?
> > > > > > >
> > > > > > > --
> > > > > > > Best regards,
> > > > > > > Andrey V. Mashenkov
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: SqlFields query result does not expose fields metadata.

2017-05-18 Thread Sergi Vladykin
It should not require moving any internals; it must be an easy fix.

Sergi

2017-05-18 15:36 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:

> With all the changes to internals we made, new API can be created very
> quickly somewhere around AI 2.2 or AI 2.3. Currently the whole API is
> located in the wrong place, as it is bounded to cache. So the more we add
> now, the more we will deprecate in several months. Remember, that this
> feature will require not only new interface, but moving existing *internal*
> metadata classes to public space. These classes were never designed to be
> exposed to users in the first place.
>
> This is why I am strongly against this change at the moment. No need to
> make already outdated and complex API even more complex without strong
> demand from users.
>
> On Thu, May 18, 2017 at 3:29 PM, Pavel Tupitsyn <ptupit...@apache.org>
> wrote:
>
> > I agree that this change makes sense.
> > With complex queries it may be non-trivial to get the right column by
> index
> > from results.
> > With metadata user no longer needs to care about result column order, and
> > refactorings are easier.
> >
> > Pavel
> >
> > On Thu, May 18, 2017 at 2:36 PM, Sergi Vladykin <
> sergi.vlady...@gmail.com>
> > wrote:
> >
> > > I believe we will not see this new SQL API soon. It is not even in
> design
> > > stage.
> > >
> > > The change proposed by Andrey is very simple and our users will benefit
> > > from it right away.
> > >
> > > I see no reasons to disallow this change.
> > >
> > > Sergi
> > >
> > > 2017-05-18 12:35 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
> > >
> > > > Result set metadata is exposed to JDBC and ODBC drivers because it is
> > > > required by JDBC specification and lots of external applications use
> it.
> > I
> > > do
> > > > not see big demand for this feature in native SQL, where user
> normally
> > > > knows the model. Another point is that with changes introduced in
> > recent
> > > > versions (DML, DDL, shared schemas), we need brand new native SQL
> API,
> > as
> > > > current IgniteCache.query() cannot conveniently reflect current and
> > > planned
> > > > Ignite capabilities.
> > > >
> > > > For this reason I do not think we should do proposed change. Instead,
> > we
> > > > should add metadata retrieval to new SQL API.
> > > >
> > > > Vladimir.
> > > >
> > > > On Thu, May 18, 2017 at 12:19 PM, Andrey Mashenkov <
> > > > andrey.mashen...@gmail.com> wrote:
> > > >
> > > > > Hi Igniters,
> > > > >
> > > > > When user run Sql query via JDBC, he can get fields metadata (field
> > > > names,
> > > > > its types and etc.) from ResultSet.
> > > > > With IgniteCache.query method he gets some QueryCursor
> > implementation,
> > > > but
> > > > > QueryCursor interface doesn't have any methods for this.
> > > > >
> > > > > For now, the only way to get metadata is try to cast result to
> > internal
> > > > > QueryCursorImpl class.
> > > > >
> > > > > I think it should break nothing if we overload
> > > > > IgniteCache.query(SqlFieldsQuery q) return type to a new
> > > > FieldsQueryCursor
> > > > > interface.
> > > > > FieldsQueryCursor will inherit from QueryCursor and provide
> > > > additional
> > > > > methods.
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > --
> > > > > Best regards,
> > > > > Andrey V. Mashenkov
> > > > >
> > > >
> > >
> >
>


Re: SqlFields query result does not expose fields metadata.

2017-05-18 Thread Sergi Vladykin
I believe we will not see this new SQL API soon. It is not even in the
design stage.

The change proposed by Andrey is very simple and our users will benefit
from it right away.

I see no reasons to disallow this change.

Sergi

2017-05-18 12:35 GMT+03:00 Vladimir Ozerov :

> Result set metadata is exposed to JDBC and ODBC drivers because it is
> required by JDBC specification and lots of external applications use it. I do
> not see big demand for this feature in native SQL, where user normally
> knows the model. Another point is that with changes introduced in recent
> versions (DML, DDL, shared schemas), we need brand new native SQL API, as
> current IgniteCache.query() cannot conveniently reflect current and planned
> Ignite capabilities.
>
> For this reason I do not think we should do proposed change. Instead, we
> should add metadata retrieval to new SQL API.
>
> Vladimir.
>
> On Thu, May 18, 2017 at 12:19 PM, Andrey Mashenkov <
> andrey.mashen...@gmail.com> wrote:
>
> > Hi Igniters,
> >
> > When a user runs an SQL query via JDBC, he can get fields metadata (field
> names,
> > their types, etc.) from the ResultSet.
> > With IgniteCache.query method he gets some QueryCursor implementation,
> but
> > QueryCursor interface doesn't have any methods for this.
> >
> > For now, the only way to get metadata is to try to cast the result to the
> > internal QueryCursorImpl class.
> >
> > I think it should break nothing if we overload
> > IgniteCache.query(SqlFieldsQuery q) return type to a new
> FieldsQueryCursor
> > interface.
> > FieldsQueryCursor will inherit from QueryCursor and provide
> additional
> > methods.
> >
> > Thoughts?
> >
> > --
> > Best regards,
> > Andrey V. Mashenkov
> >
>
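
To make the proposal concrete, here is a minimal self-contained sketch of the discussed API shape. `getFieldName` and `getColumnsCount` are illustrative names (the final Ignite API was not settled in this thread), and the in-memory cursor exists only to show how callers would consume the metadata:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Minimal stand-ins for the discussed interfaces: FieldsQueryCursor
// extends QueryCursor and adds result-set metadata accessors.
interface QueryCursor<T> extends Iterable<T>, AutoCloseable {
    List<T> getAll();
    @Override void close();
}

interface FieldsQueryCursor<T> extends QueryCursor<T> {
    String getFieldName(int idx); // name of the idx-th column in the result
    int getColumnsCount();        // number of columns per row
}

public class FieldsCursorSketch {
    // Trivial in-memory implementation; a real cursor would wrap query results.
    static FieldsQueryCursor<Object[]> cursor(List<String> fields, List<Object[]> rows) {
        return new FieldsQueryCursor<Object[]>() {
            @Override public List<Object[]> getAll() { return rows; }
            @Override public Iterator<Object[]> iterator() { return rows.iterator(); }
            @Override public void close() { /* no resources to release */ }
            @Override public String getFieldName(int idx) { return fields.get(idx); }
            @Override public int getColumnsCount() { return fields.size(); }
        };
    }

    public static void main(String[] args) {
        List<Object[]> rows = Arrays.asList(
            new Object[]{"Ann", 25}, new Object[]{"Bob", 31});
        FieldsQueryCursor<Object[]> cur = cursor(Arrays.asList("NAME", "AGE"), rows);

        // Callers no longer need to know the column order up front.
        System.out.println(cur.getColumnsCount()); // 2
        System.out.println(cur.getFieldName(0));   // NAME
    }
}
```

Because `FieldsQueryCursor` extends `QueryCursor`, returning it from `IgniteCache.query(SqlFieldsQuery)` is a covariant return-type change, which is why the thread argues it "should break nothing" for existing callers.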


Re: [VOTE] Apache Ignite 2.0.0 RC2

2017-05-02 Thread Sergi Vladykin
+1 binding

Sergi

2017-05-02 10:31 GMT+03:00 Alexey Kuznetsov :

> Download zip with sorces: OK
> Build: mvn clean package -DskipTests -Dmaven.javadoc.skip=true: OK
> Build Web console from sources: OK
> Build Web agent  from sources: OK
> Start node: OK
> Start Web Console locally: OK
> Start Web agent: OK
> Web agent connected to node and shown in Web Console: OK
>
> +1 (binding)
>
> On Tue, May 2, 2017 at 2:31 AM, Valentin Kulichenko <
> valentin.kuliche...@gmail.com> wrote:
>
> > +1
> >
> > On Mon, May 1, 2017 at 3:12 PM Vladimir Ozerov 
> > wrote:
> >
> > > +1
> > >
> > > 30 апр. 2017 г. 17:46 пользователь "Denis Magda" 
> > > написал:
> > >
> > > > Igniters!
> > > >
> > > > We have uploaded a 2.0.0 release candidate to
> > > > https://dist.apache.org/repos/dist/dev/ignite/2.0.0-rc2/ <
> > > > https://dist.apache.org/repos/dist/dev/ignite/2.0.0-rc2/>
> > > >
> > > > Git tag name is
> > > > 2.0.0-rc2
> > > >
> > > > This release includes the following changes:
> > > >
> > > > Ignite:
> > > > * Introduced new page memory architecture.
> > > > * Machine Learning beta: distributed algebra support for dense and
> > sparse
> > > > data sets.
> > > > * Reworked and simplified API for asynchronous operations.
> > > > * Custom thread pool executors for compute tasks.
> > > > * Removed CLOCK mode in ATOMIC cache.
> > > > * Deprecated schema-import utility in favor of Web Console.
> > > > * Integration with Spring Data.
> > > > * Integration with Hibernate 5.
> > > > * Integration with RocketMQ.
> > > > * Integration with ZeroMQ.
> > > > * SQL: CREATE INDEX and DROP INDEX commands.
> > > > * SQL: Ability to execute queries over specific set of partitions.
> > > > * SQL: Improved REPLICATED cache support.
> > > > * SQL: Updated H2 version to 1.4.195.
> > > > * SQL: Improved performance of MIN/MAX aggregate functions.
> > > > * ODBC: Added Time data type support.
> > > > * Massive performance improvements.
> > > >
> > > > Ignite.NET :
> > > > * Custom plugin API.
> > > > * Generic cache store.
> > > > * Binary types now can be registered dynamically.
> > > > * LINQ: join, "contains" and DateTime property support.
> > > >
> > > > Ignite CPP:
> > > > * Implemented Cache::Invoke.
> > > > * Added remote filters support to continuous queries.
> > > >
> > > > Ignite Web Console:
> > > > * Multi-cluster support.
> > > > * Possibility to configure Kubernetes IP finder.
> > > > * EnforceJoinOrder option on Queries screen.
> > > >
> > > > Complete list of closed issues:
> > > > https://issues.apache.org/jira/issues/?jql=project%20%3D%20IGNITE%20AND%20fixVersion%20%3D%202.0%20AND%20(status%20%3D%20closed%20or%20status%20%3D%20resolved)
> > > >
> > > > DEVNOTES
> > > > https://git-wip-us.apache.org/repos/asf?p=ignite.git;a=blob_plain;f=DEVNOTES.txt;hb=refs/tags/2.0.0-rc2
> > > >
> > > > RELEASENOTES
> > > > https://git-wip-us.apache.org/repos/asf?p=ignite.git;a=blob_plain;f=RELEASE_NOTES.txt;hb=refs/tags/2.0.0-rc2
> > > >
> > > > Please start voting.
> > > >
> > > > +1 - to accept Apache Ignite 2.0.0-rc2
> > > > 0 - don't care either way
> > > > -1 - DO NOT accept Apache Ignite 2.0.0-rc2 (explain why)
> > > >
> > > > This vote will go for 72 hours.
> > >
> >
> > --
> > Alexey Kuznetsov
> >
> >
>


Re: Null cache names

2017-04-25 Thread Sergi Vladykin
Agree, let's move forward with the simplest possible solution for now.

Sergi

2017-04-25 13:07 GMT+03:00 Vladimir Ozerov :

> Folks,
>
> I do not think it is legal to add such property to ConnectorConfiguration.
> Connector is a generic gateway to cluster resources. It should not bother
> about caches anyhow. What if there are multiple caches in the cluster? What
> is I want to access "cache A" from Memcached and "cache B" from Redis
> simultaneously? Etc.. This kind of property should be defined on client
> level, not on the server.
>
> For now, provided that 2.0 is about to be freezed, I propose to stick to
> Dmitriy's approach and use "default" cache name instead of null. This
> should work fine for AI 2.0. We will be able to improve it in further
> releases.
>
> Thoughts?
>
> Vladimir.
>
> On Tue, Apr 25, 2017 at 7:22 AM, Roman Shtykh 
> wrote:
>
> > Igor, +1 from me.We can add a field to ConnectorConfiguration (not sure
> if
> > it's a proper place, but it's shared by REST, memcached and Redis). A
> user
> > will have to create a cache, configure as needed and specify the name in
> > ConnectorConfiguration.
> > Roman
> >
> >
> >
> > On Monday, April 24, 2017 10:34 PM, Seliverstov Igor <
> > gvvinbl...@gmail.com> wrote:
> >
> >
> >  Dear Igniters,
> >
> > Seems we have almost the same issue with Memcached protocol.
> >
> > There is an ability to define a cache name via operation extras message
> > part (
> > https://github.com/memcached/memcached/wiki/
> BinaryProtocolRevamped#packet-
> > structure)
> > but it looks a bit complicated from my point of view...
> >
> > Different client implementations might provide such functionality or not
> (I
> > mean an additional info in an operation extras), so, potential user might
> > have some difficulties using Ignite as a Memcached server because this
> > Ignite-specific message part becomes mandatory.
> >
> > An alternative (and the best way, I think) is introducing a configuration
> > property to define which cache to use in case it hasn't been defined in a
> > message.
> >
> > I'll appreciate any thoughts on that.
> >
> > Regards,
> > Igor
> >
> > 2017-04-24 12:43 GMT+03:00 Roman Shtykh :
> >
> > > Vladimir,
> > > Probably we may set the cache name via https://redis.io/commands/
> > > client-setname, save it and use until the user specifies another name.
> > > #That will be the name for the default cache (mainly for STRING data).
> > For
> > > other data types, like hashes (https://redis.io/topics/data-types), I
> am
> > > thinking about putting data into caches specified by key.
> > > Or use https://redis.io/commands/config-set CONFIG SET DFLT_CACHE
> > > cache_name,and save cache name somewhere in Ignite cluster (what is the
> > > proper place to store such info?).
> > > For that, we have to implement one of the above-mentioned commands.
> > > What do you think?
> > >
> > >
> > >
> > >On Monday, April 24, 2017 4:34 PM, Vladimir Ozerov <
> > > voze...@gridgain.com> wrote:
> > >
> > >
> > >  Roman,
> > > Is it possible to define a kind of property in Redis connection string
> > (or
> > > property map) for this purpose? IMO ideally we should "externalize"
> cache
> > > name somehow, so that it can be changed with no changes to user's code.
> > >
> > > Alex,
> > > Not sure if this is a good idea as you will end up with PARTITIONED
> cache
> > > without backups with no ability to change that.
> > >
> > > On Mon, Apr 24, 2017 at 9:35 AM, Alexey Kuznetsov <
> akuznet...@apache.org
> > >
> > > wrote:
> > >
> > > > Roman,
> > > >
> > > > Just as idea, how about in case of user does not configured
> > "REDIS_CACHE"
> > > >  then create it via ignite.getOrCreateCache(new
> > > > CacheConfiguration("REDIS_CACHE"))
> > > > and prin warning to log "REDIS_CACHE not configured, using default
> > > > partitioned cache".
> > > >
> > > > What do you think?
> > > >
> > > > On Mon, Apr 24, 2017 at 10:26 AM, Roman Shtykh
> >  > > >
> > > > wrote:
> > > >
> > > > > Denis, Igor,
> > > > > What can be done now
> > > > > 1. create a default cache name for Redis data, e.g. "REDIS_CACHE"
> > that
> > > > has
> > > > > to be configured explicitly in xml file (as it is done with other
> > > caches)
> > > > > by a user if he/she needs Redis protocol.
> > > > > 2. Force users to specify cache names as prefix to their keys, so
> > that
> > > we
> > > > > can parse and switch between caches.
> > > > > The 1st one is a very quick fix I can do today. This can be
> extended
> > in
> > > > > future to have a separate cache for each data type.
> > > > > Roman
> > > > >
> > > > >
> > > > >On Monday, April 24, 2017 12:15 AM, Denis Magda <
> > > dma...@gridgain.com
> > > > >
> > > > > wrote:
> > > > >
> > > > >
> > > > >  Roman, would you suggest a quick solution for the redis
> integration
> > or
> > > > > even
> > > > > implement it in the nearest couple of days? We need to change 

Re: Cluster metrics - review for PageMemory

2017-04-25 Thread Sergi Vladykin
Looks good to me.

Sergi

2017-04-25 16:53 GMT+03:00 Alexey Goncharuk :

> Igniters,
>
> Since we moved to the PageMemory architecture, several ClusterMetrics
> methods became questionable, so I would like to discuss this before the
> release. Currently, ClusterMetrics contains the following methods:
> getNonHeapMemoryCommitted(),
> getNonHeapMemoryUsed(),
> getNonHeapMemoryInitialized(),
> getNonHeapMemoryTotal(),
> getNonHeapMemoryMaximum()
>
> I suggest we remove Total and Committed metrics, and the rest of the
> methods will have the following semantics:
> Initialized() - start size of all memory policies
> Max - max size of all memory policies
> Used - size of all allocated pages
>
> Thoughts?
>
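
A minimal sketch of the proposed aggregation semantics. The aggregation shown here (sums over memory policies, with "used" as allocated pages times page size) is an assumption for illustration, not the actual Ignite implementation:

```java
import java.util.Arrays;
import java.util.List;

// Illustrates the proposed ClusterMetrics semantics: Initialized, Max and
// Used derived from the configured memory policies.
public class MemMetricsSketch {
    static class MemoryPolicy {
        final long initSize, maxSize, allocatedPages, pageSize;
        MemoryPolicy(long initSize, long maxSize, long allocatedPages, long pageSize) {
            this.initSize = initSize; this.maxSize = maxSize;
            this.allocatedPages = allocatedPages; this.pageSize = pageSize;
        }
    }

    // Initialized: sum of start sizes of all memory policies.
    static long initialized(List<MemoryPolicy> ps) {
        return ps.stream().mapToLong(p -> p.initSize).sum();
    }

    // Max: sum of max sizes of all memory policies.
    static long max(List<MemoryPolicy> ps) {
        return ps.stream().mapToLong(p -> p.maxSize).sum();
    }

    // Used: total size of all allocated pages.
    static long used(List<MemoryPolicy> ps) {
        return ps.stream().mapToLong(p -> p.allocatedPages * p.pageSize).sum();
    }

    public static void main(String[] args) {
        List<MemoryPolicy> ps = Arrays.asList(
            new MemoryPolicy(256L << 20, 1L << 30, 1000, 4096),
            new MemoryPolicy(128L << 20, 512L << 20, 500, 4096));
        System.out.println(used(ps)); // 6144000
    }
}
```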


Re: Ignite ML, next steps (IGNITE-5029)

2017-04-25 Thread Sergi Vladykin
It is preferable to avoid hard bindings to any particular scripting engine. If
a user wants to plug in Groovy, we should allow it.

As for a DSL, I believe it is a waste of time. A few years ago it was a
somewhat popular idea to create DSLs for everything, but no one actually
wants to learn new quirky languages, so it never worked out.

All in all the best choice is to have a good API that can be conveniently
used from any scripting language + ability to plug in any scripting engine
user likes.

Sergi

2017-04-25 20:31 GMT+03:00 Yury Babak :

> First of all thanks for this advice.
>
> And DSL/Scripting update:
>
> Actually it's a two separate features. The first is provide scripting from
> some web ui. I think we could use web-console as ui part and JSR 223 for
> scripting itself, Nashorn for JS and Jython for Python.
>
> And the second feature - DSL.
>
> For those features I've created IGNITE-5065.
>
> Thanks,
> Yury.
>
>
>
> --
> View this message in context: http://apache-ignite-
> developers.2346864.n4.nabble.com/Ignite-ML-next-steps-
> IGNITE-5029-tp17096p17211.html
> Sent from the Apache Ignite Developers mailing list archive at Nabble.com.
>


Re: Ignite ML, DSL/Scripting (IGNITE-5065)

2017-04-25 Thread Sergi Vladykin
I'm a bit out of the loop with ML and Web Console, but Java scripting
engines as in JSR-223 are supported in Java 7. You can use the JSR-223 API in
Web Console, while the implementation will be in the Ignite ML module, which
requires Java 8 anyway. This way they will be decoupled.

Does this work for you?

Sergi

2017-04-25 21:12 GMT+03:00 Yury Babak :

> Hi all!
>
> Currently I'm working on adding scripting support for Ignite
> ML(IGNITE-5065). The basic idea is to provide possibility to create and run
> some scripts with ML algorithms over Ignite cluster using Ignite ML API and
> JS/Python as script language.
>
> I’ve exchanged several thoughts with Alexey K., Ignite web console
> maintainer and according to him it's okey to add this feature to
> web-console
> module.
>
> In this case we should add dependency to Ignite ML which require Java8.
>
> On the other hand we could create new UI module for this purpose but it
> will
> require additional time for development separate ml web console.
>
> If anyone know addition pros/cons for both ways - please advise.
>
> Thanks,
> Yury Babak.
>
>
>
> --
> View this message in context: http://apache-ignite-
> developers.2346864.n4.nabble.com/Ignite-ML-DSL-Scripting-
> IGNITE-5065-tp17216.html
> Sent from the Apache Ignite Developers mailing list archive at Nabble.com.
>
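
A minimal JDK-only illustration of the decoupling idea discussed above: the Web Console side can code purely against the standard `javax.script` (JSR-223) API, while concrete engines (Nashorn for JS, Jython for Python) are discovered at runtime from the classpath. Engine availability depends on the JDK and classpath, so the lookup is guarded:

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class ScriptingSketch {
    public static void main(String[] args) throws ScriptException {
        ScriptEngineManager mgr = new ScriptEngineManager();

        // Engine lookup by name. Nashorn ships with JDK 8-14; Jython must be
        // added to the classpath separately. Either may be absent.
        ScriptEngine js = mgr.getEngineByName("nashorn");
        if (js == null) {
            System.out.println("No 'nashorn' engine available on this JDK");
            return;
        }

        // The caller is fully decoupled from the engine implementation.
        Object res = js.eval("1 + 2");
        System.out.println(res);
    }
}
```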


Re: SQL usability: catalogs, schemas and tables

2017-04-24 Thread Sergi Vladykin
Yes, we need to move on with making Ignite work like any usual SQL database.

But please avoid mixing all this stuff together; let's have a separate task
(and a discussion if needed) for each item in your list.

Sergi

2017-04-24 16:58 GMT+03:00 Vladimir Ozerov :

> Igniters,
>
> [long read, take a cup of coffee]
>
> Historically every SQL in Ignite must be executed against particular cache:
> SqlQuery requires cache, JDBC and ODBC drivers require cache name. This
> works fine until we add CREATE TABLE. Consider an empty cluster - how do
> you connect to it if you have no caches yet? Nohow.
>
> It looks like we cannot have convenient access to cluster unless we
> properly define and implement *schema* abstraction. ANSI SQL'92 defines
> several abstractions: cluster -> catalog -> schema -> table/view/etc..
> Every "catalog" has *INFORMATION_SCHEMA* schema, containing database
> metadata. Almost all vendors support it (notable exclusion - Oracle). Looks
> like we need to introduce similar concept and finally decouple caches from
> schemas.
>
> High-level proposal from my side
>
> 1) Let's consider Ignite cluster as a single database ("catalog" in ANSI
> SQL'92 terms).
>
> 2) It should be possible to connect to the cluster without a single user
> cache. In this case schema is not defined.
>
> 3) We must have a kind of storage for metadata. It could be either another
> system cache, or something analogous to binary metadata cache, which is
> essentially node-local data structure exchanged on node join. It should be
> aligned well with persistence feature, which is expected in AI 2.1.
>
> 4) Content of this storage will be accessible through INFORMATION_SCHEMA
> abstraction.
>
> 5) We must support "CREATE SCHEMA", "DROP SCHEMA" commands which will
> effectively create records in system cache and invoke relevant commands on
> local H2 engines of every node (now it happens implicitly on cache
> start/stop).
>
> 6) From JDBC/ODBC driver perspective schema will be defined either in
> connection string, or in runtime through "SET SCHEMA" command which is
> already supported by H2.
>
> 7) We must finally introduce new native SQL API, which will not use caches
> directly. Something like "IgniteSql sql()". See *IGNITE-4701*.
>
> Once schema is there, usage of "CREATE TABLE" and "DROP TABLE" commands
> will be simple and convenient, and it will fit naturally in user's past
> experience with conventional RDBMS.
>
> Thoughts?
>
> P.S.: CREATE/DROP TABLE feature is not blocked by this problem. It will
> work, but will be inconvenient for users.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-4701
>
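
Once items 5 and 6 of the proposal are in place, the user-facing flow could look roughly like the following. This is a hedged sketch only; the exact syntax was not final at the time of this thread:

```sql
-- Create a schema and switch the session to it (proposal items 5-6).
CREATE SCHEMA analytics;
SET SCHEMA analytics;

-- Tables are then created inside the current schema, decoupled from caches.
CREATE TABLE person (id BIGINT PRIMARY KEY, name VARCHAR);

-- Metadata is visible through INFORMATION_SCHEMA (proposal item 4).
SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'ANALYTICS';
```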


Re: Improve binary enum handling

2017-04-24 Thread Sergi Vladykin
I agree with Dmitriy: it is preferable to make this enum registration
optional. It will be a better user experience.

Why do we "inevitably" need it?

Sergi

2017-04-24 17:02 GMT+03:00 Vladimir Ozerov :

> Dima,
>
> No. It is normal (and inevitable) practice to register enums before they
> are used.
>
> This is how enum is created in MySQL:
>
> CREATE TABLE shirts (
> name VARCHAR(40),
> size ENUM('x-small', 'small', 'medium', 'large', 'x-large')
> );
>
> And in PostgreSQL:
>
> CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
> CREATE TABLE person (
> name text,
> current_mood mood
> );
>
> We will do the same at some point. That is, in future users will register
> enums from SQL, not from native API or configuration.
>
> Vladimir.
>
> On Mon, Apr 24, 2017 at 4:37 PM, Dmitriy Setrakyan 
> wrote:
>
> > Vladimir,
> >
> > I would really like to avoid special registration of Enums. Can you find
> a
> > way to handle it automatically?
> >
> > D.
> >
> > On Mon, Apr 24, 2017 at 6:33 AM, Vladimir Ozerov 
> > wrote:
> >
> > > Sorry, looks like I mismanaged tickets in JIRA. In fact, we implemented
> > H2
> > > part, but Ignite's part is not ready yet and is managed in IGNITE-4575
> > [1].
> > > Ticket you mentioned was an umbrella.
> > >
> > > [1] https://issues.apache.org/jira/browse/IGNITE-4575
> > >
> > > On Mon, Apr 24, 2017 at 4:28 PM, Dmitriy Setrakyan <
> > dsetrak...@apache.org>
> > > wrote:
> > >
> > > > Vladimir,
> > > >
> > > > I am very confused. I thought we already had resolved this issue in
> > this
> > > > ticket:
> > > > https://issues.apache.org/jira/browse/IGNITE-3595
> > > >
> > > > Can you clarify?
> > > >
> > > > D.
> > > >
> > > > On Mon, Apr 24, 2017 at 5:58 AM, Vladimir Ozerov <
> voze...@gridgain.com
> > >
> > > > wrote:
> > > >
> > > > > Igniters,
> > > > >
> > > > > Currently we have limited support of binary enums. The main problem
> > is
> > > > that
> > > > > we do not store any metadata about enum names. For this reason it
> is
> > > > > impossible to use enums in SQL even though H2 already supports it
> > [1].
> > > We
> > > > > need to improve enum metadata support and provide some additional
> API
> > > to
> > > > > register new enums in runtime.
> > > > >
> > > > > Proposed API:
> > > > >
> > > > > 1) Enum mappings can be defined statically in
> > BinaryTypeConfiguration:
> > > > >
> > > > > class BinaryTypeConfiguration {
> > > > > boolean isEnum;  // Old method
> > > > > *Map<String, Integer> enumValues;* // New method
> > > > > }
> > > > >
> > > > > 2) New enum could be registered through IgniteBinary (e.g. we will
> > use
> > > it
> > > > > if enum is defined in CREATE TABLE statement). Elso it would be
> > > possible
> > > > to
> > > > > build enum using only name.
> > > > >
> > > > > interface IgniteBinary {
> > > > > BinaryObject buildEnum(String typeName, int ordinal);   // Old
> > > > > *BinaryObject buildEnum(String typeName, String name);* // New
> > > > >
> > > > > *BinaryType defineEnum(String typeName, Map<String, Integer> vals);* // New
> > > > > }
> > > > >
> > > > > 3) BinaryObject will have new method "enumName":
> > > > >
> > > > > interface BinaryObject {
> > > > > enumOrdinal(); // Old
> > > > > *String enumName();* // New
> > > > > }
> > > > >
> > > > > 4) It would be possible to get the list of known values from
> > > BinaryType:
> > > > >
> > > > > interface BinaryType {
> > > > > boolean isEnum();  // Old
> > > > > *Collection<String> enumValues();* // New
> > > > > }
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > Vladimir.
> > > > >
> > > > > [1] https://github.com/h2database/h2database/pull/487/commits
> > > > >
> > > >
> > >
> >
>
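
A self-contained sketch of the enum-metadata idea from this thread. This is not the real `IgniteBinary` API — class and method names here are illustrative — but it shows the flow the proposal enables: register a name-to-ordinal mapping once, then resolve human-readable names when reading binary enum values (e.g. from SQL):

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy registry mirroring the proposed defineEnum/enumName pair.
public class EnumMetaSketch {
    private final Map<String, Map<String, Integer>> types = new HashMap<>();

    // Analogous to the proposed defineEnum(typeName, vals).
    void defineEnum(String typeName, Map<String, Integer> vals) {
        types.put(typeName, new LinkedHashMap<>(vals));
    }

    // Analogous to the proposed BinaryObject.enumName(): ordinal -> name.
    String enumName(String typeName, int ordinal) {
        Map<String, Integer> vals = types.get(typeName);
        if (vals == null)
            throw new IllegalStateException("Enum not registered: " + typeName);
        return vals.entrySet().stream()
            .filter(e -> e.getValue() == ordinal)
            .map(Map.Entry::getKey)
            .findFirst()
            .orElseThrow(() -> new IllegalArgumentException("Unknown ordinal: " + ordinal));
    }

    public static void main(String[] args) {
        EnumMetaSketch meta = new EnumMetaSketch();

        Map<String, Integer> mood = new LinkedHashMap<>();
        mood.put("SAD", 0);
        mood.put("OK", 1);
        mood.put("HAPPY", 2);
        meta.defineEnum("Mood", mood);

        System.out.println(meta.enumName("Mood", 2)); // HAPPY
    }
}
```

The point of storing the mapping as metadata (rather than only ordinals) is exactly what the thread raises: without it, SQL and tooling cannot render or compare enum values by name.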


Re: SQL: Index hints

2017-04-21 Thread Sergi Vladykin
Exactly, this syntax was taken from MySQL.

Sergi

2017-04-21 9:58 GMT+03:00 Denis Magda <dma...@gridgain.com>:

> If multiple indexes are listed then H2 will pick only one of them like
> MySql does, right?
>
> Denis
>
> On Thursday, April 20, 2017, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > No, it must be USE INDEX without underscore. Also mention that multiple
> > indexes can be listed.
> >
> > http://h2database.com/html/grammar.html#table_expression
> >
> > Sergi
> >
> >
> > 2017-04-21 4:25 GMT+03:00 Denis Magda <dma...@apache.org
> <javascript:;>>:
> >
> > > Sergi, I’ve documented this feature for 2.0. Please confirm that the
> text
> > > below is technically correct:
> > >
> > > Index hints are useful in scenarios when it's known that one index is
> > more
> > > selective for certain queries than another and it's needed to instruct
> > the
> > > query optimizer to choose a more efficient execution plan. To do this
> > trick
> > > in Apache Ignite use USE_INDEX(index_list) statement that tells Ignite
> to
> > > take only one of the named indexes for query execution.
> > >
> > > Below is an example that leverages from this capability:
> > >
> > > SELECT * FROM table1 USE_INDEX(index_age)
> > >   WHERE salary > 15 AND age < 35;
> > >
> > > —
> > > Denis
> > >
> > > > On Jan 23, 2017, at 12:19 PM, Denis Magda <dma...@apache.org
> > <javascript:;>> wrote:
> > > >
> > > > Created a ticket so that we don’t forget about this new H2
> capability.
> > > > https://issues.apache.org/jira/browse/IGNITE-4594 <
> > > https://issues.apache.org/jira/browse/IGNITE-4594>
> > > >
> > > > Alexander P. feel free to assign it on yourself.
> > > >
> > > > —
> > > > Denis
> > > >
> > > >> On Jan 23, 2017, at 10:05 AM, Dmitriy Setrakyan <
> > dsetrak...@apache.org <javascript:;>>
> > > wrote:
> > > >>
> > > >> Very cool! Would be nice to add it to Ignite.
> > > >>
> > > >> On Mon, Jan 23, 2017 at 3:17 AM, Sergi Vladykin <
> > > sergi.vlady...@gmail.com <javascript:;>>
> > > >> wrote:
> > > >>
> > > >>> Guys,
> > > >>>
> > > >>> Recently in H2 we've merged a very important feature: index hints.
> It
> > > is an
> > > >>> additional MySQL-like syntax:
> > > >>>
> > > >>> SELECT * FROM  my_table USE INDEX (index_a) WHERE A = 1
> > > >>>
> > > >>> It will be very easy to support this in Ignite.
> > > >>>
> > > >>> Alex,
> > > >>>
> > > >>> Since you are working on better SQL Enum support and it will
> require
> > H2
> > > >>> upgrade anyways, you can add this stuff to Ignite as well.
> > > >>>
> > > >>> Sergi
> > > >>>
> > > >
> > >
> > >
> >
>


Re: SQL: Index hints

2017-04-21 Thread Sergi Vladykin
No, it must be USE INDEX without underscore. Also mention that multiple
indexes can be listed.

http://h2database.com/html/grammar.html#table_expression

Sergi


2017-04-21 4:25 GMT+03:00 Denis Magda <dma...@apache.org>:

> Sergi, I’ve documented this feature for 2.0. Please confirm that the text
> below is technically correct:
>
> Index hints are useful in scenarios when it's known that one index is more
> selective for certain queries than another and it's needed to instruct the
> query optimizer to choose a more efficient execution plan. To do this trick
> in Apache Ignite use USE_INDEX(index_list) statement that tells Ignite to
> take only one of the named indexes for query execution.
>
> Below is an example that leverages from this capability:
>
> SELECT * FROM table1 USE_INDEX(index_age)
>   WHERE salary > 15 AND age < 35;
>
> —
> Denis
>
> > On Jan 23, 2017, at 12:19 PM, Denis Magda <dma...@apache.org> wrote:
> >
> > Created a ticket so that we don’t forget about this new H2 capability.
> > https://issues.apache.org/jira/browse/IGNITE-4594 <
> https://issues.apache.org/jira/browse/IGNITE-4594>
> >
> > Alexander P. feel free to assign it on yourself.
> >
> > —
> > Denis
> >
> >> On Jan 23, 2017, at 10:05 AM, Dmitriy Setrakyan <dsetrak...@apache.org>
> wrote:
> >>
> >> Very cool! Would be nice to add it to Ignite.
> >>
> >> On Mon, Jan 23, 2017 at 3:17 AM, Sergi Vladykin <
> sergi.vlady...@gmail.com>
> >> wrote:
> >>
> >>> Guys,
> >>>
> >>> Recently in H2 we've merged a very important feature: index hints. It
> is an
> >>> additional MySQL-like syntax:
> >>>
> >>> SELECT * FROM  my_table USE INDEX (index_a) WHERE A = 1
> >>>
> >>> It will be very easy to support this in Ignite.
> >>>
> >>> Alex,
> >>>
> >>> Since you are working on better SQL Enum support and it will require H2
> >>> upgrade anyways, you can add this stuff to Ignite as well.
> >>>
> >>> Sergi
> >>>
> >
>
>
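
Putting the corrections from this thread together — `USE INDEX` with a space, not `USE_INDEX`, and multiple indexes may be listed (per the H2 `table_expression` grammar linked above) — a corrected version of the documented example would read:

```sql
-- Hint the optimizer to consider only the listed indexes for this table;
-- when several are listed, the engine picks one of them.
SELECT * FROM table1 USE INDEX (index_age, index_salary)
  WHERE salary > 15 AND age < 35;
```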


Re: Page Memory behavior with default memory policy

2017-04-20 Thread Sergi Vladykin
Guys,

If we have a default of 80% of available memory, then just starting a few
nodes on my laptop will make it hang. This idea does not work until we have
dynamically expandable memory pools.

Sergi

2017-04-19 22:20 GMT+03:00 Dmitriy Setrakyan :

> Sergey,
>
> I have responded in the ticket. Can you please provide the current and the
> proposed configuration examples?
>
> D.
>
> On Wed, Apr 19, 2017 at 2:34 AM, Sergey Chugunov <
> sergey.chugu...@gmail.com>
> wrote:
>
> > Guys,
> >
> > I created a ticket to implement these improvements, please take a look:
> > IGNITE-5024 
> >
> > Besides employing the idea of allocation 80% of physical memory I'm also
> > suggesting to introduce one more configuration property to specify
> default
> > MemoryPolicy's size in bytes without having to use verbose syntax of
> > *memoryPolicyConfiguration
> > *section.
> >
> > Any thoughts on this?
> >
> > Thanks,
> > Sergey.
> >
> >
> >
> > On Tue, Apr 18, 2017 at 12:12 PM, Dmitriy Setrakyan <
> dsetrak...@apache.org
> > >
> > wrote:
> >
> > > On Tue, Apr 18, 2017 at 2:09 AM, Alexey Goncharuk <
> > > alexey.goncha...@gmail.com> wrote:
> > >
> > > > I don't see why not, this is the way our tests are currently running.
> > > > Anyways, we can think about efficient dynamic memory expansion in
> 2.1,
> > > this
> > > > may be feasible if we free up some space in PageId to encode segment
> > > > number. There is a ticket for this:
> > > > https://issues.apache.org/jira/browse/IGNITE-4921
> > >
> > >
> > > Alexey, if the operating system is already handling this for us, I
> don't
> > > see a reason to do it manually.
> > >
> > > I also like what Denis and Semyon are proposing. However, I would not
> > grab
> > > the full free memory. How about 80% of the free memory?
> > >
> > > D.
> > >
> >
>


Re: TouchedExpiryPolicy works incorrect in some cases IGNITE-4401

2017-04-19 Thread Sergi Vladykin
SQL queries never instantiate or touch cache entries, so they never affect
any expiration policies.

Sergi
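
Sergi's statement can be illustrated with a small sketch (assumes an Ignite
2.x classpath and a running node; the cache name, value type and query string
are illustrative):

```java
import java.util.concurrent.TimeUnit;
import javax.cache.expiry.Duration;
import javax.cache.expiry.TouchedExpiryPolicy;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.SqlFieldsQuery;
import org.apache.ignite.configuration.CacheConfiguration;

public class SqlTouchSketch {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            CacheConfiguration<Integer, String> ccfg = new CacheConfiguration<>("test");
            // Expire an entry 10 seconds after it was last touched
            // (touch = create, update, or key-value read).
            ccfg.setExpiryPolicyFactory(
                TouchedExpiryPolicy.factoryOf(new Duration(TimeUnit.SECONDS, 10)));
            ccfg.setIndexedTypes(Integer.class, String.class);

            IgniteCache<Integer, String> cache = ignite.createCache(ccfg);
            cache.put(1, "a");

            cache.get(1); // Key-value read: touches the entry, resets the timer.

            // SQL read: per the statement above, does NOT touch the entry,
            // so it does not postpone expiration.
            cache.query(new SqlFieldsQuery("select _val from String")).getAll();
        }
    }
}
```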

2017-04-19 16:42 GMT+03:00 ALEKSEY KUZNETSOV :

> I wonder if "SELECT" clause should *touch *an entry? For instance,
> cache.contains() doesn't *touch*.
>
> вт, 18 апр. 2017 г. в 12:21, ALEKSEY KUZNETSOV :
>
> > Yeah. I've already run the test with two caches. Definately, the bug
> > hidden in cache.query() method. cache.query() calls
> > IgniteH2Indexing#queryLocalSql(), which calls executeSqlQueryWithTimer,
> > and then it sinks into JdbcPreparedStatement.executeQuery(). There is no
> > key-value operations subsequent
> >
> > пн, 17 апр. 2017 г. в 21:15, Denis Magda :
> >
> >> It doesn’t matter who is right and who is wrong unless someone gets to
> >> the bottom of the issue debugging it.
> >>
> >> I would suggest to create a simple unit test with two caches and trying
> >> to reproduce the following without computations and other redundant
> stuff.
> >> Would you like to work on this?
> >>
> >> —
> >> Denis
> >>
> >> > On Apr 17, 2017, at 12:44 AM, ALEKSEY KUZNETSOV <
> >> alkuznetsov...@gmail.com> wrote:
> >> >
> >> > Why do u think so.
> >> > First of all, the output above is not correct. After 3 iteration
> >> key-value
> >> > API strats to return empty value.
> >> >
> >> > Every 5 seconds(iteration sleep time)
> >> repository.getAttributes("1").size()
> >> > is got called. Which makes an entry "touch" and the entry wont be
> >> expired
> >> > for as long as 10 seconds.
> >> >
> >> > Expiry policy says:
> >> >
> >> > An {@link ExpiryPolicy} that defines the expiry {@link Duration}
> >> > * of a Cache Entry based on when it was last touched. A touch includes
> >> > ** creation, update or **access**.*
> >> >
> >> >
> >> > пт, 14 апр. 2017 г. в 18:42, Denis Magda :
> >> >
> >> >> The iteration happens multiple time which means that the key-value
> API
> >> had
> >> >> to return an empty result set on second or third iteration. But this
> >> never
> >> >> happens.
> >> >>
> >> >> In any case, do you want to find a root of the issue and fix it?
> >> >> Otherwise, we can update the description and wait while someone else
> >> fixes
> >> >> it.
> >> >>
> >> >> —
> >> >> Denis
> >> >>
> >> >>> On Apr 14, 2017, at 1:33 AM, ALEKSEY KUZNETSOV <
> >> alkuznetsov...@gmail.com>
> >> >> wrote:
> >> >>>
> >> >>> Because expiry time is 10 seconds, while loop iterates every 5
> seconds
> >> >>>
> >> >>> пт, 14 апр. 2017 г. в 11:32, ALEKSEY KUZNETSOV <
> >> alkuznetsov...@gmail.com
> >> >>> :
> >> >>>
> >>  No, the bug is in SQL query, not key-value storage.
> >> 
> >>  пт, 14 апр. 2017 г. в 11:11, Vladislav Pyatkov <
> vldpyat...@gmail.com
> >> >:
> >> 
> >> > Denis, Aleksey,
> >> >
> >> > It is correct, remember I have already said something like[1].
> >> > I have no idea, why this happened in this case with SQL.
> >> >
> >> > [1]:
> >> >
> >> >
> >> >>
> >> http://apache-ignite-developers.2346864.n4.nabble.
> com/TouchedExpiryPolicy-works-incorrect-in-some-cases-
> IGNITE-4401-td16349.html#a16356
> >> >
> >> > On Fri, Apr 14, 2017 at 4:29 AM, Denis Magda 
> >> >> wrote:
> >> >
> >> >> I could reproduce the issue and this should be what Denis K.
> meant
> >> by
> >> >> saying “expiration policy works incorrectly”.
> >> >>
> >> >> If you remove the expiration policy from the caches'
> configuration
> >> >> then
> >> >> the issue disappears. In general, SQL engine processes an
> >> expiration
> >> > event
> >> >> properly because the SQL queries return an empty result set as
> >> >> expected
> >> > but
> >> >> something doesn’t work well with key-value operations.
> >> >>
> >> >> *Denis K*, *Vlad P.*, as creators of the ticket please confirm
> that
> >> >> this
> >> >> is the case.
> >> >>
> >> >> Please keep debugging this and switch to the latest Ignite
> version.
> >> >>
> >> >> —
> >> >> Denis
> >> >>
> >> >>
> >> >>> On Apr 13, 2017, at 4:22 AM, ALEKSEY KUZNETSOV <
> >> > alkuznetsov...@gmail.com>
> >> >> wrote:
> >> >>>
> >> >>> any feedback?
> >> >>>
> >> >>> чт, 13 апр. 2017 г. в 11:51, ALEKSEY KUZNETSOV <
> >> > alkuznetsov...@gmail.com
> >> >>> :
> >> >>>
> >>  You should run ExpiryPolicyTest. The output should contain
> >> strings
> >> > like
> >>  contains? new AffinityKey("1", "1"): and contains?2 new
> >> >> AffinityKey("1", "
> >>  1"): and empty cursor? =
> >>  If you look at them you will see, that cache contains affinity
> >> key
> >> > new
> >>  AffinityKey("1", "1") whereas cursor is empty(on second
> >> iteration).
> >> > From
> >>  this output you can conclude SQL query returns icorrect
> >> data(empty
> >> >> value)
> >> 
> >> 

Re: Remove IGFS map-reduce in 2.0

2017-04-19 Thread Sergi Vladykin
Do we have any problems with it? If some functionality requires little or no
maintenance, I see no reason to drop it.

Sergi

2017-04-19 12:28 GMT+03:00 Vladimir Ozerov :

> Folks,
>
> In old pre-Hadoop-accelerator days we implemented map-reduce over native
> IGFS API. See IgniteFileSystem.execute() methods. Later we recognized that
> users prefer to work with Hadoop API, so we implemented Hadoop Accelerator
> module.
>
> I think it is time to remove IgniteFileSystem.execute() from API as no one
> need it. Any objections?
>
> Vladimir.
>


[jira] [Created] (IGNITE-5016) SQL: Support LEFT JOIN from replicated table to a partitioned.

2017-04-18 Thread Sergi Vladykin (JIRA)
Sergi Vladykin created IGNITE-5016:
--

 Summary: SQL: Support LEFT JOIN from replicated table to a 
partitioned.
 Key: IGNITE-5016
 URL: https://issues.apache.org/jira/browse/IGNITE-5016
 Project: Ignite
  Issue Type: Bug
Reporter: Sergi Vladykin
Assignee: Sergi Vladykin


Now we return duplicates:

IgniteCacheJoinPartitionedAndReplicatedTest.testReplicatedToPartitionedLeftJoin



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Performance vs correctness: I vote for the second

2017-04-18 Thread Sergi Vladykin
Yakov,

The idea of tracking current operations and waiting when needed looks
overcomplicated and will most probably result in a performance drop.

I think there is no way to have this guarantee with PRIMARY_SYNC in the
general case.

Sergi

2017-04-18 13:25 GMT+03:00 Yakov Zhdanov :

> Guys, what if we look at this from another point - we can switch to read
> from primary only if there is any primary_sync operation that is not acked
> by backups yet. Or we can wait until all operations of the kind are acked
> and then proceed with query. This seems to work when we have query after
> sequence of puts, but fails if we have sequence of puts then compute job
> spawning a query from remote node. And this seems to bring lots of
> complications to cache update protocol.
>
> Given this I would vote for switching default (probably, for replicated
> cache only) to full_sync and output a performance warning.
>
> However, there is still an open question - how can I guarantee query
> consistency with primary_sync?
>
> --Yakov
>


Re: Performance vs correctness: I vote for the second

2017-04-18 Thread Sergi Vladykin
It only means that we will always parse the query and check whether it
references only replicated tables. If it does, we execute the query on a
single node across all the partitions.

Sergi

2017-04-18 12:26 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:

> On Tue, Apr 18, 2017 at 2:21 AM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > We never read from backups on partitioned cache, but for replicated we do
> > that to be able to execute the whole query on single node locally.\
> >
>
> But I thought that we agreed to change that behavior and have REPLICATED
> cache work the same as PARTITIONED. I think Valentin provided a link to the
> discussion we had on the dev list.
>
> I would not make FULL_SYNC as default, but I would definitely fix this
> behavior for the REPLICATED caches.
>
> D.
>


Re: Performance vs correctness: I vote for the second

2017-04-18 Thread Sergi Vladykin
We never read from backups on a partitioned cache, but for a replicated
cache we do, to be able to execute the whole query locally on a single node.

Sergi

2017-04-18 12:07 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:

> Sergi, I am confused. If we don't read from backups, then why do we care
> about sync or async backup updates?
>
> On Tue, Apr 18, 2017 at 1:11 AM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > Val,
> >
> > There we were not able to run queries against partitioned tables using
> > replicated cache API (I already fixed that in master).
> >
> > Here we are talking about query result inconsistency in case of
> > PRIMARY_SYNC
> > because of async backup update.
> >
> > Sergi
> >
> > 2017-04-18 11:04 GMT+03:00 Valentin Kulichenko <
> > valentin.kuliche...@gmail.com>:
> >
> > > Can you please elaborate then? What is the logic there?
> > >
> > > -Val
> > >
> > > On Tue, Apr 18, 2017 at 9:55 AM, Sergi Vladykin <
> > sergi.vlady...@gmail.com>
> > > wrote:
> > >
> > > > Val,
> > > >
> > > > That discussion has nothing to do with this PRIMARY_SYNC problem.
> > > >
> > > > Sergi
> > > >
> > > > 2017-04-18 10:51 GMT+03:00 Valentin Kulichenko <
> > > > valentin.kuliche...@gmail.com>:
> > > >
> > > > > Sergi,
> > > > >
> > > > > I'm talking about this discussion:
> > > > > http://apache-ignite-developers.2346864.n4.nabble.
> > > > > com/SQL-on-PARTITIONED-vs-REPLICATED-cache-td16478.html
> > > > >
> > > > > -Val
> > > > >
> > > > > On Tue, Apr 18, 2017 at 9:46 AM, Vladimir Ozerov <
> > voze...@gridgain.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > Val,
> > > > > >
> > > > > > PRIMARY_SYNC doesn't work correctly with the most common case of
> > SQL
> > > > > query
> > > > > > execution over REPLICATED cache. Also it has weird consequences
> for
> > > > > > continuous queries when coupled with another
> > > > performance-over-correctness
> > > > > > property "readFromBackup=true": user may receive CQ notification
> > with
> > > > new
> > > > > > value, but subsequent GET on local node may return old value.
> > > > > >
> > > > > > On Tue, Apr 18, 2017 at 10:42 AM, Valentin Kulichenko <
> > > > > > valentin.kuliche...@gmail.com> wrote:
> > > > > >
> > > > > > > This sounds more like an issue with query execution, rather
> than
> > > > wrong
> > > > > > > PRIMARY_SYNC
> > > > > > > behavior. We already had a discussion about this optimization
> in
> > > > > > replicated
> > > > > > > cache and decided to switch it off by default.
> > > > > > >
> > > > > > > -Val
> > > > > > >
> > > > > > > On Tue, Apr 18, 2017 at 9:35 AM, Sergi Vladykin <
> > > > > > sergi.vlady...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > With replicated cache we can execute a query against backup
> > > > > partitions
> > > > > > > that
> > > > > > > > were not updated yet because of PRIMARY_SYNC. Thus we do not
> > see
> > > an
> > > > > > > update.
> > > > > > > >
> > > > > > > > Sergi
> > > > > > > >
> > > > > > > > 2017-04-18 10:30 GMT+03:00 Dmitriy Setrakyan <
> > > > dsetrak...@apache.org
> > > > > >:
> > > > > > > >
> > > > > > > > > Vladimir,
> > > > > > > > >
> > > > > > > > > What is wrong with a query in PRIMARY_SYNC mode? Why won't
> it
> > > > work?
> > > > > > > > >
> > > > > > > > > D.
> > > > > > > > >
> > > > > > > > > On Tue, Apr 18, 2017 at 12:25 AM, Vladimir Ozerov <
> > > > > > > voze...@gridgain.com>
> > > > > > > > > wrote:
> 

Re: Performance vs correctness: I vote for the second

2017-04-18 Thread Sergi Vladykin
Val,

There we were not able to run queries against partitioned tables using the
replicated cache API (I already fixed that in master).

Here we are talking about query result inconsistency in the case of
PRIMARY_SYNC because of the async backup update.

Sergi

2017-04-18 11:04 GMT+03:00 Valentin Kulichenko <
valentin.kuliche...@gmail.com>:

> Can you please elaborate then? What is the logic there?
>
> -Val
>
> On Tue, Apr 18, 2017 at 9:55 AM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > Val,
> >
> > That discussion has nothing to do with this PRIMARY_SYNC problem.
> >
> > Sergi
> >
> > 2017-04-18 10:51 GMT+03:00 Valentin Kulichenko <
> > valentin.kuliche...@gmail.com>:
> >
> > > Sergi,
> > >
> > > I'm talking about this discussion:
> > > http://apache-ignite-developers.2346864.n4.nabble.
> > > com/SQL-on-PARTITIONED-vs-REPLICATED-cache-td16478.html
> > >
> > > -Val
> > >
> > > On Tue, Apr 18, 2017 at 9:46 AM, Vladimir Ozerov <voze...@gridgain.com
> >
> > > wrote:
> > >
> > > > Val,
> > > >
> > > > PRIMARY_SYNC doesn't work correctly with the most common case of SQL
> > > query
> > > > execution over REPLICATED cache. Also it has weird consequences for
> > > > continuous queries when coupled with another
> > performance-over-correctness
> > > > property "readFromBackup=true": user may receive CQ notification with
> > new
> > > > value, but subsequent GET on local node may return old value.
> > > >
> > > > On Tue, Apr 18, 2017 at 10:42 AM, Valentin Kulichenko <
> > > > valentin.kuliche...@gmail.com> wrote:
> > > >
> > > > > This sounds more like an issue with query execution, rather than
> > wrong
> > > > > PRIMARY_SYNC
> > > > > behavior. We already had a discussion about this optimization in
> > > > replicated
> > > > > cache and decided to switch it off by default.
> > > > >
> > > > > -Val
> > > > >
> > > > > On Tue, Apr 18, 2017 at 9:35 AM, Sergi Vladykin <
> > > > sergi.vlady...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > With replicated cache we can execute a query against backup
> > > partitions
> > > > > that
> > > > > > were not updated yet because of PRIMARY_SYNC. Thus we do not see
> an
> > > > > update.
> > > > > >
> > > > > > Sergi
> > > > > >
> > > > > > 2017-04-18 10:30 GMT+03:00 Dmitriy Setrakyan <
> > dsetrak...@apache.org
> > > >:
> > > > > >
> > > > > > > Vladimir,
> > > > > > >
> > > > > > > What is wrong with a query in PRIMARY_SYNC mode? Why won't it
> > work?
> > > > > > >
> > > > > > > D.
> > > > > > >
> > > > > > > On Tue, Apr 18, 2017 at 12:25 AM, Vladimir Ozerov <
> > > > > voze...@gridgain.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Folks,
> > > > > > > >
> > > > > > > > I received a number of complaints from users that our default
> > > > setting
> > > > > > > favor
> > > > > > > > performance at the cost of correctness and subtle behavior.
> > > > > Yesterday I
> > > > > > > > faced one such situation on my own.
> > > > > > > >
> > > > > > > > I started REPLICATED cache on several nodes, put some data,
> > > > executed
> > > > > > > simple
> > > > > > > > SQL and got wrong result. No errors, no warnings. The problem
> > was
> > > > > > caused
> > > > > > > by
> > > > > > > > default PRIMARY_SYNC mode. WTF, our cache doesn't work out of
> > the
> > > > > box!
> > > > > > > >
> > > > > > > > Another widely known examples are data streamer behavior,
> "read
> > > > form
> > > > > > > > backups" + continuous queries.
> > > > > > > >
> > > > > > > > I propose to change our defaults to favor *correctness* over
> > > > > > performance,
> > > > > > > > and create good documentation and JavaDocs to explain users
> how
> > > to
> > > > > tune
> > > > > > > our
> > > > > > > > product. Proposed changes:
> > > > > > > >
> > > > > > > > 1) FULL_SYNC as default;
> > > > > > > > 2) "readFromBackups=false" as default;
> > > > > > > > 3) "IgniteDataStreamer.allowOverwrite=true" as default.
> > > > > > > >
> > > > > > > > Users should not think how to make Ignite work correctly. It
> > > should
> > > > > be
> > > > > > > > correct out of the box.
> > > > > > > >
> > > > > > > > Vladimir.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Performance vs correctness: I vote for the second

2017-04-18 Thread Sergi Vladykin
Val,

That discussion has nothing to do with this PRIMARY_SYNC problem.

Sergi

2017-04-18 10:51 GMT+03:00 Valentin Kulichenko <
valentin.kuliche...@gmail.com>:

> Sergi,
>
> I'm talking about this discussion:
> http://apache-ignite-developers.2346864.n4.nabble.
> com/SQL-on-PARTITIONED-vs-REPLICATED-cache-td16478.html
>
> -Val
>
> On Tue, Apr 18, 2017 at 9:46 AM, Vladimir Ozerov <voze...@gridgain.com>
> wrote:
>
> > Val,
> >
> > PRIMARY_SYNC doesn't work correctly with the most common case of SQL
> query
> > execution over REPLICATED cache. Also it has weird consequences for
> > continuous queries when coupled with another performance-over-correctness
> > property "readFromBackup=true": user may receive CQ notification with new
> > value, but subsequent GET on local node may return old value.
> >
> > On Tue, Apr 18, 2017 at 10:42 AM, Valentin Kulichenko <
> > valentin.kuliche...@gmail.com> wrote:
> >
> > > This sounds more like an issue with query execution, rather than wrong
> > > PRIMARY_SYNC
> > > behavior. We already had a discussion about this optimization in
> > replicated
> > > cache and decided to switch it off by default.
> > >
> > > -Val
> > >
> > > On Tue, Apr 18, 2017 at 9:35 AM, Sergi Vladykin <
> > sergi.vlady...@gmail.com>
> > > wrote:
> > >
> > > > With replicated cache we can execute a query against backup
> partitions
> > > that
> > > > were not updated yet because of PRIMARY_SYNC. Thus we do not see an
> > > update.
> > > >
> > > > Sergi
> > > >
> > > > 2017-04-18 10:30 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org
> >:
> > > >
> > > > > Vladimir,
> > > > >
> > > > > What is wrong with a query in PRIMARY_SYNC mode? Why won't it work?
> > > > >
> > > > > D.
> > > > >
> > > > > On Tue, Apr 18, 2017 at 12:25 AM, Vladimir Ozerov <
> > > voze...@gridgain.com>
> > > > > wrote:
> > > > >
> > > > > > Folks,
> > > > > >
> > > > > > I received a number of complaints from users that our default
> > setting
> > > > > favor
> > > > > > performance at the cost of correctness and subtle behavior.
> > > Yesterday I
> > > > > > faced one such situation on my own.
> > > > > >
> > > > > > I started REPLICATED cache on several nodes, put some data,
> > executed
> > > > > simple
> > > > > > SQL and got wrong result. No errors, no warnings. The problem was
> > > > caused
> > > > > by
> > > > > > default PRIMARY_SYNC mode. WTF, our cache doesn't work out of the
> > > box!
> > > > > >
> > > > > > Another widely known examples are data streamer behavior, "read
> > form
> > > > > > backups" + continuous queries.
> > > > > >
> > > > > > I propose to change our defaults to favor *correctness* over
> > > > performance,
> > > > > > and create good documentation and JavaDocs to explain users how
> to
> > > tune
> > > > > our
> > > > > > product. Proposed changes:
> > > > > >
> > > > > > 1) FULL_SYNC as default;
> > > > > > 2) "readFromBackups=false" as default;
> > > > > > 3) "IgniteDataStreamer.allowOverwrite=true" as default.
> > > > > >
> > > > > > Users should not think how to make Ignite work correctly. It
> should
> > > be
> > > > > > correct out of the box.
> > > > > >
> > > > > > Vladimir.
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Performance vs correctness: I vote for the second

2017-04-18 Thread Sergi Vladykin
Val,

I'm not sure I understand which optimization you are talking about and what
exactly you decided to switch off. Can you explain, please?

Sergi

2017-04-18 10:42 GMT+03:00 Valentin Kulichenko <
valentin.kuliche...@gmail.com>:

> This sounds more like an issue with query execution, rather than wrong
> PRIMARY_SYNC
> behavior. We already had a discussion about this optimization in replicated
> cache and decided to switch it off by default.
>
> -Val
>
> On Tue, Apr 18, 2017 at 9:35 AM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > With replicated cache we can execute a query against backup partitions
> that
> > were not updated yet because of PRIMARY_SYNC. Thus we do not see an
> update.
> >
> > Sergi
> >
> > 2017-04-18 10:30 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
> >
> > > Vladimir,
> > >
> > > What is wrong with a query in PRIMARY_SYNC mode? Why won't it work?
> > >
> > > D.
> > >
> > > On Tue, Apr 18, 2017 at 12:25 AM, Vladimir Ozerov <
> voze...@gridgain.com>
> > > wrote:
> > >
> > > > Folks,
> > > >
> > > > I received a number of complaints from users that our default setting
> > > favor
> > > > performance at the cost of correctness and subtle behavior.
> Yesterday I
> > > > faced one such situation on my own.
> > > >
> > > > I started REPLICATED cache on several nodes, put some data, executed
> > > simple
> > > > SQL and got wrong result. No errors, no warnings. The problem was
> > caused
> > > by
> > > > default PRIMARY_SYNC mode. WTF, our cache doesn't work out of the
> box!
> > > >
> > > > Another widely known examples are data streamer behavior, "read form
> > > > backups" + continuous queries.
> > > >
> > > > I propose to change our defaults to favor *correctness* over
> > performance,
> > > > and create good documentation and JavaDocs to explain users how to
> tune
> > > our
> > > > product. Proposed changes:
> > > >
> > > > 1) FULL_SYNC as default;
> > > > 2) "readFromBackups=false" as default;
> > > > 3) "IgniteDataStreamer.allowOverwrite=true" as default.
> > > >
> > > > Users should not think how to make Ignite work correctly. It should
> be
> > > > correct out of the box.
> > > >
> > > > Vladimir.
> > > >
> > >
> >
>


Re: Performance vs correctness: I vote for the second

2017-04-18 Thread Sergi Vladykin
With a replicated cache we can execute a query against backup partitions
that have not been updated yet because of PRIMARY_SYNC. Thus we do not see
the update.

Sergi
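
A minimal sketch of the workaround available today: configure the replicated
cache with FULL_SYNC so that a write completes only after backups are
updated, and a subsequent query against any copy of the data observes it
(Spring XML; the cache name is illustrative):

```xml
<bean class="org.apache.ignite.configuration.CacheConfiguration">
  <property name="name" value="myReplicatedCache"/>
  <property name="cacheMode" value="REPLICATED"/>
  <!-- Wait for backup nodes to acknowledge the update before the
       write operation returns. -->
  <property name="writeSynchronizationMode" value="FULL_SYNC"/>
</bean>
```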

2017-04-18 10:30 GMT+03:00 Dmitriy Setrakyan :

> Vladimir,
>
> What is wrong with a query in PRIMARY_SYNC mode? Why won't it work?
>
> D.
>
> On Tue, Apr 18, 2017 at 12:25 AM, Vladimir Ozerov 
> wrote:
>
> > Folks,
> >
> > I received a number of complaints from users that our default setting
> favor
> > performance at the cost of correctness and subtle behavior. Yesterday I
> > faced one such situation on my own.
> >
> > I started REPLICATED cache on several nodes, put some data, executed
> simple
> > SQL and got wrong result. No errors, no warnings. The problem was caused
> by
> > default PRIMARY_SYNC mode. WTF, our cache doesn't work out of the box!
> >
> > Another widely known examples are data streamer behavior, "read form
> > backups" + continuous queries.
> >
> > I propose to change our defaults to favor *correctness* over performance,
> > and create good documentation and JavaDocs to explain users how to tune
> our
> > product. Proposed changes:
> >
> > 1) FULL_SYNC as default;
> > 2) "readFromBackups=false" as default;
> > 3) "IgniteDataStreamer.allowOverwrite=true" as default.
> >
> > Users should not think how to make Ignite work correctly. It should be
> > correct out of the box.
> >
> > Vladimir.
> >
>


Re: GridGain Donates Persistent Distributed Store To ASF (Apache Ignite)

2017-04-18 Thread Sergi Vladykin
Nice! Ignite is finally evolving from a "Middleware Solution" into an "All
In One Backend Solution".

Sergi

2017-04-18 5:46 GMT+03:00 Dmitriy Setrakyan :

> Great news and I am glad that GridGain was finally able to open source such
> an essential feature to Ignite. Given that I was deeply involved in the
> development of this feature for the past year, I would add that one of the
> main advantages here is that Ignite becomes fully operational from disk
> (SSD or Flash) without any need to preload the data in memory.
>
> With full SQL support available in Ignite, this feature will allow Ignite
> serve as a distributed transactional database, both in memory and on disk,
> while continuing to support all the existing use cases, including the
> in-memory data grid.
>
> Cos, can you point us to any legal paperwork that needs to be signed in
> order to complete the donation?
>
> D.
>
> On Mon, Apr 17, 2017 at 7:37 PM, Denis Magda  wrote:
>
> > Igniters,
> >
> > GridGain, as one of the most active Apache Ignite contributors, has been
> > developing a unique distributed persistent store specifically for Apache
> > Ignite for more than a year in-house. It’s a fully ACID and ANSI-99 SQL
> > compliant fault-tolerant solution.
> >
> > The store transparently integrates with Apache Ignite as an optional disk
> > layer (in addition to the existing RAM layer) via the new page memory
> > architecture that to be released in Apache Ignite 2.0. This allows
> storing
> > supersets of data on disk while having a subset in memory not worrying
> > about that you forgot to preload (warmup) your caches!
> >
> > Assuming that the storage goes to ASF as a part of Apache Ignite 2.1
> > release the following will be supported by Ignite out-of-the-box:
> >
> > * SQL queries execution over the data that is both in RAM and on disk: no
> > need to preload the whole data set in memory.
> >
> > * Cluster instantaneous restarts: once your cluster ring is recovered
> > after a restart your applications are fully operational even if they
> highly
> > utilize SQL queries.
> >
> > As for the use cases, it means that Apache Ignite will be possible to use
> > as a distributed SQL database with memory-first concept.
> >
> > And we decided at GridGain that this tremendous feature should be open
> > source from the very beginning.
> >
> > Guys, could you advise how I can start official donation process?
> >
> > —
> > Denis
> >
> >
> >
> >
> >
> >
>


Re: Discontinue Scalar in Ignite-2.0?

2017-04-14 Thread Sergi Vladykin
Maybe drop it in 2.0 then?

Sergi

2017-04-14 19:11 GMT+03:00 Nikita Ivanov :

> As the original developer/author of Scalar (and I already voiced this
> opinion before) - I think we should deprecate Scalar. Ignite APIs can be
> used from Scala as-is with minimal, if any, pimping. The custom DSL
> introduced by Scalar proved to be not a popular option either.
> --
> Nikita Ivanov
>
>
> On Fri, Apr 14, 2017 at 9:04 AM, Alexey Kuznetsov 
> wrote:
>
> > There were no commits for TWO years in ignite-scalar module.
> > Our java API is good, no need for any wrappers (IMHO).
> >
> > Also we even do not use it by ourselves in Visor CMD.
> >
> > Thoughts?
> >
> > --
> > Alexey Kuznetsov
> >
>


Re: Remove or deprecate IgniteAsyncSupport?

2017-04-14 Thread Sergi Vladykin
That API was always a big mistake. I'm for removing it completely.

Sergi

2017-04-14 18:01 GMT+03:00 Dmitriy Setrakyan :

> What is this obsession with breaking stuff? Let's deprecated it, it is a
> big change on API.
>
> On Fri, Apr 14, 2017 at 7:04 AM, Vladimir Ozerov 
> wrote:
>
> > Folks,
> >
> > As you know we reworked our async API and added paired asynchronous
> methods
> > instead of using IgntieAsyncSupport infrastructure. Now our API is clean
> > and straightforward.
> >
> > However, we didn't remove IgntieAsyncSupport, but deprecated it as user
> > impact might be huge. Now I think that may be ... may be it is better to
> > remove it right now thus forcing users to migrate to better API quicker?
> >
> > Thoughts?
> >
> > Vladimir.
> >
>


Re: CREATE TABLE SQL command syntax

2017-04-13 Thread Sergi Vladykin
CREATE TABLE () WITH "cacheCfgUrl or templateName or anything you want"

Sergi

2017-04-13 12:43 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:

> Sergi,
>
> I would avoid exposing the word "CACHE" on the SQL side. I prefer that we
> work with tables. I can see a use for a table_configuration(...) function
> to create configuration templates, but how would you associate a
> configuration template with a table inside of "create table" statement?
>
> D.
>
> On Wed, Apr 12, 2017 at 11:22 PM, Sergi Vladykin <sergi.vlady...@gmail.com
> >
> wrote:
>
> > I do not think we need it.
> >
> > In standard SQL we already have KEY and COLUMN, also we already have
> CREATE
> > TABLE syntax. Adding AFFINITY to them is not a big deal.
> >
> > The thing CONFIGURATION looks like a completely new entity for SQL and I
> > prefer to avoid sticking it into H2, also I would avoid having it in
> > Ignite.
> >
> > If we need to create cache configuration templates in SQL, then lets use
> > functions:
> >
> > CALL NEW_CACHE_CONFIGURATION(...);
> >
> > They will be completely independent from H2. The only problem is that (as
> > we already discussed time ago) that for better usability we may need to
> > contribute named parameters for H2 functions. But this is the right thing
> > to do here.
> >
> > Sergi
> >
> > 2017-04-12 23:59 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
> >
> > > Got it. Can we also add CONFIGURATION keyword?
> > >
> > > On Wed, Apr 12, 2017 at 11:34 AM, Sergi Vladykin <
> > sergi.vlady...@gmail.com
> > > >
> > > wrote:
> > >
> > > > Dmitriy,
> > > >
> > > > H2 does not support any "user-specific" syntax and it should not.
> > Instead
> > > > it has a concept of Mode, which is basically a setting which allows
> H2
> > to
> > > > be compatible with other databases. For example, some keywords that
> > make
> > > > sense for other databases are just ignored, but this makes the
> > statement
> > > > from other BD work in H2.
> > > >
> > > > It allows us to introduce "ApacheIgnite" mode, which will allow to
> add
> > > some
> > > > minor tweaks into Parser. These tweaks will be covered by tests and
> no
> > > one
> > > > will be able to just silently break our code.
> > > >
> > > > Actually what I see is that we do not need any custom parsing at all,
> > all
> > > > we need is just need a couple of minor tweaks (like AFFINITY
> keyword),
> > > > other SQL must work as is. Thus trying to plug in a parser looks like
> > an
> > > > overkill and fragile idea a priori.
> > > >
> > > > Sergi
> > > >
> > > > 2017-04-12 20:40 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org
> >:
> > > >
> > > > > Hm... I think the truth is somewhere in the middle here.
> > > > >
> > > > > The syntax proposed by Sergi makes sense to me. However, I am still
> > > > > struggling why would H2 accept our patch, if it has AFFINITY KEY
> > > keyword
> > > > in
> > > > > it, which has nothing to do with H2.
> > > > >
> > > > > It does sound like certain portions of SQL do need to be plugable
> to
> > > > > support the user-specific syntax.
> > > > >
> > > > > Sergi, am I missing something?
> > > > >
> > > > > D.
> > > > >
> > > > > On Wed, Apr 12, 2017 at 8:51 AM, Sergi Vladykin <
> > > > sergi.vlady...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > If it is that little, then all this copy/paste shit-coding makes
> no
> > > > > sense.
> > > > > >
> > > > > > We have to add a respective mode to H2, add respective tests to
> H2,
> > > so
> > > > > that
> > > > > > other contributors of H2 will not occasionally break our stuff.
> > Thats
> > > > it.
> > > > > >
> > > > > > I will be the first H2 committer who will reject you patch, don't
> > > waste
> > > > > > your time.
> > > > > >
> > > > > > Sergi
> > > > > >
> > > > > > 2017-04-12 16:33 GMT+03:00 Alexander P

[jira] [Created] (IGNITE-4955) Correctly execute SQL queries started on replicated cache.

2017-04-13 Thread Sergi Vladykin (JIRA)
Sergi Vladykin created IGNITE-4955:
--

 Summary: Correctly execute SQL queries started on replicated cache.
 Key: IGNITE-4955
 URL: https://issues.apache.org/jira/browse/IGNITE-4955
 Project: Ignite
  Issue Type: Improvement
Reporter: Sergi Vladykin
Assignee: Sergi Vladykin
Priority: Blocker
 Fix For: 2.0








Re: CREATE TABLE SQL command syntax

2017-04-13 Thread Sergi Vladykin
I do not think we need it.

In standard SQL we already have KEY and COLUMN, also we already have CREATE
TABLE syntax. Adding AFFINITY to them is not a big deal.

The CONFIGURATION keyword looks like a completely new entity for SQL, and I
prefer to avoid sticking it into H2; I would also avoid having it in Ignite.

If we need to create cache configuration templates in SQL, then let's use
functions:

CALL NEW_CACHE_CONFIGURATION(...);

They will be completely independent of H2. The only problem (as we
already discussed some time ago) is that for better usability we may need to
contribute named parameters for H2 functions. But this is the right thing
to do here.

Sergi
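To make the AFFINITY KEY idea discussed in this thread concrete, here is a self-contained sketch. It uses a hypothetical modulo-hash affinity function and a hard-coded node count — an assumption for illustration, not Ignite's actual RendezvousAffinityFunction: the owning node is computed from the affinity column instead of the primary key, so all rows sharing an orgId land on the same node and can be joined locally.

```java
// Toy model of an affinity function (an assumption for illustration,
// not Ignite's implementation).
public class AffinitySketch {
    static final int NODES = 4;

    // Hypothetical affinity function: key -> owning node.
    static int node(int key) {
        return Math.floorMod(Integer.hashCode(key), NODES);
    }

    public static void main(String[] args) {
        // CREATE TABLE person (id INT PRIMARY KEY, orgId INT AFFINITY KEY, ...)
        int[][] persons = { {1, 7}, {2, 7}, {3, 7} };   // {id, orgId}, all in org 7
        for (int[] p : persons) {
            System.out.println("id=" + p[0]
                + " by-id node=" + node(p[0])           // hashing by id scatters rows
                + " by-orgId node=" + node(p[1]));      // hashing by orgId colocates them
        }
    }
}
```

Hashing by the primary key would spread the three rows across three nodes; hashing by the AFFINITY KEY column places them all on one node.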

2017-04-12 23:59 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:

> Got it. Can we also add CONFIGURATION keyword?
>
> On Wed, Apr 12, 2017 at 11:34 AM, Sergi Vladykin <sergi.vlady...@gmail.com
> >
> wrote:
>
> > Dmitriy,
> >
> > H2 does not support any "user-specific" syntax and it should not. Instead
> > it has a concept of Mode, which is basically a setting that allows H2 to
> > be compatible with other databases. For example, some keywords that make
> > sense for other databases are just ignored, but this makes the statement
> > from other DBs work in H2.
> >
> > It allows us to introduce "ApacheIgnite" mode, which will allow to add
> some
> > minor tweaks into Parser. These tweaks will be covered by tests and no
> one
> > will be able to just silently break our code.
> >
> > Actually what I see is that we do not need any custom parsing at all, all
> > we need is a couple of minor tweaks (like AFFINITY keyword),
> > other SQL must work as is. Thus trying to plug in a parser looks like
> > overkill and a fragile idea a priori.
> >
> > Sergi
> >
> > 2017-04-12 20:40 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
> >
> > > Hm... I think the truth is somewhere in the middle here.
> > >
> > > The syntax proposed by Sergi makes sense to me. However, I am still
> > > struggling why would H2 accept our patch, if it has AFFINITY KEY
> keyword
> > in
> > > it, which has nothing to do with H2.
> > >
> > > It does sound like certain portions of SQL do need to be pluggable to
> > > support the user-specific syntax.
> > >
> > > Sergi, am I missing something?
> > >
> > > D.
> > >
> > > On Wed, Apr 12, 2017 at 8:51 AM, Sergi Vladykin <
> > sergi.vlady...@gmail.com>
> > > wrote:
> > >
> > > > If it is that little, then all this copy/paste shit-coding makes no
> > > sense.
> > > >
> > > > We have to add a respective mode to H2, add respective tests to H2,
> so
> > > that
> > > > other contributors of H2 will not occasionally break our stuff. That's
> > it.
> > > >
> > > > I will be the first H2 committer who will reject your patch, don't
> waste
> > > > your time.
> > > >
> > > > Sergi
> > > >
> > > > 2017-04-12 16:33 GMT+03:00 Alexander Paschenko <
> > > > alexander.a.pasche...@gmail.com>:
> > > >
> > > > > Sergi,
> > > > >
> > > > > First, it would be as little as overriding the part responsible for
> > > > > CREATE TABLE - there's no need to touch anything else as luckily H2
> > > > > parser is internally structured well enough.
> > > > >
> > > > > Second, although it is not all-around perfect, I am most confident
> > > > > that this is far better than dragging into H2 a bunch of stuff that
> > they
> > > > > don't really need just because we need it there or can smuggle it
> there.
> > > > >
> > > > > I think I'll just spend some time in the weekend and come up with a
> > > > > prototype as otherwise this talk seems to be just a chit-chat.
> > > > >
> > > > > - Alex
> > > > >
> > > > > 2017-04-12 14:38 GMT+03:00 Sergi Vladykin <
> sergi.vlady...@gmail.com
> > >:
> > > > > > So basically in the inherited class you are going to copy/paste base
> > > class
> > > > > > methods and tweak them? I don't like this approach.
> > > > > >
> > > > > > Sergi
> > > > > >
> > > > > > 2017-04-12 14:07 GMT+03:00 Alexander Paschenko <
> > > > > > alexander.a.pasche...@gmail.com>:
> > 

Re: Question about cache and transaction on different nodes

2017-04-13 Thread Sergi Vladykin
Yes, sorry. The test works correctly: tx started on grid0 does not affect
cache1, because they are on different nodes. Thus the operation
cache1.put(1, ) is successfully committed.

Still I would not recommend relying on any of the observed behaviors here,
because Ignite was not designed for mixing caches and transactions from
different nodes in the same code. This behavior is undefined, untested and
may freely change at any time.

Sergi
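The behavior described above can be sketched with a toy model — plain Java only, an assumption for illustration and not Ignite's actual transaction machinery: a transaction buffers and rolls back only writes made through caches enlisted in it, while writes that go directly to another node's cache bypass the transaction and are applied immediately.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (not Ignite's implementation): only enlisted writes are
// transactional; direct writes to another cache bypass the tx.
public class TxScopeSketch {
    static class ToyTx {
        private final Map<Integer, Integer> enlisted;           // cache on the tx-owning node
        private final Map<Integer, Integer> buffer = new HashMap<>();

        ToyTx(Map<Integer, Integer> enlisted) { this.enlisted = enlisted; }

        void put(int k, int v) { buffer.put(k, v); }            // deferred until commit
        void commit()          { enlisted.putAll(buffer); }
        void rollback()        { buffer.clear(); }              // buffered writes discarded
    }

    public static void main(String[] args) {
        Map<Integer, Integer> cache0 = new HashMap<>();         // cache on the tx-owning node
        Map<Integer, Integer> cache1 = new HashMap<>();         // cache on a different node
        cache0.put(1, 1);
        cache1.put(1, 1);

        ToyTx tx = new ToyTx(cache0);
        tx.put(1, 2);                                           // enlisted: goes through the tx
        cache1.put(1, 2);                                       // NOT enlisted: applied directly
        tx.rollback();

        System.out.println(cache0.get(1));                      // 1 - rolled back
        System.out.println(cache1.get(1));                      // 2 - committed despite rollback
    }
}
```

In the test from the thread, `cache1` plays the role of the non-enlisted cache: its put is applied immediately, which is why the rollback has no effect on it.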

2017-04-13 0:08 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:

> There is no bug.
>
> Dmitriy, you should introduce a variable:
>
> *cache0 = grid(0).cache(null);*
>
> Then you should use cache0 variable to do a cache put.
>
> You cannot use transaction API from grid0 and then cache API from grid1. In
> a normal environment, the cache0 and cache1 variables would not even be
> present in the same JVM - they would be on different physical servers.
>
> D.
>
> On Wed, Apr 12, 2017 at 11:08 AM, Sergi Vladykin <sergi.vlady...@gmail.com
> >
> wrote:
>
> > Looks like a bug to me.
> >
> > Sergi
> >
> > 2017-04-12 21:03 GMT+03:00 Дмитрий Рябов <somefire...@gmail.com>:
> >
> > > Why not? I do something with the cache inside a transaction. The only reason
> to
> > > not rollback is another node?
> > >
> > > 2017-04-12 19:52 GMT+03:00 Andrey Mashenkov <
> andrey.mashen...@gmail.com
> > >:
> > >
> > > > Hi Dmitry,
> > > >
> > > > Looks like you start transaction on node "grid(0)", but update value
> on
> > > > another node "grid(1)".
> > > > So, technically, it is not nested transactions, right?
> > > >
> > > > On Wed, Apr 12, 2017 at 7:32 PM, Дмитрий Рябов <
> somefire...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hello, igniters. I start the node and create a transactional cache
> on
> > > it,
> > > > > on the other node I start the transaction and "put" in previously
> > > created
> > > > > cache and rollback transaction. So what should I get? Value stored
> > > before
> > > > > transaction or inside rolled transaction?
> > > > >
> > > > > public void testRollback() throws Exception {
> > > > > startGrid(0);
> > > > > startGrid(1);
> > > > > IgniteCache<Integer, Integer> cache1 = grid(1).cache(null);
> > > > > cache1.put(1, 1);
> > > > > try (Transaction tx = grid(0).transactions().
> > txStart(PESSIMISTIC,
> > > > READ_COMMITTED)) {
> > > > > cache1.put(1, );
> > > > > tx.rollback();
> > > > > }
> > > > > assertEquals((Integer) 1, cache1.get(1));
> > > > > }
> > > > >
> > > > >
> > > > > The question is why I got  instead of 1? If it is right
> > behaviour -
> > > > > why is it nowhere explained?
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Best regards,
> > > > Andrey V. Mashenkov
> > > >
> > >
> >
>


Re: Question about cache and transaction on different nodes

2017-04-12 Thread Sergi Vladykin
Looks like a bug to me.

Sergi

2017-04-12 21:03 GMT+03:00 Дмитрий Рябов :

> Why not? I do something with cache inside transaction. The only reason to
> not rollback is another node?
>
> 2017-04-12 19:52 GMT+03:00 Andrey Mashenkov :
>
> > Hi Dmitry,
> >
> > Looks like you start transaction on node "grid(0)", but update value on
> > another node "grid(1)".
> > So, technically, it is not nested transactions, right?
> >
> > On Wed, Apr 12, 2017 at 7:32 PM, Дмитрий Рябов 
> > wrote:
> >
> > > Hello, Igniters. I start a node and create a transactional cache on
> it;
> > > on another node I start a transaction, "put" into the previously
> created
> > > cache, and roll back the transaction. So what should I get? The value
> stored
> > > before the transaction, or the value from the rolled-back transaction?
> > >
> > > public void testRollback() throws Exception {
> > > startGrid(0);
> > > startGrid(1);
> > > IgniteCache<Integer, Integer> cache1 = grid(1).cache(null);
> > > cache1.put(1, 1);
> > > try (Transaction tx = grid(0).transactions().txStart(PESSIMISTIC,
> > READ_COMMITTED)) {
> > > cache1.put(1, );
> > > tx.rollback();
> > > }
> > > assertEquals((Integer) 1, cache1.get(1));
> > > }
> > >
> > >
> > > The question is why I got  instead of 1? If it is right behaviour -
> > > why is it nowhere explained?
> > >
> > >
> > >
> >
> >
> > --
> > Best regards,
> > Andrey V. Mashenkov
> >
>


Re: CREATE TABLE SQL command syntax

2017-04-12 Thread Sergi Vladykin
If it is that little, then all this copy/paste shit-coding makes no sense.

We have to add a respective mode to H2, add respective tests to H2, so that
other contributors of H2 will not occasionally break our stuff. That's it.

I will be the first H2 committer who will reject your patch, don't waste
your time.

Sergi

2017-04-12 16:33 GMT+03:00 Alexander Paschenko <
alexander.a.pasche...@gmail.com>:

> Sergi,
>
> First, it would be as little as overriding the part responsible for
> CREATE TABLE - there's no need to touch anything else as luckily H2
> parser is internally structured well enough.
>
> Second, although it is not all-around perfect, I am most confident
> that this is far better than dragging into H2 a bunch of stuff that they
> don't really need just because we need it there or can smuggle it in there.
>
> I think I'll just spend some time in the weekend and come up with a
> prototype as otherwise this talk seems to be just a chit-chat.
>
> - Alex
>
> 2017-04-12 14:38 GMT+03:00 Sergi Vladykin <sergi.vlady...@gmail.com>:
> > So basically in the inherited class you are going to copy/paste base class
> > methods and tweak them? I don't like this approach.
> >
> > Sergi
> >
> > 2017-04-12 14:07 GMT+03:00 Alexander Paschenko <
> > alexander.a.pasche...@gmail.com>:
> >
> >> Sergi,
> >>
> >> As I've written in my previous post, it would be just inheriting Parser
> on
> >> Ignite side and plugging its instance in SINGLE place. Just making H2's
> >> Parser internal methods protected instead of private would let us do the
> >> trick.
> >>
> >> — Alex
> >>
> >> среда, 12 апреля 2017 г. пользователь Sergi Vladykin написал:
> >>
> >> > I don't see how you make H2 Parser extendable, you will have to add
> >> plugin
> >> > call to every *potentially* extendable place in it. In general this
> does
> >> > not work. As H2 guy I would also reject patch like this.
> >> >
> >> > Sergi
> >> >
> >> > 2017-04-12 13:10 GMT+03:00 Alexander Paschenko <
> >> > alexander.a.pasche...@gmail.com>:
> >> >
> >> > > Sergi,
> >> > >
> >> > > Please have a closer look at what I've written in my first post. I
> >> don't
> >> > > see why we have to cling to H2 and its parsing modes all the time —
> >> after
> >> > > all, we're just talking string processing now, aren't we? (Yes,
> complex
> >> > and
> >> > > non trivial, but still.)
> >> > >
> >> > > What's wrong with idea of patching H2 to allow custom parsing? (With
> >> the
> >> > > parsing itself living in Ignite code, obviously, not in H2.).
> >> > >
> >> > > What I propose is just to make H2's Parser class extendable and
> make H2
> >> > > aware of its descendants via config params. And that's all with
> respect
> >> > to
> >> > > H2, nothing more.
> >> > >
> >> > > After that, on Ignite side we do all we want with our parser based
> on
> >> > > theirs. It resembles story with custom types — first we make H2
> >> > extendable
> >> > > in the way we need, then we introduce exact features we need on
> Ignite
> >> > > side.
> >> > >
> >> > > — Alex
> >> > >
> >> > > среда, 12 апреля 2017 г. пользователь Sergi Vladykin написал:
> >> > >
> >> > > > It definitely makes sense to add a separate mode for Ignite in H2.
> >> > Though
> >> > > > it is wrong to think that it will allow us to add any crazy
> syntax we
> >> > > want
> >> > > > (and it is actually a wrong idea imo), only the minor variations
> of
> >> the
> >> > > > existing syntax. But this must be enough.
> >> > > >
> >> > > > I believe we should end up with something like
> >> > > >
> >> > > > CREATE TABLE person
> >> > > > (
> >> > > >   id INT PRIMARY KEY,
> >> > > >   orgId INT AFFINITY KEY,
> >> > > >   name VARCHAR
> >> > > > )
> >> > > > WITH "cfg:my_config_template.xml"
> >> > > >
> >> > > > Sergi
> >> > > >
> >> > > >
> >> > > > 2017-04-12 7:54 GMT+03:00 Dmitriy Setrakya

Re: SQL on PARTITIONED vs REPLICATED cache

2017-04-12 Thread Sergi Vladykin
Good point, but I'm not sure. The difference is that on a client node you
should not be able to enable isLocal, while isReplicatedOnly is perfectly
valid. What do you think?

Sergi

2017-04-12 15:18 GMT+03:00 Andrey Mashenkov <andrey.mashen...@gmail.com>:

> Sergi,
>
> Got it.
>
> Will the query execution path and results be the same with the isReplicatedOnly
> flag as with the isLocal flag turned on?
> If my understanding is correct, we will get the same results and there is no
> need to introduce a new flag.
>
>
>
> On Wed, Apr 12, 2017 at 2:54 PM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > Ok, let it be an exception. I'm just saying that the thing does not work
> > now.
> >
> > Sergi
> >
> > 2017-04-12 14:50 GMT+03:00 Andrey Mashenkov <andrey.mashen...@gmail.com
> >:
> >
> > > Sergi,
> > >
> > > I wonder how that is possible.
> > >
> > > Looks like it is impossible to run a query on a replicated cache while
> > > selecting data from a
> > > partitioned table. It will result in an IllegalStateException on a stable
> > > topology or an
> > > IgniteCacheException on an unstable topology.
> > > See ReduceQueryExecutor.stableDataNodes() and
> > > replicatedUnstableDataNodes()
> > >  methods.
> > >
> > > BTW, an IllegalStateException with no message is confusing.
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Apr 12, 2017 at 2:36 PM, Sergi Vladykin <
> > sergi.vlady...@gmail.com>
> > > wrote:
> > >
> > > > Andrey,
> > > >
> > > > Because if you run query on replicated cache, but select data from a
> > > > partitioned table, you will get only a part of the result.
> > > >
> > > > Igor,
> > > >
> > > > You are mostly right, but
> > > >
> > > > 1. Performance characteristics may change.
> > > > 2. Ignite SQL processing pipeline may not support all the stuff in H2
> > SQL
> > > > and fail in some case where it worked previously.
> > > >
> > > > Because of this the change may affect existing applications and I
> want
> > to
> > > > have it in 2.0 to make it legal.
> > > >
> > > > Sergi
> > > >
> > > > 2017-04-12 14:10 GMT+03:00 Igor Sapego <isap...@gridgain.com>:
> > > >
> > > > > Also, is it really a breaking change if the results are wrong?
> > > > > To me it looks more like a bugfix, i.e. you can't break something
> > > > > that does not work properly.
> > > > >
> > > > > Best Regards,
> > > > > Igor
> > > > >
> > > > > On Wed, Apr 12, 2017 at 2:04 PM, Andrey Mashenkov <
> > > > > andrey.mashen...@gmail.com> wrote:
> > > > >
> > > > > > Sergi,
> > > > > >
> > > > > > How can a query to a replicated cache lead to wrong results?
> > > > > > Is it due to we can read backup entries?
> > > > > >
> > > > > > On Wed, Apr 12, 2017 at 12:31 PM, Sergi Vladykin <
> > > > > sergi.vlady...@gmail.com
> > > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Guys,
> > > > > > >
> > > > > > > I want to introduce another breaking change for 2.0.
> > > > > > >
> > > > > > > Currently SQL is being processed differently when we call
> method
> > > > > `query`
> > > > > > on
> > > > > > > partitioned cache and on replicated: on replicated cache we do
> > not
> > > do
> > > > > any
> > > > > > > extra processing and execute the query as is on current node.
> > > > > > >
> > > > > > > This behavior historically existed for performance reasons. But
> > it
> > > is
> > > > > not
> > > > > > > obvious and leads to wrong query results. This issue becomes
> even
> > > > more
> > > > > > > creepy with JDBC and ODBC drivers.
> > > > > > >
> > > > > > > In 2.0 I want to execute all the SQL queries the same way
> through
> > > the
> > > > > > whole
> > > > > > > processing pipeline to guarantee the correct result
> > > > > > > irrespective of the
> > > > > > > cache that was the query originator.
> > > > > > >
> > > > > > > To be able to have the old behavior (skip all the preprocessing
> > and
> > > > run
> > > > > > > query on current node) add a flag isReplicatedOnly() on
> SqlQuery
> > > and
> > > > > > > SqlFieldsQuery. It will be disabled by default and if one knows
> > > that
> > > > > the
> > > > > > > only replicated tables participate in a query, then he can
> enable
> > > it
> > > > > for
> > > > > > > better performance.
> > > > > > >
> > > > > > > Sergi
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Best regards,
> > > > > > Andrey V. Mashenkov
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Best regards,
> > > Andrey V. Mashenkov
> > >
> >
>
>
>
> --
> Best regards,
> Andrey V. Mashenkov
>


Re: SQL on PARTITIONED vs REPLICATED cache

2017-04-12 Thread Sergi Vladykin
Ok, let it be an exception. I'm just saying that the thing does not work
now.

Sergi

2017-04-12 14:50 GMT+03:00 Andrey Mashenkov <andrey.mashen...@gmail.com>:

> Sergi,
>
> I wonder how that is possible.
>
> Looks like it is impossible to run a query on a replicated cache while
> selecting data from a
> partitioned table. It will result in an IllegalStateException on a stable
> topology or an
> IgniteCacheException on an unstable topology.
> See ReduceQueryExecutor.stableDataNodes() and
> replicatedUnstableDataNodes()
>  methods.
>
> BTW, an IllegalStateException with no message is confusing.
>
>
>
>
>
> On Wed, Apr 12, 2017 at 2:36 PM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > Andrey,
> >
> > Because if you run query on replicated cache, but select data from a
> > partitioned table, you will get only a part of the result.
> >
> > Igor,
> >
> > You are mostly right, but
> >
> > 1. Performance characteristics may change.
> > 2. Ignite SQL processing pipeline may not support all the stuff in H2 SQL
> > and fail in some case where it worked previously.
> >
> > Because of this the change may affect existing applications and I want to
> > have it in 2.0 to make it legal.
> >
> > Sergi
> >
> > 2017-04-12 14:10 GMT+03:00 Igor Sapego <isap...@gridgain.com>:
> >
> > > Also, is it really a breaking change if the results are wrong?
> > > To me it looks more like a bugfix, i.e. you can't break something
> > > that does not work properly.
> > >
> > > Best Regards,
> > > Igor
> > >
> > > On Wed, Apr 12, 2017 at 2:04 PM, Andrey Mashenkov <
> > > andrey.mashen...@gmail.com> wrote:
> > >
> > > > Sergi,
> > > >
> > > > How can a query to a replicated cache lead to wrong results?
> > > > Is it due to we can read backup entries?
> > > >
> > > > On Wed, Apr 12, 2017 at 12:31 PM, Sergi Vladykin <
> > > sergi.vlady...@gmail.com
> > > > >
> > > > wrote:
> > > >
> > > > > Guys,
> > > > >
> > > > > I want to introduce another breaking change for 2.0.
> > > > >
> > > > > Currently SQL is being processed differently when we call method
> > > `query`
> > > > on
> > > > > partitioned cache and on replicated: on replicated cache we do not
> do
> > > any
> > > > > extra processing and execute the query as is on current node.
> > > > >
> > > > > This behavior historically existed for performance reasons. But it
> is
> > > not
> > > > > obvious and leads to wrong query results. This issue becomes even
> > more
> > > > > creepy with JDBC and ODBC drivers.
> > > > >
> > > > > In 2.0 I want to execute all the SQL queries the same way through
> the
> > > > whole
> > > > > processing pipeline to guarantee the correct result irrespective
> > > > > of the
> > > > > cache that was the query originator.
> > > > >
> > > > > To be able to have the old behavior (skip all the preprocessing and
> > run
> > > > > query on current node) add a flag isReplicatedOnly() on SqlQuery
> and
> > > > > SqlFieldsQuery. It will be disabled by default and if one knows
> that
> > > the
> > > > > only replicated tables participate in a query, then he can enable
> it
> > > for
> > > > > better performance.
> > > > >
> > > > > Sergi
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best regards,
> > > > Andrey V. Mashenkov
> > > >
> > >
> >
>
>
>
> --
> Best regards,
> Andrey V. Mashenkov
>


Re: CREATE TABLE SQL command syntax

2017-04-12 Thread Sergi Vladykin
So basically in the inherited class you are going to copy/paste base class
methods and tweak them? I don't like this approach.

Sergi

2017-04-12 14:07 GMT+03:00 Alexander Paschenko <
alexander.a.pasche...@gmail.com>:

> Sergi,
>
> As I've written in my previous post, it would be just inheriting Parser on
> Ignite side and plugging its instance in SINGLE place. Just making H2's
> Parser internal methods protected instead of private would let us do the
> trick.
>
> — Alex
>
> среда, 12 апреля 2017 г. пользователь Sergi Vladykin написал:
>
> > I don't see how you make H2 Parser extendable, you will have to add
> plugin
> > call to every *potentially* extendable place in it. In general this does
> > not work. As H2 guy I would also reject patch like this.
> >
> > Sergi
> >
> > 2017-04-12 13:10 GMT+03:00 Alexander Paschenko <
> > alexander.a.pasche...@gmail.com>:
> >
> > > Sergi,
> > >
> > > Please have a closer look at what I've written in my first post. I
> don't
> > > see why we have to cling to H2 and its parsing modes all the time —
> after
> > > all, we're just talking string processing now, aren't we? (Yes, complex
> > and
> > > non trivial, but still.)
> > >
> > > What's wrong with idea of patching H2 to allow custom parsing? (With
> the
> > > parsing itself living in Ignite code, obviously, not in H2.).
> > >
> > > What I propose is just to make H2's Parser class extendable and make H2
> > > aware of its descendants via config params. And that's all with respect
> > to
> > > H2, nothing more.
> > >
> > > After that, on Ignite side we do all we want with our parser based on
> > > theirs. It resembles story with custom types — first we make H2
> > extendable
> > > in the way we need, then we introduce exact features we need on Ignite
> > > side.
> > >
> > > — Alex
> > >
> > > среда, 12 апреля 2017 г. пользователь Sergi Vladykin написал:
> > >
> > > > It definitely makes sense to add a separate mode for Ignite in H2.
> > Though
> > > > it is wrong to think that it will allow us to add any crazy syntax we
> > > want
> > > > (and it is actually a wrong idea imo), only the minor variations of
> the
> > > > existing syntax. But this must be enough.
> > > >
> > > > I believe we should end up with something like
> > > >
> > > > CREATE TABLE person
> > > > (
> > > >   id INT PRIMARY KEY,
> > > >   orgId INT AFFINITY KEY,
> > > >   name VARCHAR
> > > > )
> > > > WITH "cfg:my_config_template.xml"
> > > >
> > > > Sergi
> > > >
> > > >
> > > > 2017-04-12 7:54 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
> > > >
> > > > > Agree, the updated syntax looks better. One change though: KEY ->
> > > PRIMARY
> > > > > KEY.
> > > > >
> > > > > Sergi, what do you think?
> > > > >
> > > > > D.
> > > > >
> > > > > > On Tue, Apr 11, 2017 at 9:50 PM, Pavel Tupitsyn <ptupit...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > I think "WITH" syntax is ugly and cumbersome.
> > > > > >
> > > > > > We should go with this one:
> > > > > > CREATE TABLE Person (id int AFFINITY KEY, uid uuid KEY, firstName
> > > > > > varchar, lastName varchar)
> > > > > >
> > > > > > All databases (e.g. [1], [2]) work this way; I see no reason to
> > > invent
> > > > > > something different and confuse the users.
> > > > > >
> > > > > > [1]
> > > > > > https://docs.microsoft.com/en-us/sql/t-sql/statements/create
> > > > > > -table-transact-sql#syntax-1
> > > > > > [2] https://www.postgresql.org/docs/9.1/static/sql-
> > createtable.html
> > > > > >
> > > > > > On Wed, Apr 12, 2017 at 6:12 AM, Alexander Paschenko <
> > > > > > alexander.a.pasche...@gmail.com>
> > wrote:
> > > > > >
> > > > > > > Dmit

Re: SQL on PARTITIONED vs REPLICATED cache

2017-04-12 Thread Sergi Vladykin
Andrey,

Because if you run query on replicated cache, but select data from a
partitioned table, you will get only a part of the result.

Igor,

You are mostly right, but

1. Performance characteristics may change.
2. The Ignite SQL processing pipeline may not support all the stuff in H2 SQL
and fail in some cases where it worked previously.

Because of this the change may affect existing applications and I want to
have it in 2.0 to make it legal.

Sergi
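The partial-result problem described above can be sketched with a toy model — plain Java with maps standing in for nodes, a hypothetical modulo-hash affinity function, and none of Ignite's actual query engine: a PARTITIONED table spreads rows across nodes, so a query executed only on the local node without the map/reduce pipeline sees only the local partitions, whereas a REPLICATED table holds every row on every node, so a local-only scan happens to be complete.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model (not Ignite's query engine) of local-only scans over
// partitioned vs replicated data.
public class LocalScanSketch {
    static final int NODES = 3;

    // Hypothetical affinity function: key -> owning node.
    static int owner(int key) {
        return Math.floorMod(Integer.hashCode(key), NODES);
    }

    public static void main(String[] args) {
        List<Set<Integer>> partitioned = new ArrayList<>();
        List<Set<Integer>> replicated = new ArrayList<>();
        for (int n = 0; n < NODES; n++) {
            partitioned.add(new HashSet<>());
            replicated.add(new HashSet<>());
        }
        for (int key = 0; key < 100; key++) {
            partitioned.get(owner(key)).add(key);       // one primary copy on its owner
            for (Set<Integer> node : replicated)        // a full copy on every node
                node.add(key);
        }
        // A "SELECT COUNT(*)" executed only on node 0, skipping the reduce step:
        System.out.println(partitioned.get(0).size());  // 34 - only the local third
        System.out.println(replicated.get(0).size());   // 100 - complete, but only by luck
    }
}
```

This is why running a query as-is on the originating node is only safe when every table involved is replicated — the behavior the proposed isReplicatedOnly flag would opt into explicitly.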

2017-04-12 14:10 GMT+03:00 Igor Sapego <isap...@gridgain.com>:

> Also, is it really a breaking change if the results are wrong?
> To me it looks more like a bugfix, i.e. you can't break something
> that does not work properly.
>
> Best Regards,
> Igor
>
> On Wed, Apr 12, 2017 at 2:04 PM, Andrey Mashenkov <
> andrey.mashen...@gmail.com> wrote:
>
> > Sergi,
> >
> > How can a query to a replicated cache lead to wrong results?
> > Is it due to we can read backup entries?
> >
> > On Wed, Apr 12, 2017 at 12:31 PM, Sergi Vladykin <
> sergi.vlady...@gmail.com
> > >
> > wrote:
> >
> > > Guys,
> > >
> > > I want to introduce another breaking change for 2.0.
> > >
> > > Currently SQL is being processed differently when we call method
> `query`
> > on
> > > partitioned cache and on replicated: on replicated cache we do not do
> any
> > > extra processing and execute the query as is on current node.
> > >
> > > This behavior historically existed for performance reasons. But it is
> not
> > > obvious and leads to wrong query results. This issue becomes even more
> > > creepy with JDBC and ODBC drivers.
> > >
> > > In 2.0 I want to execute all the SQL queries the same way through the
> > whole
> > > processing pipeline to guarantee the correct result irrespective of
> > > the
> > > cache that was the query originator.
> > >
> > > To be able to have the old behavior (skip all the preprocessing and run
> > > query on current node) add a flag isReplicatedOnly() on SqlQuery and
> > > SqlFieldsQuery. It will be disabled by default and if one knows that
> the
> > > only replicated tables participate in a query, then he can enable it
> for
> > > better performance.
> > >
> > > Sergi
> > >
> >
> >
> >
> > --
> > Best regards,
> > Andrey V. Mashenkov
> >
>


Re: CREATE TABLE SQL command syntax

2017-04-12 Thread Sergi Vladykin
Sergey,

We've already discussed this and decided to have a cache per table, because
otherwise the user will be forced to have unique keys across multiple
independent tables, which is bad.

Thus the idea with TABLESPACE does not really work for us.

Sergi
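The key-uniqueness argument above can be sketched in plain Java — a toy model of the cache keyspace, not Ignite's storage: if two independent tables shared one cache, they would share one keyspace, and equal keys from different tables would silently overwrite each other; a cache per table gives each table its own keyspace.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of cache keyspaces (an illustration, not Ignite's storage).
public class KeyspaceSketch {
    public static void main(String[] args) {
        // One shared cache for two tables: equal keys collide.
        Map<Integer, String> shared = new HashMap<>();
        shared.put(1, "Person#1");
        shared.put(1, "Organization#1");        // silently clobbers Person#1
        System.out.println(shared.get(1));      // Organization#1

        // A cache per table: the same key is fine in both.
        Map<Integer, String> persons = new HashMap<>();
        Map<Integer, String> orgs = new HashMap<>();
        persons.put(1, "Person#1");
        orgs.put(1, "Organization#1");
        System.out.println(persons.get(1));     // Person#1
    }
}
```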

2017-04-12 13:15 GMT+03:00 Sergi Vladykin <sergi.vlady...@gmail.com>:

> I don't see how you make H2 Parser extendable; you will have to add a plugin
> call to every *potentially* extendable place in it. In general this does
> not work. As H2 guy I would also reject patch like this.
>
> Sergi
>
> 2017-04-12 13:10 GMT+03:00 Alexander Paschenko <
> alexander.a.pasche...@gmail.com>:
>
>> Sergi,
>>
>> Please have a closer look at what I've written in my first post. I don't
>> see why we have to cling to H2 and its parsing modes all the time — after
>> all, we're just talking string processing now, aren't we? (Yes, complex
>> and
>> non trivial, but still.)
>>
>> What's wrong with the idea of patching H2 to allow custom parsing? (With the
>> parsing itself living in Ignite code, obviously, not in H2.)
>>
>> What I propose is just to make H2's Parser class extendable and make H2
>> aware of its descendants via config params. And that's all with respect to
>> H2, nothing more.
>>
>> After that, on Ignite side we do all we want with our parser based on
>> theirs. It resembles story with custom types — first we make H2 extendable
>> in the way we need, then we introduce exact features we need on Ignite
>> side.
>>
>> — Alex
>>
>> среда, 12 апреля 2017 г. пользователь Sergi Vladykin написал:
>>
>> > It definitely makes sense to add a separate mode for Ignite in H2.
>> Though
>> > it is wrong to think that it will allow us to add any crazy syntax we
>> want
>> > (and it is actually a wrong idea imo), only the minor variations of the
>> > existing syntax. But this must be enough.
>> >
>> > I believe we should end up with something like
>> >
>> > CREATE TABLE person
>> > (
>> >   id INT PRIMARY KEY,
>> >   orgId INT AFFINITY KEY,
>> >   name VARCHAR
>> > )
>> > WITH "cfg:my_config_template.xml"
>> >
>> > Sergi
>> >
>> >
>> > 2017-04-12 7:54 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
>> >
>> > > Agree, the updated syntax looks better. One change though: KEY ->
>> PRIMARY
>> > > KEY.
>> > >
>> > > Sergi, what do you think?
>> > >
>> > > D.
>> > >
>> > > On Tue, Apr 11, 2017 at 9:50 PM, Pavel Tupitsyn <ptupit...@apache.org>
>> > > wrote:
>> > >
>> > > > I think "WITH" syntax is ugly and cumbersome.
>> > > >
>> > > > We should go with this one:
>> > > > CREATE TABLE Person (id int AFFINITY KEY, uid uuid KEY, firstName
>> > > > varchar, lastName varchar)
>> > > >
>> > > > All databases (e.g. [1], [2]) work this way; I see no reason to
>> invent
>> > > > something different and confuse the users.
>> > > >
>> > > > [1]
>> > > > https://docs.microsoft.com/en-us/sql/t-sql/statements/create
>> > > > -table-transact-sql#syntax-1
>> > > > [2] https://www.postgresql.org/docs/9.1/static/sql-createtable.html
>> > > >
>> > > > On Wed, Apr 12, 2017 at 6:12 AM, Alexander Paschenko <
>> > > > alexander.a.pasche...@gmail.com> wrote:
>> > > >
>> > > > > Dmitry,
>> > > > >
>> > > > > For H2 it would be something like this - please note all those
>> > quotes,
>> > > > > commas and equality signs that would be mandatory:
>> > > > >
>> > > > > CREATE TABLE Person (id int, uid uuid, firstName varchar, lastName
>> > > > > varchar) WITH "keyFields=id,uuid","affinityKey=id"
>> > > > >
>> > > > > With suggested approach, it would be something like
>> > > > >
>> > > > > CREATE TABLE Person (id int AFFINITY KEY, uid uuid KEY, firstName
>> > > > > varchar, lastName varchar)
>> > > > >
>> > > > > While this may not look like a drastic improvement in this
>> particular
>> > > > > case, we someday most likely will w

Re: CREATE TABLE SQL command syntax

2017-04-12 Thread Sergi Vladykin
I don't see how you make H2 Parser extendable; you will have to add a plugin
call to every *potentially* extendable place in it. In general this does
not work. As H2 guy I would also reject patch like this.

Sergi

2017-04-12 13:10 GMT+03:00 Alexander Paschenko <
alexander.a.pasche...@gmail.com>:

> Sergi,
>
> Please have a closer look at what I've written in my first post. I don't
> see why we have to cling to H2 and its parsing modes all the time — after
> all, we're just talking string processing now, aren't we? (Yes, complex and
> non trivial, but still.)
>
> What's wrong with the idea of patching H2 to allow custom parsing? (With the
> parsing itself living in Ignite code, obviously, not in H2.)
>
> What I propose is just to make H2's Parser class extendable and make H2
> aware of its descendants via config params. And that's all with respect to
> H2, nothing more.
>
> After that, on Ignite side we do all we want with our parser based on
> theirs. It resembles story with custom types — first we make H2 extendable
> in the way we need, then we introduce exact features we need on Ignite
> side.
>
> — Alex
>
> среда, 12 апреля 2017 г. пользователь Sergi Vladykin написал:
>
> > It definitely makes sense to add a separate mode for Ignite in H2. Though
> > it is wrong to think that it will allow us to add any crazy syntax we
> want
> > (and it is actually a wrong idea imo), only the minor variations of the
> > existing syntax. But this must be enough.
> >
> > I believe we should end up with something like
> >
> > CREATE TABLE person
> > (
> >   id INT PRIMARY KEY,
> >   orgId INT AFFINITY KEY,
> >   name VARCHAR
> > )
> > WITH "cfg:my_config_template.xml"
> >
> > Sergi
> >
> >
> > 2017-04-12 7:54 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
> >
> > > Agree, the updated syntax looks better. One change though: KEY ->
> PRIMARY
> > > KEY.
> > >
> > > Sergi, what do you think?
> > >
> > > D.
> > >
> > > On Tue, Apr 11, 2017 at 9:50 PM, Pavel Tupitsyn <ptupit...@apache.org
> > <javascript:;>>
> > > wrote:
> > >
> > > > I think "WITH" syntax is ugly and cumbersome.
> > > >
> > > > We should go with this one:
> > > > CREATE TABLE Person (id int AFFINITY KEY, uid uuid KEY, firstName
> > > > varchar, lastName varchar)
> > > >
> > > > All databases (i.e. [1], [2]) work this way, I see no reason to
> invent
> > > > something different and confuse the users.
> > > >
> > > > [1]
> > > > https://docs.microsoft.com/en-us/sql/t-sql/statements/create
> > > > -table-transact-sql#syntax-1
> > > > [2] https://www.postgresql.org/docs/9.1/static/sql-createtable.html
> > > >
> > > > On Wed, Apr 12, 2017 at 6:12 AM, Alexander Paschenko <
> > > > alexander.a.pasche...@gmail.com <javascript:;>> wrote:
> > > >
> > > > > Dmitry,
> > > > >
> > > > > For H2 it would be something like this - please note all those
> > quotes,
> > > > > commas and equality signs that would be mandatory:
> > > > >
> > > > > CREATE TABLE Person (id int, uid uuid, firstName varchar, lastName
> > > > > varchar) WITH "keyFields=id,uuid","affinityKey=id"
> > > > >
> > > > > With suggested approach, it would be something like
> > > > >
> > > > > CREATE TABLE Person (id int AFFINITY KEY, uid uuid KEY, firstName
> > > > > varchar, lastName varchar)
> > > > >
> > > > > While this may not look like a drastic improvement in this
> particular
> > > > > case, we someday most likely will want either an all-custom CREATE
> > > > > CACHE command, or a whole bunch of new options for CREATE TABLE, if
> > we
> > > > > decide not to go with CREATE CACHE - I personally think that stuff
> > > > > like
> > > > >
> > > > > CREATE TABLE ... WITH
> > > > > "keyFields=id,uuid","affinityKey=id","cacheType=
> > > partitioned","atomicity=
> > > > > atomic","partitions=3"
> > > > >
> > > > > which will arise if we continue to try to stuff everything into
> WITH
> > > > > will just bring more ugliness with time, and that's not to mention

SQL on PARTITIONED vs REPLICATED cache

2017-04-12 Thread Sergi Vladykin
Guys,

I want to introduce another breaking change for 2.0.

Currently SQL is processed differently when we call the `query` method on a
partitioned cache versus a replicated one: on a replicated cache we do not do
any extra processing and execute the query as is on the current node.

This behavior has historically existed for performance reasons. But it is not
obvious and leads to wrong query results. The issue becomes even creepier with
the JDBC and ODBC drivers.

In 2.0 I want to execute all SQL queries the same way through the whole
processing pipeline, to guarantee the correct result irrespective of the cache
that originated the query.

To keep the old behavior available (skip all the preprocessing and run the
query on the current node), I will add an isReplicatedOnly() flag on SqlQuery
and SqlFieldsQuery. It will be disabled by default; if one knows that only
replicated tables participate in a query, one can enable it for better
performance.

Sergi
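[Editor's note] A self-contained toy model of the proposed flag (the class and
method names below are illustrative, not Ignite's real API): the flag simply
chooses between local execution and the full distributed map/reduce pipeline.

```java
// Toy model (NOT Ignite API): illustrates the proposed isReplicatedOnly()
// flag deciding between local execution and the distributed pipeline.
public class QueryRouting {
    static class SqlFieldsQuery {
        final String sql;
        boolean replicatedOnly; // disabled by default, as proposed

        SqlFieldsQuery(String sql) { this.sql = sql; }

        SqlFieldsQuery setReplicatedOnly(boolean on) {
            this.replicatedOnly = on;
            return this;
        }
    }

    /** Returns which execution path the query would take. */
    static String route(SqlFieldsQuery qry) {
        // If the user guarantees only replicated tables participate,
        // the whole query can run on the local node without map/reduce.
        return qry.replicatedOnly ? "LOCAL" : "DISTRIBUTED";
    }

    public static void main(String[] args) {
        SqlFieldsQuery q = new SqlFieldsQuery("SELECT * FROM Person");
        System.out.println(route(q));                         // DISTRIBUTED
        System.out.println(route(q.setReplicatedOnly(true))); // LOCAL
    }
}
```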


Re: Sorting fields of Binarilyzable objects on write

2017-04-12 Thread Sergi Vladykin
Vladimir,

I think we have to disallow conditional writes here, because DML should
write all the fields, no?

Sergi
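[Editor's note] A sketch of the order validation Dmitriy suggests below
(hypothetical writer class, not the real BinaryWriter API): track the
previously written field name and throw when a write violates alphabetical
order. This rejects out-of-order writes such as Vladimir's example (where "A"
follows "C"), though it cannot detect a conditionally skipped field.

```java
// Hypothetical order-validating writer (not Ignite's BinaryWriter):
// enforces alphabetical field order and fails fast on violations.
public class OrderValidatingWriter {
    private String lastField;

    private void checkOrder(String field) {
        if (lastField != null && field.compareTo(lastField) <= 0)
            throw new IllegalStateException(
                "Field '" + field + "' written after '" + lastField
                    + "'; fields must be written in alphabetical order");
        lastField = field;
    }

    public void writeBoolean(String field, boolean v) { checkOrder(field); /* write v */ }
    public void writeInt(String field, int v)         { checkOrder(field); /* write v */ }

    public static void main(String[] args) {
        OrderValidatingWriter w = new OrderValidatingWriter();
        w.writeBoolean("C", true);
        try {
            w.writeInt("A", 1); // "A" < "C" -> out of order, rejected
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```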

2017-04-12 11:07 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:

> Consider the following code:
>
> void writeBinary(BinaryWriter w) {
> w.writeBoolean("C", c);
>
> if (c)
> w.writeInt("A", a)
> else
> w.writeInt("B", b)
> }
>
> How are we going to force user to follow the contract in this case?
>
> On Wed, Apr 12, 2017 at 9:16 AM, Dmitriy Setrakyan <dsetrak...@apache.org>
> wrote:
>
> > I think it is OK for users to do their own sorting, but we should
> > definitely validate the correct order and throw an exception if it is
> not.
> >
> > D.
> >
> > On Tue, Apr 11, 2017 at 11:02 PM, Pavel Tupitsyn <ptupit...@apache.org>
> > wrote:
> >
> > > QueryEntity order is not only harder for the users, it will be
> nightmare
> > to
> > > implement.
> > > What if there is no QueryEntity defined? What if the same class is used
> > in
> > > multiple QueryEntity?
> > > I don't think serialization code has to be tied to QueryEntity in any
> > way,
> > > this violates separation of concerns.
> > >
> > > So I guess we can agree on sorting fields alphabetically.
> > >
> > > Let's get to the initial question:
> > > * Should we do the sorting for the user (performance hit)?
> > > * Should we at least validate user-defined order?
> > >
> > > On Wed, Apr 12, 2017 at 2:12 AM, Dmitriy Setrakyan <
> > dsetrak...@apache.org>
> > > wrote:
> > >
> > > > On Tue, Apr 11, 2017 at 1:28 PM, Sergi Vladykin <
> > > sergi.vlady...@gmail.com>
> > > > wrote:
> > > >
> > > > > I'm just trying to understand the current state of things and
> risks.
> > > May
> > > > be
> > > > > we need to do some adjustments here before 2.0 to be on the safe
> > side.
> > > > >
> > > > > Actually looks like this not really important, we just have to
> > clearly
> > > > > document that DML builds keys this way and require from user to do
> > the
> > > > same
> > > > > to be able to use cache API.
> > > > >
> > > >
> > > >
> > > > I think it is important from the usability stand point. A user can
> > always
> > > > sort fields alphabetically in his or her mind. However, trying to
> > > remember
> > > > the field order from some QueryEntity is a lot harder.
> > > >
> > >
> >
>


Re: CREATE TABLE SQL command syntax

2017-04-12 Thread Sergi Vladykin
It definitely makes sense to add a separate mode for Ignite in H2. Though it is
wrong to think that it will allow us to add any crazy syntax we want (and that
is actually a bad idea imo); it will allow only minor variations of the
existing syntax. But that should be enough.

I believe we should end up with something like

CREATE TABLE person
(
  id INT PRIMARY KEY,
  orgId INT AFFINITY KEY,
  name VARCHAR
)
WITH "cfg:my_config_template.xml"

Sergi


2017-04-12 7:54 GMT+03:00 Dmitriy Setrakyan :

> Agree, the updated syntax looks better. One change though: KEY -> PRIMARY
> KEY.
>
> Sergi, what do you think?
>
> D.
>
> On Tue, Apr 11, 2017 at 9:50 PM, Pavel Tupitsyn 
> wrote:
>
> > I think "WITH" syntax is ugly and cumbersome.
> >
> > We should go with this one:
> > CREATE TABLE Person (id int AFFINITY KEY, uid uuid KEY, firstName
> > varchar, lastName varchar)
> >
> > All databases (i.e. [1], [2]) work this way, I see no reason to invent
> > something different and confuse the users.
> >
> > [1]
> > https://docs.microsoft.com/en-us/sql/t-sql/statements/create
> > -table-transact-sql#syntax-1
> > [2] https://www.postgresql.org/docs/9.1/static/sql-createtable.html
> >
> > On Wed, Apr 12, 2017 at 6:12 AM, Alexander Paschenko <
> > alexander.a.pasche...@gmail.com> wrote:
> >
> > > Dmitry,
> > >
> > > For H2 it would be something like this - please note all those quotes,
> > > commas and equality signs that would be mandatory:
> > >
> > > CREATE TABLE Person (id int, uid uuid, firstName varchar, lastName
> > > varchar) WITH "keyFields=id,uuid","affinityKey=id"
> > >
> > > With suggested approach, it would be something like
> > >
> > > CREATE TABLE Person (id int AFFINITY KEY, uid uuid KEY, firstName
> > > varchar, lastName varchar)
> > >
> > > While this may not look like a drastic improvement in this particular
> > > case, we someday most likely will want either an all-custom CREATE
> > > CACHE command, or a whole bunch of new options for CREATE TABLE, if we
> > > decide not to go with CREATE CACHE - I personally think that stuff
> > > like
> > >
> > > CREATE TABLE ... WITH
> > > "keyFields=id,uuid","affinityKey=id","cacheType=
> partitioned","atomicity=
> > > atomic","partitions=3"
> > >
> > > which will arise if we continue to try to stuff everything into WITH
> > > will just bring more ugliness with time, and that's not to mention
> > > that new CREATE CACHE syntax will be impossible or relatively hard to
> > > introduce as we will have to approve it with H2 folks, and that's how
> > > it will be with any new param or command that we want.
> > >
> > > Allowing to plug custom parser into H2 (as we do now with table
> > > engine) will let us introduce any syntax we want and focus on
> > > usability and not on compromises and workarounds (which WITH keyword
> > > currently is).
> > >
> > > - Alex
> > >
> > > 2017-04-12 5:11 GMT+03:00 Dmitriy Setrakyan :
> > > > Alexeander,
> > > >
> > > > Can you please provide an example of what the CREATE TABLE command
> > would
> > > > look like if we use WITH syntax from H2 vs. what you are proposing?
> > > >
> > > > D.
> > > >
> > > > On Tue, Apr 11, 2017 at 6:35 PM, Alexander Paschenko <
> > > > alexander.a.pasche...@gmail.com> wrote:
> > > >
> > > >> Hello Igniters,
> > > >>
> > > >> Yup, it's THAT time once again as we haven't ultimately settled on
> > > >> anything with the subj. as of yet, but I believe that now with DDL
> on
> > > >> its way this talk can't be avoided anymore (sorry guys).
> > > >>
> > > >> The last time we talked about Ignite specific stuff we need to have
> in
> > > >> CREATE TABLE (key fields list, affinity key, am I missing
> anything?),
> > > >> the simplest approach suggested by Sergi was that we simply use WITH
> > > >> part of H2's CREATE TABLE to pass stuff we need.
> > > >>
> > > >> This could work, but needless to say that such commands would look
> > plain
> > > >> ugly.
> > > >>
> > > >> I think we should go with custom syntax after all, BUT not in a way
> > > >> suggested before by Sergi (propose Apache Ignite mode to H2).
> > > >>
> > > >> Instead, I suggest that we propose to H2 patch that would allow
> > > >> plugging in *custom SQL parser* directly based on theirs (quite
> > > >> elegant one) – I've had a look at their code, and this should not be
> > > >> hard.
> > > >>
> > > >> Work on such a patch making syntax parsing overridable would take a
> > > >> couple days which is not much time AND would give us the opportunity
> > > >> to introduce to Ignite virtually any syntax we wish - both now and
> in
> > > >> the future. Without worrying about compatibility with H2 ever again,
> > > >> that is.
> > > >>
> > > >> Thoughts? After we agree on this principally and after H2 patch for
> > > >> custom parsing is ready, we can roll our sleeves and focus on syntax
> > > >> itself.
> > > >>
> > > >> - Alex
> > > >>
> > >
> >
>


Re: Sorting fields of Binarilyzable objects on write

2017-04-11 Thread Sergi Vladykin
I'm just trying to understand the current state of things and the risks. Maybe
we need to make some adjustments here before 2.0 to be on the safe side.

Actually, it looks like this is not really important; we just have to clearly
document that DML builds keys this way and require the user to do the same to
be able to use the cache API.

Sergi

2017-04-11 21:47 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:

> Sergi, why do you not like the alphabetical order? Seems like it would be a
> much easier to understand requirement from the user standpoint. Do you see
> some issues here?
>
> On Mon, Apr 10, 2017 at 10:43 AM, Sergi Vladykin <sergi.vlady...@gmail.com
> >
> wrote:
>
> > I'm sorry, looks like I do not really well understand this stuff, but it
> is
> > still not clear to me why wouldn't we just take the order of key fields
> > given in QueryEntity and use it for both cases irrespectively to the
> order
> > of fields in regular Class or in Binarylizable?
> >
> > I mean lets say we have a Class (with unpredictable order of fields
> > according to reflection):
> >
> >Person implements Serializable
> >   {int age, double salary}
> >
> > Or we have a Binarylizable (with some unknown user defined order of
> > fields):
> >
> >   Person implements Binarylizable {
> >  writeBinary(w) {w.writeDouble("salary", salary),  w.writeInt(''age",
> > age)}
> >
> > Also we have a QueryEntity (age, salary)
> >
> > Why we can not take key fields names from QueryEntity in the given order
> > (age, salary) and get values from either "regular" Class or from
> > Binarylizable and calculate hash code?
> >
> > Sergi
> >
> > 2017-04-10 19:56 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
> >
> > > Guys,
> > >
> > > The problem is that order of fields serialization is unknown for
> regular
> > > objects, where "regular" stands for non-Binarilyzable class. Reflection
> > > returns fields in unpredictable order. For this reason you cannot match
> > > fields order between class and QueryEntity. This is why we introduced
> > > sorting, and this is why idea to rely on QueryEntity doesn't work.
> > >
> > > On Mon, Apr 10, 2017 at 7:01 PM, Sergi Vladykin <
> > sergi.vlady...@gmail.com>
> > > wrote:
> > >
> > > > Why "regular" are different here?
> > > >
> > > > Sergi
> > > >
> > > > 2017-04-10 18:59 GMT+03:00 Pavel Tupitsyn <ptupit...@gridgain.com>:
> > > >
> > > > > QueryEntity sorting is not an option for "regular" classes with
> > > > reflective
> > > > > serialization.
> > > > > We have to use alphabetical.
> > > > > Also, same class can participate in multiple query entities.
> > > > >
> > > > > 10 апр. 2017 г. 18:52 пользователь "Dmitriy Setrakyan" <
> > > > > dsetrak...@apache.org> написал:
> > > > >
> > > > > On Mon, Apr 10, 2017 at 8:28 AM, Sergi Vladykin <
> > > > sergi.vlady...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > The decision to use alphabetic order looks strange here. Using
> > order
> > > > > > provided in QueryEntity and require from user to have the same
> > order
> > > > > > in Binarylizable
> > > > > > looks more reasonable.
> > > > > >
> > > > >
> > > > > I think this would be much harder to verify. Alphabetical order is
> > more
> > > > > intuitive, no?
> > > > >
> > > >
> > >
> >
>


Re: AffinityKeyMapper: break compatibility before 2.0

2017-04-11 Thread Sergi Vladykin
+1 to Vladimir.

Sergi

2017-04-11 10:48 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:

> Dima,
>
> The whole idea of AffinityKeyMapper appears to be wrong since we will have
> only BinaryMarshaller. We do not have classes on server, how can we rely on
> interface this class extends? I think we should do the following:
> 1) Allow @AffinityKeyMapped annotation on fields only (it doesn't work on
> methods with binary anyway).
> 2) Drop AffinityKeyMapper completely.
> 3) Hopefully, at some point we will implement old Yakov's idea of
> declarative extensions to binary objects, which will handle "affintiy key",
> "equals/hashCode" and "compareTo" cases without necessity to have any
> interface implementation classes on the server.
>
> Thoughts?
>
> On Tue, Apr 11, 2017 at 10:41 AM, Dmitriy Setrakyan <dsetrak...@apache.org
> >
> wrote:
>
> > I agree that this interface is problematic. However, I don't think that
> > dropping it right away would be fair to our users. Can we deprecate it
> and
> > print out a warning that AffinityKeyMapper cannot be used with DDL
> > statements?
> >
> > D.
> >
> > On Tue, Apr 11, 2017 at 12:32 AM, Sergi Vladykin <
> sergi.vlady...@gmail.com
> > >
> > wrote:
> >
> > > Guys,
> > >
> > > We are moving in direction of better distributed SQL support, it means
> > that
> > > we always will need to know an affinity field name for key type.
> > >
> > > Now we have AffinityKeyMapper which hides it from us.
> > >
> > > Since we have other means of configuring the affinity key, e.g
> > > CacheKeyConfiguration and @AffinityKeyMapped, I suggest to either drop
> > this
> > > interface or change it so that it just return affinity key field name
> > > instead of the affinity key field value. I prefer dropping it.
> > >
> > > What do you think?
> > >
> > > Sergi
> > >
> >
>


AffinityKeyMapper: break compatibility before 2.0

2017-04-11 Thread Sergi Vladykin
Guys,

We are moving in the direction of better distributed SQL support, which means
we will always need to know the affinity field name for a key type.

Now we have AffinityKeyMapper, which hides it from us.

Since we have other means of configuring the affinity key, e.g.
CacheKeyConfiguration and @AffinityKeyMapped, I suggest we either drop this
interface or change it so that it just returns the affinity key field name
instead of the affinity key field value. I prefer dropping it.

What do you think?

Sergi
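[Editor's note] A self-contained sketch of the declarative alternative (the
annotation and lookup below are illustrative stand-ins, not Ignite's real
classes): with a field-level annotation, the affinity field *name* is
statically discoverable from the key type, which is what SQL needs, whereas a
mapper interface can only map concrete key instances.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Toy stand-ins for Ignite's @AffinityKeyMapped-based configuration.
public class AffinityFieldLookup {
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface AffinityKeyMapped {}

    static class PersonKey {
        int id;
        @AffinityKeyMapped int orgId; // collocate persons by organization
    }

    /** Returns the name of the annotated affinity field, or null if none. */
    static String affinityFieldName(Class<?> keyType) {
        for (Field f : keyType.getDeclaredFields())
            if (f.isAnnotationPresent(AffinityKeyMapped.class))
                return f.getName();
        return null;
    }

    public static void main(String[] args) {
        // The field name is known from the class alone, no key instance needed.
        System.out.println(affinityFieldName(PersonKey.class)); // orgId
    }
}
```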


Re: Sorting fields of Binarilyzable objects on write

2017-04-10 Thread Sergi Vladykin
I'm sorry, it looks like I do not understand this stuff really well, but it is
still not clear to me why we wouldn't just take the order of key fields given
in QueryEntity and use it for both cases, irrespective of the order of fields
in a regular Class or in a Binarylizable?

I mean lets say we have a Class (with unpredictable order of fields
according to reflection):

    class Person implements Serializable {
        int age;
        double salary;
    }

Or we have a Binarylizable (with some unknown user-defined order of fields):

    class Person implements Binarylizable {
        void writeBinary(BinaryWriter w) {
            w.writeDouble("salary", salary);
            w.writeInt("age", age);
        }
    }

Also we have a QueryEntity (age, salary)

Why can we not take the key field names from QueryEntity in the given order
(age, salary), get the values from either a "regular" Class or a
Binarylizable, and calculate the hash code?

Sergi
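[Editor's note] The order-dependence discussed in this thread can be shown in
a self-contained sketch (the string format below is purely illustrative, NOT
Ignite's real binary layout): the serialized form of an object depends on the
order its fields are written, so DML-built keys and user-built Binarylizable
keys must agree on one canonical order, e.g. alphabetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class FieldOrder {
    /** Serializes fields in the order the map iterates them. */
    static String serialize(Map<String, Object> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Object> e : fields.entrySet())
            sb.append(e.getKey()).append('=').append(e.getValue()).append(';');
        return sb.toString();
    }

    /** Same, but fields are first sorted alphabetically by name. */
    static String serializeSorted(Map<String, Object> fields) {
        return serialize(new TreeMap<>(fields));
    }

    public static void main(String[] args) {
        // Declaration order of the Person example: salary first, then age.
        Map<String, Object> declOrder = new LinkedHashMap<>();
        declOrder.put("salary", 100.0);
        declOrder.put("age", 42);

        // Alphabetical order, as DML writes fields: age first, then salary.
        Map<String, Object> alphaOrder = new LinkedHashMap<>();
        alphaOrder.put("age", 42);
        alphaOrder.put("salary", 100.0);

        // Raw write order changes the representation (and any hash of it)...
        System.out.println(serialize(declOrder).equals(serialize(alphaOrder)));             // false
        // ...while sorting before the write makes it stable.
        System.out.println(serializeSorted(declOrder).equals(serializeSorted(alphaOrder))); // true
    }
}
```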

2017-04-10 19:56 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:

> Guys,
>
> The problem is that order of fields serialization is unknown for regular
> objects, where "regular" stands for non-Binarilyzable class. Reflection
> returns fields in unpredictable order. For this reason you cannot match
> fields order between class and QueryEntity. This is why we introduced
> sorting, and this is why idea to rely on QueryEntity doesn't work.
>
> On Mon, Apr 10, 2017 at 7:01 PM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > Why "regular" are different here?
> >
> > Sergi
> >
> > 2017-04-10 18:59 GMT+03:00 Pavel Tupitsyn <ptupit...@gridgain.com>:
> >
> > > QueryEntity sorting is not an option for "regular" classes with
> > reflective
> > > serialization.
> > > We have to use alphabetical.
> > > Also, same class can participate in multiple query entities.
> > >
> > > 10 апр. 2017 г. 18:52 пользователь "Dmitriy Setrakyan" <
> > > dsetrak...@apache.org> написал:
> > >
> > > On Mon, Apr 10, 2017 at 8:28 AM, Sergi Vladykin <
> > sergi.vlady...@gmail.com>
> > > wrote:
> > >
> > > > The decision to use alphabetic order looks strange here. Using order
> > > > provided in QueryEntity and require from user to have the same order
> > > > in Binarylizable
> > > > looks more reasonable.
> > > >
> > >
> > > I think this would be much harder to verify. Alphabetical order is more
> > > intuitive, no?
> > >
> >
>


Re: Sorting fields of Binarilyzable objects on write

2017-04-10 Thread Sergi Vladykin
Why are "regular" classes different here?

Sergi

2017-04-10 18:59 GMT+03:00 Pavel Tupitsyn <ptupit...@gridgain.com>:

> QueryEntity sorting is not an option for "regular" classes with reflective
> serialization.
> We have to use alphabetical.
> Also, same class can participate in multiple query entities.
>
> 10 апр. 2017 г. 18:52 пользователь "Dmitriy Setrakyan" <
> dsetrak...@apache.org> написал:
>
> On Mon, Apr 10, 2017 at 8:28 AM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > The decision to use alphabetic order looks strange here. Using order
> > provided in QueryEntity and require from user to have the same order
> > in Binarylizable
> > looks more reasonable.
> >
>
> I think this would be much harder to verify. Alphabetical order is more
> intuitive, no?
>


Re: Sorting fields of Binarilyzable objects on write

2017-04-10 Thread Sergi Vladykin
I even think that in DML we have to just calculate the hashCode in the order
given by QueryEntity, regardless of the order in Binarylizable.

Sergi

2017-04-10 18:29 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:

> Does anyone disagree about having users sort the fields themselves, as
> Sergi and I suggested above?
>
> On Mon, Apr 10, 2017 at 8:21 AM, Pavel Tupitsyn <ptupit...@apache.org>
> wrote:
>
> > Sergi, DML writes fields in alphabetic order and computes hash code
> > accordingly.
> > If user-defined Binarylizable impl uses different order, hash codes will
> be
> > inconsistent.
> >
> > On Mon, Apr 10, 2017 at 6:18 PM, Sergi Vladykin <
> sergi.vlady...@gmail.com>
> > wrote:
> >
> > > What is correct or incorrect ordering for DML?
> > >
> > > Sergi
> > >
> > > 2017-04-10 18:14 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
> > >
> > > > I would agree that it should be on a user to always sort the fields,
> if
> > > we
> > > > make it a part of the contract. However, in this case, we should
> always
> > > > throw exception if user somehow provides fields in the wrong order.
> > > >
> > > > D.
> > > >
> > > > On Mon, Apr 10, 2017 at 8:07 AM, Sergi Vladykin <
> > > sergi.vlady...@gmail.com>
> > > > wrote:
> > > >
> > > > > Could you please elaborate why would we want to sort fields in
> > > > > Binarilyzable
> > > > > classes?
> > > > >
> > > > > If you are taking from stable binary representation perspective,
> > then I
> > > > > think it is a problem of user, but not ours.
> > > > >
> > > > > Sergi
> > > > >
> > > > > 2017-04-10 17:53 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
> > > > >
> > > > > > Zapalniki,
> > > > > >
> > > > > > Inspired by IGNITE-4669 (.NET: Sort binary object fields) [1].
> > > > > >
> > > > > > Currently we sort binary object fields before when writing them
> to
> > > the
> > > > > > output stream in case of standard (Serializable) objects and
> > > > > > BinaryObjectBuilder. This makes sense as we have stable binary
> > object
> > > > > > representation irrespective of fields order, which is very
> > important
> > > > e.g.
> > > > > > for DML. And it works fine from performance perspective as well:
> > > > > > - For standard classes we sort fields only once during
> > > initialization;
> > > > > > - For builder we have to maintain the whole object graph in
> memory
> > > > before
> > > > > > writing anyway as builder is mutable, so sorting doesn't impose
> > > serious
> > > > > > performance hit.
> > > > > >
> > > > > > But what to do with Binarilyzable classes? We can sort their
> fields
> > > as
> > > > > > well, but it means that:
> > > > > > 1) We will not be able to write them directly to stream. Instead,
> > we
> > > > will
> > > > > > accumulate values in memory, and write only when the whole object
> > > graph
> > > > > is
> > > > > > known.
> > > > > > 2) Currently reads are mostly sequential from memory perspective.
> > > With
> > > > > this
> > > > > > change reads will become random.
> > > > > >
> > > > > > So we will loose both read and write serialization performance.
> How
> > > do
> > > > > you
> > > > > > think - do we need this change or not?
> > > > > >
> > > > > > Vladimir.
> > > > > >
> > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-4669
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Sorting fields of Binarilyzable objects on write

2017-04-10 Thread Sergi Vladykin
The decision to use alphabetic order looks strange here. Using order
provided in QueryEntity and require from user to have the same order
in Binarylizable
looks more reasonable.

Sergi

2017-04-10 18:21 GMT+03:00 Pavel Tupitsyn <ptupit...@apache.org>:

> Sergi, DML writes fields in alphabetic order and computes hash code
> accordingly.
> If user-defined Binarylizable impl uses different order, hash codes will be
> inconsistent.
>
> On Mon, Apr 10, 2017 at 6:18 PM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > What is correct or incorrect ordering for DML?
> >
> > Sergi
> >
> > 2017-04-10 18:14 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
> >
> > > I would agree that it should be on a user to always sort the fields, if
> > we
> > > make it a part of the contract. However, in this case, we should always
> > > throw exception if user somehow provides fields in the wrong order.
> > >
> > > D.
> > >
> > > On Mon, Apr 10, 2017 at 8:07 AM, Sergi Vladykin <
> > sergi.vlady...@gmail.com>
> > > wrote:
> > >
> > > > Could you please elaborate why would we want to sort fields in
> > > > Binarilyzable
> > > > classes?
> > > >
> > > > If you are taking from stable binary representation perspective,
> then I
> > > > think it is a problem of user, but not ours.
> > > >
> > > > Sergi
> > > >
> > > > 2017-04-10 17:53 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
> > > >
> > > > > Zapalniki,
> > > > >
> > > > > Inspired by IGNITE-4669 (.NET: Sort binary object fields) [1].
> > > > >
> > > > > Currently we sort binary object fields before when writing them to
> > the
> > > > > output stream in case of standard (Serializable) objects and
> > > > > BinaryObjectBuilder. This makes sense as we have stable binary
> object
> > > > > representation irrespective of fields order, which is very
> important
> > > e.g.
> > > > > for DML. And it works fine from performance perspective as well:
> > > > > - For standard classes we sort fields only once during
> > initialization;
> > > > > - For builder we have to maintain the whole object graph in memory
> > > before
> > > > > writing anyway as builder is mutable, so sorting doesn't impose
> > serious
> > > > > performance hit.
> > > > >
> > > > > But what to do with Binarilyzable classes? We can sort their fields
> > as
> > > > > well, but it means that:
> > > > > 1) We will not be able to write them directly to stream. Instead,
> we
> > > will
> > > > > accumulate values in memory, and write only when the whole object
> > graph
> > > > is
> > > > > known.
> > > > > 2) Currently reads are mostly sequential from memory perspective.
> > With
> > > > this
> > > > > change reads will become random.
> > > > >
> > > > > So we will loose both read and write serialization performance. How
> > do
> > > > you
> > > > > think - do we need this change or not?
> > > > >
> > > > > Vladimir.
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/IGNITE-4669
> > > > >
> > > >
> > >
> >
>


Re: Sorting fields of Binarilyzable objects on write

2017-04-10 Thread Sergi Vladykin
What is correct or incorrect ordering for DML?

Sergi

2017-04-10 18:14 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:

> I would agree that it should be on a user to always sort the fields, if we
> make it a part of the contract. However, in this case, we should always
> throw exception if user somehow provides fields in the wrong order.
>
> D.
>
> On Mon, Apr 10, 2017 at 8:07 AM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > Could you please elaborate why would we want to sort fields in
> > Binarilyzable
> > classes?
> >
> > If you are taking from stable binary representation perspective, then I
> > think it is a problem of user, but not ours.
> >
> > Sergi
> >
> > 2017-04-10 17:53 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
> >
> > > Zapalniki,
> > >
> > > Inspired by IGNITE-4669 (.NET: Sort binary object fields) [1].
> > >
> > > Currently we sort binary object fields before when writing them to the
> > > output stream in case of standard (Serializable) objects and
> > > BinaryObjectBuilder. This makes sense as we have stable binary object
> > > representation irrespective of fields order, which is very important
> e.g.
> > > for DML. And it works fine from performance perspective as well:
> > > - For standard classes we sort fields only once during initialization;
> > > - For builder we have to maintain the whole object graph in memory
> before
> > > writing anyway as builder is mutable, so sorting doesn't impose serious
> > > performance hit.
> > >
> > > But what to do with Binarilyzable classes? We can sort their fields as
> > > well, but it means that:
> > > 1) We will not be able to write them directly to stream. Instead, we
> will
> > > accumulate values in memory, and write only when the whole object graph
> > is
> > > known.
> > > 2) Currently reads are mostly sequential from memory perspective. With
> > this
> > > change reads will become random.
> > >
> > > So we will loose both read and write serialization performance. How do
> > you
> > > think - do we need this change or not?
> > >
> > > Vladimir.
> > >
> > > [1] https://issues.apache.org/jira/browse/IGNITE-4669
> > >
> >
>


Re: Sorting fields of Binarilyzable objects on write

2017-04-10 Thread Sergi Vladykin
Could you please elaborate on why we would want to sort fields in
Binarilyzable classes?

If you are coming from the stable binary representation perspective, then I
think it is the user's problem, not ours.

Sergi

2017-04-10 17:53 GMT+03:00 Vladimir Ozerov :

> Zapalniki,
>
> Inspired by IGNITE-4669 (.NET: Sort binary object fields) [1].
>
> Currently we sort binary object fields before when writing them to the
> output stream in case of standard (Serializable) objects and
> BinaryObjectBuilder. This makes sense as we have stable binary object
> representation irrespective of fields order, which is very important e.g.
> for DML. And it works fine from performance perspective as well:
> - For standard classes we sort fields only once during initialization;
> - For builder we have to maintain the whole object graph in memory before
> writing anyway as builder is mutable, so sorting doesn't impose serious
> performance hit.
>
> But what to do with Binarilyzable classes? We can sort their fields as
> well, but it means that:
> 1) We will not be able to write them directly to stream. Instead, we will
> accumulate values in memory, and write only when the whole object graph is
> known.
> 2) Currently reads are mostly sequential from memory perspective. With this
> change reads will become random.
>
> So we will loose both read and write serialization performance. How do you
> think - do we need this change or not?
>
> Vladimir.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-4669
>


Re: Prohibit stateful affinity (FairAffinityFunction)

2017-04-10 Thread Sergi Vladykin
Absolutely agree, let's get some numbers on RendezvousAffinity with both
variants: useBalancer enabled and disabled. Taras, can you provide them?

Anyway, at the moment we need to make a decision on what will get into 2.0.
I'm for dropping (or hiding) all the suspicious stuff and adding it back once
we fix it. Thus I'm going to move FairAffinity into a private package now.

Sergi
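[Editor's note] A minimal rendezvous (highest-random-weight) hashing sketch,
illustrative only and not Ignite's RendezvousAffinityFunction. It demonstrates
the property debated in this thread: when a node leaves, only partitions that
were assigned to it move; partitions owned by surviving nodes stay put.

```java
import java.util.Arrays;
import java.util.List;

public class RendezvousSketch {
    /** Owner of a partition = node with the highest mixed hash(node, part). */
    static String assign(int part, List<String> nodes) {
        String best = null;
        long bestH = Long.MIN_VALUE;
        for (String n : nodes) {
            long h = (n + "#" + part).hashCode() * 2654435761L; // cheap mixing
            if (h > bestH) { bestH = h; best = n; }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("A", "B", "C");
        List<String> afterLeave = Arrays.asList("A", "B"); // node C left

        int moved = 0;
        for (int p = 0; p < 1024; p++) {
            String before = assign(p, all);
            String after = assign(p, afterLeave);
            if (!before.equals(after)) {
                moved++;
                // A partition changes owner only if its owner was the leaver:
                // removing C cannot change the maximum among A and B.
                if (!"C".equals(before))
                    throw new AssertionError("unexpected move of " + before);
            }
        }
        // Roughly a third of the partitions lived on C and had to move.
        System.out.println("moved: " + moved + " of 1024");
    }
}
```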

2017-04-10 16:55 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:

> Sergi,
>
> AFAIK the only reason why RendezvousAffinity is used by default is that
> behavior on rebalance is no less important than steady state performance,
> especially on large deployments and cloud environments, when nodes
> constantly joins and leaves topology. Let's stop guessing and discuss the
> numbers - how many partitions reassignments happen with new
> RendezvousAffinity flavor? I haven't seen any results so far.
>
> On Mon, Apr 10, 2017 at 4:39 PM, Andrey Gura <ag...@apache.org> wrote:
>
> > Guys,
> >
> > It seems that both mentioned problem have the same root cause: each
> > cache has personal affinity function instance and it leads to
> > perfromance problem (we retry the same calcualtions for each cache)
> > and problem related with fact that FailAffinityFunction is statefull
> > (some co-located cache has different assignment if it was started on
> > different topology).
> >
> > Obvious solution is the same affinity for some cache set. As result
> > all caches from one set will use the same assignment that will be
> > calculated exactly once and will not depend on cache start topology.
> >
> >
> >
> >
> >
> >
> > On Mon, Apr 10, 2017 at 4:05 PM, Sergi Vladykin
> > <sergi.vlady...@gmail.com> wrote:
> > > As for default value for useBalancer flag, I agree with Yakov, it must
> be
> > > enabled by default. Because performance in steady state is usually more
> > > important than performance of rebalancing. For edge cases it can be
> > > disabled.
> > >
> > > Sergi
> > >
> > > 2017-04-10 15:04 GMT+03:00 Sergi Vladykin <sergi.vlady...@gmail.com>:
> > >
> > >> If the RendezvousAffinity with enabled useBalancer is not much worse
> > than
> > >> FairAffinity, I see no reason to keep the latter.
> > >>
> > >> Sergi
> > >>
> > >> 2017-04-10 13:00 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
> > >>
> > >>> Guys,
> > >>>
> > >>> We should not have it enabled by default because as Taras mentioned: "but
> > >>> in this case there is not guarantee that a partition doesn't move from
> > >>> one node to another when node leave topology". Let's avoid any rush here.
> > >>> There
> > >>> is nothing terribly wrong with FairAffinity. It is not enabled by
> > default
> > >>> and at the very least we can always mark it as deprecated. It is
> > better to
> > >>> test rigorously rendezvous affinity first in terms of partition
> > >>> distribution and partition migration and decide whether results are
> > >>> acceptable.
> > >>>
> > >>> On Mon, Apr 10, 2017 at 12:43 PM, Yakov Zhdanov <yzhda...@apache.org
> >
> > >>> wrote:
> > >>>
> > >>> > We should have it enabled by default.
> > >>> >
> > >>> > --Yakov
> > >>> >
> > >>> > 2017-04-10 12:42 GMT+03:00 Sergi Vladykin <
> sergi.vlady...@gmail.com
> > >:
> > >>> >
> > >>> > > Why wouldn't we have useBalancer always enabled?
> > >>> > >
> > >>> > > Sergi
> > >>> > >
> > >>> > > 2017-04-10 12:31 GMT+03:00 Taras Ledkov <tled...@gridgain.com>:
> > >>> > >
> > >>> > > > Folks,
> > >>> > > >
> > >>> > > > I worked on issue https://issues.apache.org/
> > jira/browse/IGNITE-3018
> > >>> > that
> > >>> > > > is related to performance of Rendezvous AF.
> > >>> > > >
> > >>> > > > But the Wang/Jenkins integer hash distribution is worse than MD5.
> > >>> > > > So, I tried to use a simple partition balancer close to the Fair AF
> > >>> > > > for the Rendezvous AF.
> > >>> > >
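
The shared-assignment idea Andrey describes in the quoted discussion above (one assignment calculated exactly once per affinity configuration and topology, reused by all co-located caches) can be sketched roughly like this. All names here (SharedAssignment, cfgId, topVer) are illustrative, not actual Ignite API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Sketch: caches sharing an affinity configuration reuse one partition
// assignment, computed once per (affinity config, topology version).
public class SharedAssignment {
    // (affinity config id + topology version) -> partition -> node index.
    private final Map<String, int[]> cache = new HashMap<>();

    int[] assignment(String cfgId, long topVer, Supplier<int[]> calc) {
        // computeIfAbsent guarantees the expensive calculation runs once per
        // key; every co-located cache then sees the very same assignment.
        return cache.computeIfAbsent(cfgId + '#' + topVer, k -> calc.get());
    }

    public static void main(String[] args) {
        SharedAssignment shared = new SharedAssignment();
        Supplier<int[]> calc = () -> new int[] {0, 1, 2, 0, 1, 2}; // 6 parts, 3 nodes

        int[] a = shared.assignment("rendezvous-1024", 7, calc);
        int[] b = shared.assignment("rendezvous-1024", 7, calc);

        // Same config and topology -> literally the same assignment instance,
        // so co-located caches cannot diverge and collocation is preserved.
        System.out.println(a == b); // prints true
    }
}
```

Because the map key includes the topology version, the cached assignment naturally invalidates itself when nodes join or leave.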

Re: Prohibit stateful affinity (FairAffinityFunction)

2017-04-10 Thread Sergi Vladykin
As for default value for useBalancer flag, I agree with Yakov, it must be
enabled by default. Because performance in steady state is usually more
important than performance of rebalancing. For edge cases it can be
disabled.

Sergi



Re: Prohibit stateful affinity (FairAffinityFunction)

2017-04-10 Thread Sergi Vladykin
If the RendezvousAffinity with enabled useBalancer is not much worse than
FairAffinity, I see no reason to keep the latter.

Sergi



Re: Prohibit stateful affinity (FairAffinityFunction)

2017-04-10 Thread Sergi Vladykin
Why wouldn't we have useBalancer always enabled?

Sergi

2017-04-10 12:31 GMT+03:00 Taras Ledkov <tled...@gridgain.com>:

> Folks,
>
> I worked on issue https://issues.apache.org/jira/browse/IGNITE-3018 that
> is related to performance of Rendezvous AF.
>
> But the Wang/Jenkins integer hash distribution is worse than MD5. So, I
> tried to use a simple partition balancer, close to the Fair AF, for the
> Rendezvous AF.
>
> Take a look at the heatmaps of the distributions in the issue, e.g.:
> - Comparison of the current Rendezvous AF and the new Rendezvous AF based on
> the Wang/Jenkins hash:
> https://issues.apache.org/jira/secure/attachment/12858701/004.png
> - Comparison of the current Rendezvous AF and the new Rendezvous AF based on
> the Wang/Jenkins hash with the partition balancer:
> https://issues.apache.org/jira/secure/attachment/12858690/balanced.004.png
>
> When the balancer is enabled, the distribution of partitions across nodes is
> close to even, but in this case there is no guarantee that a partition
> doesn't move from one node to another when a node leaves the topology.
> It is not guaranteed, but we try to minimize it, because a sorted array of
> nodes is used (as for the pure Rendezvous AF).
>
> I think we can use the new fast Rendezvous AF with the 'useBalancer' flag
> instead of the Fair AF.
>
> On 09.04.2017 14:12, Valentin Kulichenko wrote:
>
>> What is the replacement for FairAffinityFunction?
>>
>> Generally I agree. If FairAffinityFunction can't be changed to provide
>> consistent mapping, it should be dropped.
>>
>> -Val
>>
>> On Sun, Apr 9, 2017 at 3:50 AM, Sergi Vladykin <sergi.vlady...@gmail.com
>> <mailto:sergi.vlady...@gmail.com>> wrote:
>>
>> Guys,
>>
>> It appeared that our FairAffinityFunction can assign the same
>> partitions to
>> different nodes for different caches.
>>
>> It basically means that there is no collocation between the caches
>> at all
>> even if they have the same affinity.
>>
>> As a result all SQL joins will not work (even collocated ones), other
>> operations that rely on cache collocation will be either broken or
>> work
>> slower, than expected.
>>
>> All this stuff is really non-obvious. And I see no reason why we
>> should
>> allow that. I suggest to prohibit this behavior and drop
>> FairAffinityFunction before 2.0. We have to clearly document that
>> the same
>> affinity function must provide the same partition assignments for
>> all the
>> caches.
>>
>> Also I know that Taras Ledkov was working on a decent stateless
>> replacement
>> for FairAffinity, so we should not lose anything here.
>>
>> Thoughts?
>>
>> Sergi
>>
>>
>>
> --
> Taras Ledkov
> Mail-To: tled...@gridgain.com
>
>
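
The Rendezvous AF behavior Taras describes above (per-partition node ranking that moves almost nothing when a node leaves) can be sketched as follows. The mixing function is the public-domain Wang/Jenkins 32-bit integer hash; the class and method names, and the exact way the partition and node hashes are combined, are assumptions for illustration, not the actual Ignite implementation:

```java
// Minimal sketch of rendezvous (highest-random-weight) partition assignment.
public class RendezvousSketch {
    // Wang/Jenkins 32-bit integer hash mix.
    static int mixHash(int key) {
        key = ~key + (key << 15);
        key ^= key >>> 12;
        key += key << 2;
        key ^= key >>> 4;
        key *= 2057;
        key ^= key >>> 16;
        return key;
    }

    // Each partition goes to the node with the highest combined hash score.
    static int assignPartition(int part, int[] nodeIds) {
        int best = -1;
        long bestScore = -1;
        for (int node : nodeIds) {
            // Mask to compare the 32-bit hash as an unsigned value.
            long score = mixHash(mixHash(part) ^ node) & 0xFFFFFFFFL;
            if (best < 0 || score > bestScore) {
                bestScore = score;
                best = node;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] nodes = {101, 102, 103};
        int[] fewer = {101, 103}; // node 102 left the topology
        int moved = 0;
        for (int p = 0; p < 1024; p++) {
            int before = assignPartition(p, nodes);
            int after = assignPartition(p, fewer);
            if (before != 102 && before != after)
                moved++; // only partitions owned by the departed node may move
        }
        System.out.println("moved=" + moved); // prints moved=0
    }
}
```

Since each (partition, node) score is independent of the other nodes, removing a node never changes the ranking among the survivors: exactly the departed node's partitions are reassigned. A balancer that caps per-node partition counts trades away part of this guarantee for evenness, which is the trade-off discussed in the thread.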


null Cache names

2017-04-10 Thread Sergi Vladykin
Who can pick up this one? We should do it before 2.0.

https://issues.apache.org/jira/browse/IGNITE-3488

Sergi


Re: IGNITE-4878 ready for review

2017-04-10 Thread Sergi Vladykin
Michael,

This code is already changed in my branch for lazy SQL, so I will not merge
your PR.
Thanks for pointing out the issue anyway!

Sergi

2017-04-07 20:00 GMT+03:00 michael.griggs :

> This change is https://issues.apache.org/jira/browse/IGNITE-4878
>
> IgniteH2Indexing can throw java.util.ConcurrentModificationException
>
> Please can you review this (one-line) change.  Thanks!
>
>
>
> --
> View this message in context: http://apache-ignite-
> developers.2346864.n4.nabble.com/IGNITE-4878-ready-for-
> review-tp16007p16284.html
> Sent from the Apache Ignite Developers mailing list archive at Nabble.com.
>


Re: Stable binary key representation

2017-04-09 Thread Sergi Vladykin
I guess Resolvers were added to DML just because they already existed since
1.9 and we were forced to support them in all the parts of our product.

We have to stop this practice of adding features without clear real-life use
cases for them.

Sergi

2017-04-09 17:00 GMT+03:00 Denis Magda <dma...@gridgain.com>:

> Sergi, Vovan,
>
> Sorry for being annoying but I still didn't get an answer on whether the
> resolvers are a must for DML. The main reason why we introduced them some
> time ago is to support specific DML use cases. However I can't recall the
> use cases.
>
> --
> Denis
>
> On Sun, Apr 9, 2017 at 6:54 AM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > Ok, we need to do 2 things here:
> >
> > 1. Drop the resolvers from the source code.
> > 2. Write a good page in docs on "What makes a correct cache key".
> >
> > Who can do that?
> >
> > Sergi
> >
> > 2017-04-07 9:48 GMT+03:00 Sergi Vladykin <sergi.vlady...@gmail.com>:
> >
> > > It is possible to try adding support of comparison to Resolvers, but
> the
> > > whole approach looks wrong and for now it is better to get rid of it
> > while
> > > we have a chance to break compatibility.
> > >
> > > Sergi
> > >
> > > 2017-04-07 9:19 GMT+03:00 Valentin Kulichenko <
> > > valentin.kuliche...@gmail.com>:
> > >
> > >> The discussion should've been started with that :) If supporting
> > resolvers
> > >> in new architecture is not possible or means too big effort, then it's
> > >> definitely not worth it.
> > >>
> > >> -Val
> > >>
> > >> On Thu, Apr 6, 2017 at 8:52 PM, Vladimir Ozerov <voze...@gridgain.com
> >
> > >> wrote:
> > >>
> > >> > Dima,
> > >> >
> > >> > Yes, they may explode some internals of our indexes.
> > >> >
> > >> > On Apr 6, 2017 at 23:32, "Dmitriy Setrakyan" <
> > >> > dsetrak...@apache.org> wrote:
> > >> >
> > >> > > Guys,
> > >> > >
> > >> > > Isn't the main issue here that we cannot use the Identity
> Resolvers
> > in
> > >> > > BTrees in the 2.0 version? If yes, then we have to remove them no
> > >> matter
> > >> > > what.
> > >> > >
> > >> > > D.
> > >> > >
> > >> > > On Thu, Apr 6, 2017 at 1:23 PM, Sergi Vladykin <
> > >> sergi.vlady...@gmail.com
> > >> > >
> > >> > > wrote:
> > >> > >
> > >> > > > Binary key representation is stable when we always have equal
> > >> > serialized
> > >> > > > bytes when the original keys are equal.
> > >> > > >
> > >> > > > Resolver allows you to have some extra info in the Key, and equal
> > >> > > > Keys will be serialized into different bytes, which is wrong.
> > >> > > >
> > >> > > > Look at the example what you can do with resolvers:
> > >> > > >
> > >> > > > We may have some data entry with fields a, b, c. Let's say the
> > >> > > > unique part here is `a` and it is the only field used in the Key's
> > >> > > > equals() and hashCode().
> > >> > > > Still we may have the following layouts:
> > >> > > >
> > >> > > > 1. Ka -> Vbc
> > >> > > > 2. Kab -> Vc
> > >> > > > 3. Kabc -> Boolean.TRUE
> > >> > > >
> > >> > > > Only layout 1 is correct; the others are plain wrong variants (but
> > >> > > > they are still possible with Resolvers), because everything that
> > >> > > > does not make the Key unique must be in the Value.
> > >> > > >
> > >> > > > We want to clearly state that if you have something in Key that is
> > >> > > > not part of equals(), then the Key is invalid and that stuff must be
> > >> > > > in Value.
> > >> > > > This allows us to rely on the binary representation of a Key to be
> > >> > > > stable and
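
Sergi's layout example above (Ka -> Vbc versus the broken Kab -> Vc and Kabc -> Boolean.TRUE variants) can be sketched in plain Java. The class names are made up for illustration; the point is that the key carries only the unique part `a`, fully covered by equals()/hashCode(), so equal keys have no extra state that could serialize differently:

```java
// Sketch of the correct "Ka -> Vbc" layout: only the unique field lives in
// the key; everything else belongs to the value.
public class StableKeyLayout {
    static final class Key {
        final int a; // the unique part, and the ONLY state of the key

        Key(int a) { this.a = a; }

        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).a == a;
        }

        @Override public int hashCode() { return Integer.hashCode(a); }
    }

    static final class Value {
        final String b;
        final String c;

        Value(String b, String c) { this.b = b; this.c = c; }
    }

    public static void main(String[] args) {
        // A "Kab -> Vc" layout would put b into the key without covering it in
        // equals(), so two equal keys could serialize to different bytes.
        // Here, equal keys carry identical state by construction.
        System.out.println(new Key(42).equals(new Key(42))); // prints true
    }
}
```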
