Re: IEP-54: Schema-first approach for 3.0

2021-06-04 Thread Andrey Mashenkov
Hi Igniters,

Alex Scherbakov and I had a discussion on how we could write rows in a more
compact way.
Many thanks to Alex for his ideas and critique.
So, in the long read below I want to share some thoughts.

Motivation.

In Ignite 3.0 we will have a versioned schema, and most of the meta info will
be stored in the schema.
This approach gives less overhead on row size compared to BinaryObject
in Ignite 2.0,
but I see we can still save up to 5-25% in some use cases that look
promising.

Apparently, there is always a trade-off between row footprint, field access
performance, and code complexity.
We won't fight to the death for every single byte, but we do want a relatively low
row footprint overhead,
and we can still have different formats/techniques for writing compact meta
(sizes, offsets ...) to cover common use cases.

Description.

The performance of Ignite table/index operations (and not only them) correlates
with the key size.
Because of this, we recommend keeping keys as small as possible, and
small keys like long, UUID, or short strings are widely used.

Value size may differ and depends on the use case. AFAIK some users need
MB+ sized values.
I don't know whether corner cases such as 100+ short varlen columns, huge
values, or keys > 64 KB take place in production,
or whether they are relevant to Ignite goals and the target audience.

Below I use the term 'chunk' meaning a key or value byte sequence.

Points and techniques to save a few bytes:
* Chunk size.
If the key is a single long value (8 bytes), then using a single byte instead of a
short for the chunk size may halve the overhead (25% -> 12.5%).

* Vartable item size (varlen column offset or varlen column size).
Here we can save a noticeable number of bytes if the user has many short
varlen columns.
E.g. for 10 short strings of 10 bytes (100 bytes in total), 1-byte items
instead of 2-byte items save 10 bytes, i.e. 10%.
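
To make the numbers in the two points above concrete, here is a minimal sketch that only counts bytes. The class and method names are made up for illustration, and any other header fields of the real row layout are ignored.

```java
/** Illustrative only: compares meta bytes with payload bytes for a chunk. */
final class FootprintSketch {
    /** Overhead of meta bytes relative to the payload, in percent. */
    static double overheadPercent(int payloadBytes, int metaBytes) {
        return 100.0 * metaBytes / payloadBytes;
    }

    /** Bytes taken by a vartable that has one item per varlen column. */
    static int vartableBytes(int varlenColumns, int itemSizeBytes) {
        return varlenColumns * itemSizeBytes;
    }

    public static void main(String[] args) {
        // Key is a single long (8 bytes): 2-byte vs 1-byte chunk size field.
        System.out.println(overheadPercent(8, 2)); // 25.0
        System.out.println(overheadPercent(8, 1)); // 12.5

        // 10 strings of 10 bytes each: short items vs byte items.
        int saved = vartableBytes(10, 2) - vartableBytes(10, 1);
        System.out.println(overheadPercent(100, saved)); // 10.0
    }
}
```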

* Using varlen column sizes instead of offsets.
We can use items of type 'byte' even if the total chunk size does not fit into a byte.
E.g. if the user has 30 strings of 10 chars each (300 bytes in total), then
using 'sizes' of byte (instead of short) saves 10%.
This makes column offset calculation more complex (up to linear in the number
of varlen columns), but
I think we shouldn't bother about the performance impact here,
because CPU is cheap in this case: vartable items reside locally, in most cases
(32-64 items) the varlen column sizes fit into one cache line,
and the calculation can be effectively vectorized.
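
The offset calculation from sizes can be sketched as a prefix sum. This is only an illustration under the assumption that sizes are stored as unsigned bytes, one per varlen column; `VarlenSizesSketch` and `offsetOf` are hypothetical names, not part of the Ignite code.

```java
import java.util.Arrays;

/** Illustrative only: derives a varlen column offset from per-column sizes. */
final class VarlenSizesSketch {
    /**
     * Returns the offset of varlen column {@code idx} inside the varlen data area,
     * given per-column sizes stored as unsigned bytes.
     */
    static int offsetOf(byte[] sizes, int idx) {
        int off = 0;
        for (int i = 0; i < idx; i++)
            off += sizes[i] & 0xFF; // treat the byte as unsigned (0..255)
        return off;
    }

    public static void main(String[] args) {
        byte[] sizes = new byte[30];
        Arrays.fill(sizes, (byte) 10); // 30 strings, 10 bytes each

        System.out.println(offsetOf(sizes, 0));  // 0
        System.out.println(offsetOf(sizes, 29)); // 290
    }
}
```

The last offset (290) does not fit into a byte, while every individual size does, which is where the saving comes from.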

* Use 'varInt' format for sizes.
In short: the VarInt format uses the sign bit of each byte as a flag that tells
whether the value spans more bytes or not.
So, a non-negative byte holds the value itself (or its last part). A negative
byte means the value spans more bytes: we have to drop the sign bit and
concatenate the remaining 7 bits with the next byte.
Thus, if a number fits a smaller type, we can use fewer bytes to store
it.
Total chunk size calculation may be a bit tricky.
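
A minimal sketch of such an encoding is shown below. The byte order (low-order 7-bit group first) is an assumption, since the text above does not fix it, and `VarIntSketch` is a hypothetical name. The `encodedLength` helper shows what has to be known up front to compute the total chunk size.

```java
import java.io.ByteArrayOutputStream;

/** Illustrative only: 7 bits of payload per byte, sign bit as "more bytes follow". */
final class VarIntSketch {
    /** Encodes a non-negative int, low-order 7-bit group first. */
    static byte[] encode(int value) {
        if (value < 0)
            throw new IllegalArgumentException("negative: " + value);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // sign bit set: more bytes follow
            value >>>= 7;
        }
        out.write(value); // sign bit clear: last byte
        return out.toByteArray();
    }

    /** Decodes a value written by encode(), starting at position pos. */
    static int decode(byte[] buf, int pos) {
        int result = 0;
        int shift = 0;
        byte b;
        do {
            b = buf[pos++];
            result |= (b & 0x7F) << shift;
            shift += 7;
        } while (b < 0); // a negative byte means the value continues
        return result;
    }

    /** Number of bytes encode() produces for a non-negative value. */
    static int encodedLength(int value) {
        int len = 1;
        while ((value >>>= 7) != 0)
            len++;
        return len;
    }

    public static void main(String[] args) {
        System.out.println(encode(127).length);    // 1
        System.out.println(encode(300).length);    // 2
        System.out.println(decode(encode(300), 0)); // 300
    }
}
```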

* String size precalculation.
The problem is that we need to analyze characters to estimate the string size
before starting key/value serialization.
We can limit this analysis to strings of moderate length, though: e.g. check
symbol-by-symbol only strings of 64-255 chars, as char[63] will always fit
byte[255] and char[256] will never fit byte[255].
(With the varInt format, the 32-127 bounds can be used.)
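
The check above could look roughly like this. It is a sketch that assumes strings are serialized as UTF-8 and uses the 63/255 bounds from the text; `StringSizeSketch` and its methods are hypothetical names.

```java
/** Illustrative only: decides whether a string's UTF-8 size fits a 1-byte length. */
final class StringSizeSketch {
    static final int MAX_BYTES = 255;

    /** Returns true if the UTF-8 encoding of s fits into 255 bytes. */
    static boolean fitsInByteSize(String s) {
        int len = s.length();
        if (len <= 63)        // bound from the text: fits even at 4 bytes per char
            return true;
        if (len > MAX_BYTES)  // every char takes at least 1 byte
            return false;
        return utf8Length(s) <= MAX_BYTES; // 64..255 chars: inspect the characters
    }

    /** UTF-8 length in bytes; an unpaired surrogate is counted as 3 bytes (an over-estimate). */
    static int utf8Length(String s) {
        int bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                bytes += 1;
            } else if (c < 0x800) {
                bytes += 2;
            } else if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                bytes += 4; // a surrogate pair encodes one code point in 4 bytes
                i++;
            } else {
                bytes += 3;
            }
        }
        return bytes;
    }
}
```

Only strings in the middle range pay for the character scan; short and very long strings are classified by length alone.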

There are other more restrictive ways:

* Varlen table (vartable) size of byte.
Does one need more than 255 varlen columns? E.g. Oracle has a limit of 1000
total columns.
Actually, the impact is low enough, we can save a byte per-varlen column.
Moreover, we already have optimization to skip the first varlen offset (or
last varlen length).
So, we will not write a vartable for a single varlen column in a chunk.

* Restrict varlen sizes to 64 KB and introduce a BLOB type for varlen values >
64 KB.
This allows excluding cases with items of 'int' in the vartable and therefore
reduces the number of flags, chunk reader/writer implementations, and
overall code complexity.
I'd suggest discussing the BLOB type in a separate thread and implementing it
separately.
In short, we can store a BLOB 'uuid' in a row instead, and store the BLOB content
in separate storage. RowAssembler can write row bytes and pairs
('uuid','content') separately to different arrays. The transport protocol
should be aware of BLOBs.

* Alex's idea. We can have 2 varlen tables: one for small varlens (length < 255
bytes) and one for large varlens (length < 64k), with byte and short offsets
correspondingly.
It is assumed varlen columns are sorted by their types (e.g. shorter first).
This can be effective if the user has a number of small varlens and a
larger one; otherwise the larger one would force us to use longer vartable items.
The drawback is that the user must define a max-length constraint for varlen
columns at the schema declaration step to turn on the optimization for columns of
short types.
Because column order is defined in the schema, we can't re-sort columns
for a row at runtime and apply the optimization to short values of long-type
columns.
E.g. if the user defines a VARCHAR(1024) column in a schema and passes a short value
of 10 chars, we can't use the first vartable item for that string as a second

Re: IEP-54: Schema-first approach for 3.0

2021-05-26 Thread StephanieSy
That's the same case for me! I've just downgraded my TypeScript version and
everything started working. How did you notice that the TypeScript version
was the problem?





Re: IEP-54: Schema-first approach for 3.0

2021-03-17 Thread Pavel Tupitsyn
I see, thanks.

Let's discuss the return type - Future is not the one to use.
We should return CompletionStage, CompletableFuture, or introduce our own
interface.
We agreed on the last one (custom interface) for thin clients:
http://apache-ignite-developers.2346864.n4.nabble.com/IEP-51-Java-Thin-Client-Async-API-td48900.html

I believe that for Ignite 3.0 we should have the following:
public interface IgniteFuture<T> extends Future<T>, CompletionStage<T> {
    // No-op.
}
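
For illustration, one way such an interface could be backed is by CompletableFuture, which already implements both parent interfaces. This is only a sketch under the assumption that the interface is generic; `IgniteFutureImpl` and `IgniteFutureDemo` are hypothetical names, not classes from the PR.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.Future;

/** The proposed marker interface, written with an explicit type parameter. */
interface IgniteFuture<T> extends Future<T>, CompletionStage<T> {
    // No-op.
}

/** Hypothetical implementation: CompletableFuture supplies all inherited methods. */
final class IgniteFutureImpl<T> extends CompletableFuture<T> implements IgniteFuture<T> {
}

final class IgniteFutureDemo {
    public static void main(String[] args) throws Exception {
        IgniteFutureImpl<String> fut = new IgniteFutureImpl<>();

        // Usable as a CompletionStage (chaining) ...
        CompletionStage<Integer> len = fut.thenApply(String::length);

        fut.complete("ignite");

        // ... and as a plain Future (blocking get).
        System.out.println(fut.get());                       // ignite
        System.out.println(len.toCompletableFuture().get()); // 6
    }
}
```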

Thoughts?


On Wed, Mar 17, 2021 at 11:16 AM Andrey Mashenkov <
andrey.mashen...@gmail.com> wrote:

> Pavel,
> There are 2 PRs for the ticket [1] with two different APIs suggested.
> Please take a look at PR [2].
>
> [1] https://issues.apache.org/jira/browse/IGNITE-14035
> [2] https://github.com/apache/ignite-3/pull/69

Re: IEP-54: Schema-first approach for 3.0

2021-03-17 Thread Andrey Mashenkov
Pavel,
There are 2 PRs for the ticket [1] with two different APIs suggested.
Please take a look at PR [2].

[1] https://issues.apache.org/jira/browse/IGNITE-14035
[2] https://github.com/apache/ignite-3/pull/69

On Wed, Mar 17, 2021 at 11:11 AM Pavel Tupitsyn wrote:

> Andrey, I can't find any async methods,
> can you please check if the changes are pushed?

Re: IEP-54: Schema-first approach for 3.0

2021-03-17 Thread Pavel Tupitsyn
Andrey, I can't find any async methods,
can you please check if the changes are pushed?

On Tue, Mar 16, 2021 at 10:06 PM Andrey Mashenkov <
andrey.mashen...@gmail.com> wrote:

> Pavel, good point.
> Thanks. I've added async methods.

Re: IEP-54: Schema-first approach for 3.0

2021-03-16 Thread Andrey Mashenkov
Pavel, good point.
Thanks. I've added async methods.

On Fri, Mar 12, 2021 at 2:29 PM Pavel Tupitsyn wrote:

> Andrey,
>
> What about corresponding async APIs, do we add them now or later?

Re: IEP-54: Schema-first approach for 3.0

2021-03-12 Thread Pavel Tupitsyn
Andrey,

What about corresponding async APIs, do we add them now or later?

On Thu, Mar 11, 2021 at 8:11 PM Andrey Mashenkov wrote:

> Hi Igniters.
>
> I've created a PR for Table access API [1].
> This is an initial version. So, any suggestions\objections are welcomed.
> Please, do not hesitate to write your comments and\or examples to the PR.
>
> Ignite-api module contains API classes, e.g. TableView classes as
> projections for a table for different purposes.
> Ignite-table contains dummy implementation and Example class explained how
> it is supposed to be used.
>
>
> Also, I'm still waiting for any feedback for Schema configuration public
> API PR [2].
>
> [1] https://github.com/apache/ignite-3/pull/33
> [2] https://github.com/apache/ignite-3/pull/2
>
> On Wed, Jan 20, 2021 at 6:05 PM Andrey Mashenkov <
> andrey.mashen...@gmail.com>
> wrote:
>
> >
> > I've updated a PR regarding your feedback [1].
> >
> > [1] https://github.com/apache/ignite-3/pull/2
> >
> > On Mon, Jan 11, 2021 at 10:58 AM Alexey Goncharuk <
> > alexey.goncha...@gmail.com> wrote:
> >
> >> Folks,
> >>
> >> I updated the IEP to contain the missing pieces; actually, most of the
> >> questions here were covered by the text. Please let me know if there is
> >> something still missing or unclear.
> >>
> >> On Thu, Dec 31, 2020 at 12:48, Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
> >>
> >> > Mikhail and Igniters,
> >> >
> >> > Thanks for your comments. The questions are reasonable, though I think all
> >> > concerns are addressed by the IEP as Val mentioned. I will update the
> >> > document according to your questions in the following week or so, so we can
> >> > have a constructive discussion further.
> >> >
> >> > On Wed, Dec 30, 2020 at 11:45, Michael Cherkasov <michael.cherka...@gmail.com> wrote:
> >> >
> >> >> Hi Val, Andrey,
> >> >>
> >> >> thank you for clarifying.
> >> >>
> >> >> I still have a few comments.
> >> >>
> >> >> 1. one table == one schema. KV vs SQL:
> >> >> Looks like all agreed that KV is just a special case of a regular table
> >> >> with (blob,blob) schema.
> >> >> I worry about the case when the user starts from the KV case and later tries
> >> >> to expand it and leverage SQL for the existing KV table: they won't be
> >> >> able to do so and will have to reload the data, which isn't convenient and
> >> >> sometimes not even possible. Is it possible to extract a new field from
> >> >> a (blob, blob) schema and apply an index on it?
> >> >>
> >> >> 2. Could you please also list all ways of schema definition in the IEP? It's
> >> >> a significant change and, I bet, the main point of this IEP. Everyone hates
> >> >> QueryEntities: they are difficult to manage and, in general, it's very
> >> >> confusing to have a data model (schemas) and node/cluster configuration in
> >> >> one place.
> >> >>
> >> >> So there will be SchemaBuilder and SQL to define schemas, but Andrey also
> >> >> mentioned annotations.
> >> >>
> >> >> I'm personally against configuration via annotations: while it's convenient
> >> >> for development, it's difficult to manage because different classes can be
> >> >> deployed on different client/server nodes, and it can lead to
> >> >> unpredictable results.
> >> >>
> >> >> 3. The IEP doesn't mention field type changes, only drop/add fields. Field type
> >> >> changes are extremely painful right now (if even possible), so it would be
> >> >> nice if some scenarios were supported (like int8->int16, or
> >> >> int8->String).
> >> >>
> >> >> 4. Got it, I thought the IEP would have more details about the implementation.
> >> >> I've seen Andrey even sent benchmark results for a new serialization; I will
> >> >> ping him about this.
> >> >>
> >> >> 5. Thanks for the clarification. I had a wrong understanding of strict
> >> >> mode.
> >> >>
> >> >>
> >> >> On Tue, Dec 29, 2020 at 19:32, Valentin Kulichenko <valentin.kuliche...@gmail.com> wrote:
> >> >>
> >> >> > Hi Mike,
> >> >> >
> >> >> > Thanks for providing your feedback. Please see my comments below.
> >> >> >
> >> >> > I would also encourage you to go through the IEP-54 [1] - it has a lot of
> >> >> > detail on the topic.
> >> >> >
> >> >> > [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
> >> >> >
> >> >> > -Val
> >> >> >
> >> >> > On Mon, Dec 28, 2020 at 11:22 PM Michael Cherkasov <
> >> >> > michael.cherka...@gmail.com> wrote:
> >> >> >
> >> >> > > Hi all,
> >> >> > >
> >> >> > > I reviewed the mail thread and proposal page and I still don't fully
> >> >> > > understand what is going to be changed. I would really appreciate it if you
> >> >> > > could answer a few questions:
> >> >> > >
> >> >> > > 1. Are you going to leave only one schema per cache? If so, will there be
> >> >> > > an option to have a table with arbitrary objects(pure KV 

Re: IEP-54: Schema-first approach for 3.0

2021-03-11 Thread Andrey Mashenkov
Hi Igniters.

I've created a PR for the Table access API [1].
This is an initial version, so any suggestions/objections are welcome.
Please do not hesitate to add your comments and/or examples to the PR.

Ignite-api module contains API classes, e.g. TableView classes as
projections of a table for different purposes.
Ignite-table contains a dummy implementation and an Example class that explains how
it is supposed to be used.


Also, I'm still waiting for any feedback on the Schema configuration public
API PR [2].

[1] https://github.com/apache/ignite-3/pull/33
[2] https://github.com/apache/ignite-3/pull/2

On Wed, Jan 20, 2021 at 6:05 PM Andrey Mashenkov 
wrote:

>
> I've updated a PR regarding your feedback [1].
>
> [1] https://github.com/apache/ignite-3/pull/2
>
> On Mon, Jan 11, 2021 at 10:58 AM Alexey Goncharuk <
> alexey.goncha...@gmail.com> wrote:
>
>> Folks,
>>
>> I updated the IEP to contain the missing pieces; actually, most of the
>> questions here were covered by the text. Please let me know if there is
>> something still missing or unclear.
>>
>> чт, 31 дек. 2020 г. в 12:48, Alexey Goncharuk > >:
>>
>> > Mikhail and Igniters,
>> >
>> > Thanks for your comments. The questions are reasonable, though I think
>> all
>> > concerns are addressed by the IEP as Val mentioned. I will update the
>> > document according to your questions in the following week or so, so we
>> can
>> > have a constructive discussion further.
>> >
>> > Wed, Dec 30, 2020 at 11:45, Michael Cherkasov <
>> > michael.cherka...@gmail.com>:
>> >
>> >> Hi Val, Andrey,
>> >>
>> >> thank you for clarifying.
>> >>
>> >> I still have a few comments.
>> >>
>> >> 1. one table == one schema. KV vs SQL:
>> >> Looks like all agreed that KV is just a special case of a regular table
>> >> with (blob,blob) schema.
>> >> I worry about the case when the user starts from KV case and later will
>> >> try
>> >> to expand it and try to leverage SQL for the existing KV table it
>> won't be
>> >> able to do so and will require to reload data. which isn't convenient
>> and
>> >> sometimes not even possible. Is it possible to extract a new field from
>> >> (blob, blob) schema and apply index on it?
>> >>
>> >> 2. Could you please also list all ways of schema definition in the
>> IEP? It
>> >> significant change and I bet the main point of this IEP, everyone hates
>> >> QueryEntities, they are difficult to manage and in general, it's very
>> >> confusing to have a data model(schemas) and node/cluster configuration
>> in
>> >> one place.
>> >>
>> >> So there will be SchemaBuilder and SQL to define schemas, but Andrey
>> also
>> >> mentioned annotations.
>> >>
>> >> I personally against configuration via annotations, while it's
>> convenient
>> >> for development, it difficult to manage because different classes can
>> be
>> >> deployed on different clients/servers nodes and it can lead to
>> >> unpredictable results.
>> >>
>> >> 3. IEP doesn't mention field type changes, only drop/add fields. Field
>> >> type
>> >> changes are extremely painful right now(if even possible), so it would
>> be
>> >> nice if some scenarios would be supported(like int8->int16, or
>> >> int8->String).
>> >>
>> >> 4. got it, I thought IEP will have more details about the
>> implementation.
>> >> I've seen Andrey even sent benchmark results for a new serialization,
>> will
>> >> ping him about this.
>> >>
>> >> 5. Thanks for the clarification. I had a wrong understanding of strick
>> >> mode.
>> >>
>> >>
>> >> Tue, Dec 29, 2020 at 19:32, Valentin Kulichenko <
>> >> valentin.kuliche...@gmail.com>:
>> >>
>> >> > Hi Mike,
>> >> >
>> >> > Thanks for providing your feedback. Please see my comments below.
>> >> >
>> >> > I would also encourage you to go through the IEP-54 [1] - it has a
>> lot
>> >> of
>> >> > detail on the topic.
>> >> >
>> >> > [1]
>> >> >
>> >> >
>> >>
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
>> >> >
>> >> > -Val
>> >> >
>> >> > On Mon, Dec 28, 2020 at 11:22 PM Michael Cherkasov <
>> >> > michael.cherka...@gmail.com> wrote:
>> >> >
>> >> > > Hi all,
>> >> > >
>> >> > > I reviewed the mail thread and proposal page and I still don't
>> fully
>> >> > > understand what is going to be changed, I would really appreciate
>> it
>> >> if
>> >> > you
>> >> > > will answer a few questions:
>> >> > >
>> >> > > 1. Are you going to leave only one schema per cache? if so, will be
>> >> there
>> >> > > an option to have a table with arbitrary objects(pure KV case)?
>> >> > >
>> >> >
>> >> > My opinion is that KV case should be natively supported. I think this
>> >> still
>> >> > needs to be thought over, my current view on this is that we should
>> have
>> >> > separate APIs for KV and more generic storages. KV storage can be
>> >> > implemented as a "table" with two BLOB fields where we will store
>> >> > serialized key-value pairs. That would imply deserialization on read,
>> >> but I
>> >> > believe this is OK for KV use cases. I'm happy to 

Re: IEP-54: Schema-first approach for 3.0

2021-01-20 Thread Andrey Mashenkov
I've updated the PR according to your feedback [1].

[1] https://github.com/apache/ignite-3/pull/2

On Mon, Jan 11, 2021 at 10:58 AM Alexey Goncharuk <
alexey.goncha...@gmail.com> wrote:

> Folks,
>
> I updated the IEP to contain the missing pieces; actually, most of the
> questions here were covered by the text. Please let me know if there is
> something still missing or unclear.
>
> Thu, Dec 31, 2020 at 12:48, Alexey Goncharuk:
>
> > Mikhail and Igniters,
> >
> > Thanks for your comments. The questions are reasonable, though I think
> all
> > concerns are addressed by the IEP as Val mentioned. I will update the
> > document according to your questions in the following week or so, so we
> can
> > have a constructive discussion further.
> >
> > Wed, Dec 30, 2020 at 11:45, Michael Cherkasov <
> > michael.cherka...@gmail.com>:
> >
> >> Hi Val, Andrey,
> >>
> >> thank you for clarifying.
> >>
> >> I still have a few comments.
> >>
> >> 1. one table == one schema. KV vs SQL:
> >> Looks like all agreed that KV is just a special case of a regular table
> >> with (blob,blob) schema.
> >> I worry about the case when the user starts from KV case and later will
> >> try
> >> to expand it and try to leverage SQL for the existing KV table it won't
> be
> >> able to do so and will require to reload data. which isn't convenient
> and
> >> sometimes not even possible. Is it possible to extract a new field from
> >> (blob, blob) schema and apply index on it?
> >>
> >> 2. Could you please also list all ways of schema definition in the IEP?
> It
> >> significant change and I bet the main point of this IEP, everyone hates
> >> QueryEntities, they are difficult to manage and in general, it's very
> >> confusing to have a data model(schemas) and node/cluster configuration
> in
> >> one place.
> >>
> >> So there will be SchemaBuilder and SQL to define schemas, but Andrey
> also
> >> mentioned annotations.
> >>
> >> I personally against configuration via annotations, while it's
> convenient
> >> for development, it difficult to manage because different classes can be
> >> deployed on different clients/servers nodes and it can lead to
> >> unpredictable results.
> >>
> >> 3. IEP doesn't mention field type changes, only drop/add fields. Field
> >> type
> >> changes are extremely painful right now(if even possible), so it would
> be
> >> nice if some scenarios would be supported(like int8->int16, or
> >> int8->String).
> >>
> >> 4. got it, I thought IEP will have more details about the
> implementation.
> >> I've seen Andrey even sent benchmark results for a new serialization,
> will
> >> ping him about this.
> >>
> >> 5. Thanks for the clarification. I had a wrong understanding of strick
> >> mode.
> >>
> >>
> >> Tue, Dec 29, 2020 at 19:32, Valentin Kulichenko <
> >> valentin.kuliche...@gmail.com>:
> >>
> >> > Hi Mike,
> >> >
> >> > Thanks for providing your feedback. Please see my comments below.
> >> >
> >> > I would also encourage you to go through the IEP-54 [1] - it has a lot
> >> of
> >> > detail on the topic.
> >> >
> >> > [1]
> >> >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
> >> >
> >> > -Val
> >> >
> >> > On Mon, Dec 28, 2020 at 11:22 PM Michael Cherkasov <
> >> > michael.cherka...@gmail.com> wrote:
> >> >
> >> > > Hi all,
> >> > >
> >> > > I reviewed the mail thread and proposal page and I still don't fully
> >> > > understand what is going to be changed, I would really appreciate it
> >> if
> >> > you
> >> > > will answer a few questions:
> >> > >
> >> > > 1. Are you going to leave only one schema per cache? if so, will be
> >> there
> >> > > an option to have a table with arbitrary objects(pure KV case)?
> >> > >
> >> >
> >> > My opinion is that KV case should be natively supported. I think this
> >> still
> >> > needs to be thought over, my current view on this is that we should
> have
> >> > separate APIs for KV and more generic storages. KV storage can be
> >> > implemented as a "table" with two BLOB fields where we will store
> >> > serialized key-value pairs. That would imply deserialization on read,
> >> but I
> >> > believe this is OK for KV use cases. I'm happy to hear other ideas
> >> though
> >> > :)
> >> >
> >> >
> >> > > 2. What options will Apache Ignite 3.0 have to define schema?
> >> > SchemaBuilder
> >> > > and SQL only? Is there an option to put the schema definition to the
> >> > > configuration?(I really don't like this, I would prefer to have
> >> > > separate scripts to create schemas)
> >> > >
> >> >
> >> > There will be no such thing as a static configuration in the first
> >> place.
> >> > Tables and schemas are created in runtime. Even if there is a file
> >> provided
> >> > on node startup, this file is only applied in the scope of the 'start'
> >> > operation. All configurations will be stored in a meta storage
> >> available to
> >> > all nodes, as opposed to individual files.
> >> >
> >> >
> >> > > 3. Is there a way to change field 

Re: IEP-54: Schema-first approach for 3.0

2021-01-10 Thread Alexey Goncharuk
Folks,

I updated the IEP to contain the missing pieces; actually, most of the
questions here were covered by the text. Please let me know if there is
something still missing or unclear.

Thu, Dec 31, 2020 at 12:48, Alexey Goncharuk:

> Mikhail and Igniters,
>
> Thanks for your comments. The questions are reasonable, though I think all
> concerns are addressed by the IEP as Val mentioned. I will update the
> document according to your questions in the following week or so, so we can
> have a constructive discussion further.
>
> Wed, Dec 30, 2020 at 11:45, Michael Cherkasov <
> michael.cherka...@gmail.com>:
>
>> Hi Val, Andrey,
>>
>> thank you for clarifying.
>>
>> I still have a few comments.
>>
>> 1. one table == one schema. KV vs SQL:
>> Looks like all agreed that KV is just a special case of a regular table
>> with (blob,blob) schema.
>> I worry about the case when the user starts from KV case and later will
>> try
>> to expand it and try to leverage SQL for the existing KV table it won't be
>> able to do so and will require to reload data. which isn't convenient and
>> sometimes not even possible. Is it possible to extract a new field from
>> (blob, blob) schema and apply index on it?
>>
>> 2. Could you please also list all ways of schema definition in the IEP? It
>> significant change and I bet the main point of this IEP, everyone hates
>> QueryEntities, they are difficult to manage and in general, it's very
>> confusing to have a data model(schemas) and node/cluster configuration in
>> one place.
>>
>> So there will be SchemaBuilder and SQL to define schemas, but Andrey also
>> mentioned annotations.
>>
>> I personally against configuration via annotations, while it's convenient
>> for development, it difficult to manage because different classes can be
>> deployed on different clients/servers nodes and it can lead to
>> unpredictable results.
>>
>> 3. IEP doesn't mention field type changes, only drop/add fields. Field
>> type
>> changes are extremely painful right now(if even possible), so it would be
>> nice if some scenarios would be supported(like int8->int16, or
>> int8->String).
>>
>> 4. got it, I thought IEP will have more details about the implementation.
>> I've seen Andrey even sent benchmark results for a new serialization, will
>> ping him about this.
>>
>> 5. Thanks for the clarification. I had a wrong understanding of strick
>> mode.
>>
>>
>> Tue, Dec 29, 2020 at 19:32, Valentin Kulichenko <
>> valentin.kuliche...@gmail.com>:
>>
>> > Hi Mike,
>> >
>> > Thanks for providing your feedback. Please see my comments below.
>> >
>> > I would also encourage you to go through the IEP-54 [1] - it has a lot
>> of
>> > detail on the topic.
>> >
>> > [1]
>> >
>> >
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
>> >
>> > -Val
>> >
>> > On Mon, Dec 28, 2020 at 11:22 PM Michael Cherkasov <
>> > michael.cherka...@gmail.com> wrote:
>> >
>> > > Hi all,
>> > >
>> > > I reviewed the mail thread and proposal page and I still don't fully
>> > > understand what is going to be changed, I would really appreciate it
>> if
>> > you
>> > > will answer a few questions:
>> > >
>> > > 1. Are you going to leave only one schema per cache? if so, will be
>> there
>> > > an option to have a table with arbitrary objects(pure KV case)?
>> > >
>> >
>> > My opinion is that KV case should be natively supported. I think this
>> still
>> > needs to be thought over, my current view on this is that we should have
>> > separate APIs for KV and more generic storages. KV storage can be
>> > implemented as a "table" with two BLOB fields where we will store
>> > serialized key-value pairs. That would imply deserialization on read,
>> but I
>> > believe this is OK for KV use cases. I'm happy to hear other ideas
>> though
>> > :)
>> >
>> >
>> > > 2. What options will Apache Ignite 3.0 have to define schema?
>> > SchemaBuilder
>> > > and SQL only? Is there an option to put the schema definition to the
>> > > configuration?(I really don't like this, I would prefer to have
>> > > separate scripts to create schemas)
>> > >
>> >
>> > There will be no such thing as a static configuration in the first
>> place.
>> > Tables and schemas are created in runtime. Even if there is a file
>> provided
>> > on node startup, this file is only applied in the scope of the 'start'
>> > operation. All configurations will be stored in a meta storage
>> available to
>> > all nodes, as opposed to individual files.
>> >
>> >
>> > > 3. Is there a way to change field type? if yes, can it be done in
>> > runtime?
>> > >
>> >
>> > Absolutely! IEP-54 has a whole section about schema evolution.
>> >
>> >
>> > > 4. Looks like BinaryMarshaller is going to be re-worked too, is there
>> any
>> > > IEP for this?
>> > >
>> >
>> > BinaryMarshaller as a tool for arbitrary object serialization will be
>> gone,
>> > but we will reuse a lot of its concept to implement an internal tuple
>> > serialization mechanism. IEP-54 has the 

Re: IEP-54: Schema-first approach for 3.0

2020-12-31 Thread Alexey Goncharuk
Mikhail and Igniters,

Thanks for your comments. The questions are reasonable, though I think all
concerns are addressed by the IEP as Val mentioned. I will update the
document according to your questions in the following week or so, so we can
have a constructive discussion further.

Wed, Dec 30, 2020 at 11:45, Michael Cherkasov:

> Hi Val, Andrey,
>
> thank you for clarifying.
>
> I still have a few comments.
>
> 1. one table == one schema. KV vs SQL:
> Looks like all agreed that KV is just a special case of a regular table
> with (blob,blob) schema.
> I worry about the case when the user starts from KV case and later will try
> to expand it and try to leverage SQL for the existing KV table it won't be
> able to do so and will require to reload data. which isn't convenient and
> sometimes not even possible. Is it possible to extract a new field from
> (blob, blob) schema and apply index on it?
>
> 2. Could you please also list all ways of schema definition in the IEP? It
> significant change and I bet the main point of this IEP, everyone hates
> QueryEntities, they are difficult to manage and in general, it's very
> confusing to have a data model(schemas) and node/cluster configuration in
> one place.
>
> So there will be SchemaBuilder and SQL to define schemas, but Andrey also
> mentioned annotations.
>
> I personally against configuration via annotations, while it's convenient
> for development, it difficult to manage because different classes can be
> deployed on different clients/servers nodes and it can lead to
> unpredictable results.
>
> 3. IEP doesn't mention field type changes, only drop/add fields. Field type
> changes are extremely painful right now(if even possible), so it would be
> nice if some scenarios would be supported(like int8->int16, or
> int8->String).
>
> 4. got it, I thought IEP will have more details about the implementation.
> I've seen Andrey even sent benchmark results for a new serialization, will
> ping him about this.
>
> 5. Thanks for the clarification. I had a wrong understanding of strick
> mode.
>
>
> Tue, Dec 29, 2020 at 19:32, Valentin Kulichenko <
> valentin.kuliche...@gmail.com>:
>
> > Hi Mike,
> >
> > Thanks for providing your feedback. Please see my comments below.
> >
> > I would also encourage you to go through the IEP-54 [1] - it has a lot of
> > detail on the topic.
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
> >
> > -Val
> >
> > On Mon, Dec 28, 2020 at 11:22 PM Michael Cherkasov <
> > michael.cherka...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > I reviewed the mail thread and proposal page and I still don't fully
> > > understand what is going to be changed, I would really appreciate it if
> > you
> > > will answer a few questions:
> > >
> > > 1. Are you going to leave only one schema per cache? if so, will be
> there
> > > an option to have a table with arbitrary objects(pure KV case)?
> > >
> >
> > My opinion is that KV case should be natively supported. I think this
> still
> > needs to be thought over, my current view on this is that we should have
> > separate APIs for KV and more generic storages. KV storage can be
> > implemented as a "table" with two BLOB fields where we will store
> > serialized key-value pairs. That would imply deserialization on read,
> but I
> > believe this is OK for KV use cases. I'm happy to hear other ideas though
> > :)
> >
> >
> > > 2. What options will Apache Ignite 3.0 have to define schema?
> > SchemaBuilder
> > > and SQL only? Is there an option to put the schema definition to the
> > > configuration?(I really don't like this, I would prefer to have
> > > separate scripts to create schemas)
> > >
> >
> > There will be no such thing as a static configuration in the first place.
> > Tables and schemas are created in runtime. Even if there is a file
> provided
> > on node startup, this file is only applied in the scope of the 'start'
> > operation. All configurations will be stored in a meta storage available
> to
> > all nodes, as opposed to individual files.
> >
> >
> > > 3. Is there a way to change field type? if yes, can it be done in
> > runtime?
> > >
> >
> > Absolutely! IEP-54 has a whole section about schema evolution.
> >
> >
> > > 4. Looks like BinaryMarshaller is going to be re-worked too, is there
> any
> > > IEP for this?
> > >
> >
> > BinaryMarshaller as a tool for arbitrary object serialization will be
> gone,
> > but we will reuse a lot of its concept to implement an internal tuple
> > serialization mechanism. IEP-54 has the description of the proposed data
> > format.
> >
> >
> > > 5. I don't like automatic schema evaluation when a new field is added
> > > automatically on record put, so is there a way to prohibit this
> behavior?
> > >  I think all schema changes should be done only explicitly except
> initial
> > > schema creation.
> > >
> >
> > The way I see it is that we should have two modes: schema-first and
> > schema-last. 

Re: IEP-54: Schema-first approach for 3.0

2020-12-30 Thread Michael Cherkasov
Hi Val, Andrey,

thank you for clarifying.

I still have a few comments.

1. One table == one schema. KV vs SQL:
It looks like everyone agreed that KV is just a special case of a regular
table with a (blob, blob) schema.
I worry about the case where a user starts with the KV case and later tries
to expand it and leverage SQL for the existing KV table: they won't be able
to do so and will have to reload the data, which isn't convenient and
sometimes not even possible. Is it possible to extract a new field from
a (blob, blob) schema and apply an index to it?

2. Could you please also list all ways of defining a schema in the IEP? It is
a significant change and, I bet, the main point of this IEP. Everyone hates
QueryEntities: they are difficult to manage and, in general, it's very
confusing to have a data model (schemas) and node/cluster configuration in
one place.

So there will be SchemaBuilder and SQL to define schemas, but Andrey also
mentioned annotations.

I am personally against configuration via annotations. While it's convenient
for development, it is difficult to manage because different classes can be
deployed on different client/server nodes, and that can lead to
unpredictable results.

3. The IEP doesn't mention field type changes, only dropping/adding fields.
Field type changes are extremely painful right now (if possible at all), so
it would be nice if some scenarios were supported (like int8->int16 or
int8->String).

4. Got it. I thought the IEP would have more details about the implementation.
I've seen that Andrey even sent benchmark results for the new serialization;
I will ping him about this.

5. Thanks for the clarification. I had a wrong understanding of strict mode.
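
To make the type-change scenarios from point 3 concrete, here is a minimal
Java sketch of the two conversions mentioned (int8->int16 and int8->String).
The class and method names are hypothetical and are not taken from the IEP or
from any PR; the sketch only shows that both conversions are lossless for
every int8 value, which is why they look like reasonable candidates.

```java
/** Hypothetical illustration only: not an Ignite API. */
final class TypeChangeSketch {
    private TypeChangeSketch() {
    }

    /** int8 -> int16: a widening conversion, sign is preserved and no data is lost. */
    static short int8ToInt16(byte stored) {
        return stored;
    }

    /** int8 -> String: decimal text form of the stored value, also lossless. */
    static String int8ToString(byte stored) {
        return Byte.toString(stored);
    }
}
```

The opposite direction (int16->int8, or String->int8) can fail or truncate
for some stored values, so it would need validation of the existing data.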


Tue, Dec 29, 2020 at 19:32, Valentin Kulichenko <
valentin.kuliche...@gmail.com>:

> Hi Mike,
>
> Thanks for providing your feedback. Please see my comments below.
>
> I would also encourage you to go through the IEP-54 [1] - it has a lot of
> detail on the topic.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
>
> -Val
>
> On Mon, Dec 28, 2020 at 11:22 PM Michael Cherkasov <
> michael.cherka...@gmail.com> wrote:
>
> > Hi all,
> >
> > I reviewed the mail thread and proposal page and I still don't fully
> > understand what is going to be changed, I would really appreciate it if
> you
> > will answer a few questions:
> >
> > 1. Are you going to leave only one schema per cache? if so, will be there
> > an option to have a table with arbitrary objects(pure KV case)?
> >
>
> My opinion is that KV case should be natively supported. I think this still
> needs to be thought over, my current view on this is that we should have
> separate APIs for KV and more generic storages. KV storage can be
> implemented as a "table" with two BLOB fields where we will store
> serialized key-value pairs. That would imply deserialization on read, but I
> believe this is OK for KV use cases. I'm happy to hear other ideas though
> :)
>
>
> > 2. What options will Apache Ignite 3.0 have to define schema?
> SchemaBuilder
> > and SQL only? Is there an option to put the schema definition to the
> > configuration?(I really don't like this, I would prefer to have
> > separate scripts to create schemas)
> >
>
> There will be no such thing as a static configuration in the first place.
> Tables and schemas are created in runtime. Even if there is a file provided
> on node startup, this file is only applied in the scope of the 'start'
> operation. All configurations will be stored in a meta storage available to
> all nodes, as opposed to individual files.
>
>
> > 3. Is there a way to change field type? if yes, can it be done in
> runtime?
> >
>
> Absolutely! IEP-54 has a whole section about schema evolution.
>
>
> > 4. Looks like BinaryMarshaller is going to be re-worked too, is there any
> > IEP for this?
> >
>
> BinaryMarshaller as a tool for arbitrary object serialization will be gone,
> but we will reuse a lot of its concept to implement an internal tuple
> serialization mechanism. IEP-54 has the description of the proposed data
> format.
>
>
> > 5. I don't like automatic schema evaluation when a new field is added
> > automatically on record put, so is there a way to prohibit this behavior?
> >  I think all schema changes should be done only explicitly except initial
> > schema creation.
> >
>
> The way I see it is that we should have two modes: schema-first and
> schema-last. Schema-first means exactly what you've described - schemas are
> defined and updated explicitly by the user. In the schema-last mode,
> the user does not deal with schemas, as they are inferred from the data
> inserted into tables. We should definitely not mix these modes - it has to
> be one or another. And it probably makes sense to discuss which mode should
> be the default one.
>
>
> >
> > Thanks,
> > Mike.
> >
> > Mon, Dec 21, 2020 at 06:40, Andrey Mashenkov <
> andrey.mashen...@gmail.com
> > >:
> >
> > > Hi, Igniters.
> > >
> > > We all know that the current QueryEntity API is 

Re: IEP-54: Schema-first approach for 3.0

2020-12-30 Thread Ilya Kasnacheev
Hello!

I'm not sure why you would need a schema definition API in Java if it's not
available in ignitectl.

After all, you can run SQL from Java as well. If we plan for every cache
to also be an SQL table, why do we need the API being discussed? If it is
needed, how do we justify that not all platforms support it?

Regards,
-- 
Ilya Kasnacheev


Tue, Dec 29, 2020 at 13:11, Andrey Mashenkov:

> Ilya,
>
> Ignitectl could use SQL query with ALTER command, right?
> I'm not sure we have to bother about it.
> Anyway, one can define hierarchical properties and implement parser which
> could create a schema using public API schema builders.
>
> On Tue, Dec 29, 2020 at 12:02 PM Ilya Kasnacheev <
> ilya.kasnach...@gmail.com>
> wrote:
>
> > Hello!
> >
> > Is there a way to define schema via ignitectl utility and its
> > hierarchical properties? How would it look like?
> >
> > Regards,
> > --
> > Ilya Kasnacheev
> >
> >
> > Mon, Dec 21, 2020 at 17:40, Andrey Mashenkov <
> andrey.mashen...@gmail.com
> > >:
> >
> > > Hi, Igniters.
> > >
> > > We all know that the current QueryEntity API is not convenient and
> needs
> > to
> > > be reworked.
> > > So, I'm glad to share PR [1] with schema configuration public API for
> > > Ignite 3.0.
> > >
> > > New schema configuration uses Builder pattern, which looks more
> > comfortable
> > > to use.
> > >
> > > In the PR you will find a 'schema' package with the API itself, and a
> > draft
> > > implementation in 'internal' sub-package,
> > > and a test that demonstrates how the API could be used.
> > >
> > > Please note:
> > >
> > > * Entrypoint is 'SchemaBuilders' class with static factory methods.
> > > * The implementation is decoupled and can be easily extracted to
> separate
> > > module if we decide to do so.
> > > * Some columns types (e.g. Date/Time) are missed, they will be added
> > lately
> > > in separate tickes.
> > > * Index configuration extends marker interface that makes possible to
> > > implement indexes of new types in plugins.
> > > Hopfully, we could add a persistent geo-indices support in future.
> > > * Supposedly, current table schema can be changed via builder-like
> > > structure as it is done if JOOQ project. See 'TableModificationBuilder'
> > for
> > > details.
> > > I'm not sure 'SchemaTable' should have 'toBuilder()' converter for that
> > > purpose as it is a Schema Manager responsibility to create mutator
> > objects
> > > from the current schema,
> > > but implementing the Schema manager is out of scope and will be
> designed
> > > within the next task.
> > > * Interfaces implementations are out of scope. I did not intend to
> merge
> > > them right now, but for test/demostration purposes.
> > >
> > > It is NOT the final version and some may be changed before the first
> > > release of course.
> > > For now, we have to agree if we can proceed with this approach or some
> > > issues should be resolved at first.
> > >
> > > Any thoughts or objections?
> > > Are interfaces good enough to be merged within the current ticket?
> > >
> > >
> > > https://issues.apache.org/jira/browse/IGNITE-13748
> > >
> > > On Thu, Nov 26, 2020 at 2:33 PM Юрий 
> > wrote:
> > >
> > > > A little bit my thoughts about unsigned types:
> > > >
> > > > 1. Seems we may support unsign types
> > > > 2. It requires adding new types to the internal representation,
> > protocol,
> > > > e.t.c.
> > > > 3. internal representation should be the same as we keep sign types.
> So
> > > it
> > > > will not requires more memory
> > > > 4. User should be aware of specifics such types for platforms which
> not
> > > > support unsigned types. For example, a user could derive -6 value in
> > Java
> > > > for 250 unsigned byte value (from bits perspective will be right). I
> > > think
> > > > We shouldn't use more wide type for such cases, especially it will be
> > bad
> > > > for unsigned long when we require returns BigInteger type.
> > > > 5. Possible it requires some suffix/preffix for new types like a
> > '250u' -
> > > > it means that 250 is an unsigned value type.
> > > > 6. It requires a little bit more expensive comparison logic for
> indexes
> > > > 7. It requires new comparison logic for expressions. I think it not
> > > > possible for the current H2 engine and probably possible for the new
> > > > Calcite engine. Need clarification from anybody who involved in this
> > part
> > > >
> > > > WDYT?
> > > >
> > > > Tue, Nov 24, 2020 at 18:36, Alexey Goncharuk <
> > > alexey.goncha...@gmail.com
> > > > >:
> > > >
> > > > > Actually, we can support comparisons in 3.0: once we the actual
> type
> > > > > information, we can make proper runtime adjustments and conversions
> > to
> > > > > treat those values as unsigned - it will be just a bit more
> > expensive.
> > > > >
> > > > > Tue, Nov 24, 2020 at 18:32, Pavel Tupitsyn:
> > > > >
> > > > > > > SQL range queries it will break
> > > > > > > WHERE x > y may return wrong results
> > > > > >
> > > > > > Yes, range queries, 

Re: IEP-54: Schema-first approach for 3.0

2020-12-29 Thread Valentin Kulichenko
Hi Mike,

Thanks for providing your feedback. Please see my comments below.

I would also encourage you to go through the IEP-54 [1] - it has a lot of
detail on the topic.

[1]
https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach

-Val

On Mon, Dec 28, 2020 at 11:22 PM Michael Cherkasov <
michael.cherka...@gmail.com> wrote:

> Hi all,
>
> I reviewed the mail thread and proposal page and I still don't fully
> understand what is going to be changed, I would really appreciate it if you
> will answer a few questions:
>
> 1. Are you going to leave only one schema per cache? if so, will be there
> an option to have a table with arbitrary objects(pure KV case)?
>

My opinion is that the KV case should be natively supported. I think this
still needs to be thought over; my current view is that we should have
separate APIs for KV and more generic storages. KV storage can be
implemented as a "table" with two BLOB fields where we will store
serialized key-value pairs. That would imply deserialization on read, but I
believe this is OK for KV use cases. I'm happy to hear other ideas though :)
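
A rough Java sketch of that idea, just to show the shape: a KV view that
stores serialized keys and values as two byte arrays per row and
deserializes on read. Everything here is an assumption for illustration:
the class name BlobKvView is made up, an in-memory map stands in for the
two-BLOB "table", and UTF-8 strings stand in for whatever serialization
would actually be used.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: a key-value view over a "table" with two BLOB columns. */
final class BlobKvView {
    // Stand-in for the (key BLOB, value BLOB) table. ByteBuffer is used for
    // the key column because it compares and hashes by content, unlike byte[].
    private final Map<ByteBuffer, byte[]> rows = new HashMap<>();

    /** Serializes both sides and stores them as opaque blobs. */
    void put(String key, String value) {
        rows.put(ByteBuffer.wrap(serialize(key)), serialize(value));
    }

    /** Looks the row up by the serialized key and deserializes the value on read. */
    String get(String key) {
        byte[] blob = rows.get(ByteBuffer.wrap(serialize(key)));
        return blob == null ? null : deserialize(blob);
    }

    private static byte[] serialize(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    private static String deserialize(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Since the storage only sees opaque bytes in this sketch, nothing inside the
value is visible to SQL or indexes without deserializing it first.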


> 2. What options will Apache Ignite 3.0 have to define schema? SchemaBuilder
> and SQL only? Is there an option to put the schema definition to the
> configuration?(I really don't like this, I would prefer to have
> separate scripts to create schemas)
>

There will be no such thing as a static configuration in the first place.
Tables and schemas are created at runtime. Even if a file is provided
on node startup, it is only applied in the scope of the 'start'
operation. All configurations will be stored in a meta storage available to
all nodes, as opposed to individual files.


> 3. Is there a way to change field type? if yes, can it be done in runtime?
>

Absolutely! IEP-54 has a whole section about schema evolution.


> 4. Looks like BinaryMarshaller is going to be re-worked too, is there any
> IEP for this?
>

BinaryMarshaller as a tool for arbitrary object serialization will be gone,
but we will reuse many of its concepts to implement an internal tuple
serialization mechanism. IEP-54 has the description of the proposed data
format.


> 5. I don't like automatic schema evaluation when a new field is added
> automatically on record put, so is there a way to prohibit this behavior?
>  I think all schema changes should be done only explicitly except initial
> schema creation.
>

The way I see it is that we should have two modes: schema-first and
schema-last. Schema-first means exactly what you've described - schemas are
defined and updated explicitly by the user. In the schema-last mode,
the user does not deal with schemas, as they are inferred from the data
inserted into tables. We should definitely not mix these modes - it has to
be one or another. And it probably makes sense to discuss which mode should
be the default one.
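
A rough sketch of how the two modes could differ on insert. The names ModalTable and Mode are hypothetical, and bumping the version once per inferred column is a simplification, not the IEP-54 design.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical table that runs in exactly one schema mode for its whole lifetime. */
class ModalTable {
    enum Mode { SCHEMA_FIRST, SCHEMA_LAST }

    private final Mode mode;
    private final Map<String, Class<?>> columns = new LinkedHashMap<>();
    private int schemaVersion = 1;

    ModalTable(Mode mode, Map<String, ? extends Class<?>> initialColumns) {
        this.mode = mode;
        columns.putAll(initialColumns);
    }

    /** Validates a row against the schema; actual storage is omitted from this sketch. */
    void put(Map<String, ?> row) {
        for (Map.Entry<String, ?> e : row.entrySet()) {
            if (columns.containsKey(e.getKey()))
                continue;

            // Schema-first: unknown columns are an error, the user must alter the schema explicitly.
            if (mode == Mode.SCHEMA_FIRST)
                throw new IllegalArgumentException("Unknown column: " + e.getKey());

            // Schema-last: the column is inferred from the data and the schema version is raised.
            Object val = e.getValue();
            columns.put(e.getKey(), val == null ? Object.class : val.getClass());
            schemaVersion++;
        }
    }

    boolean hasColumn(String name) {
        return columns.containsKey(name);
    }

    int schemaVersion() {
        return schemaVersion;
    }
}
```

The mode is fixed in the constructor, so a single table can never mix the two behaviours.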


>
> Thanks,
> Mike.
>
> пн, 21 дек. 2020 г. в 06:40, Andrey Mashenkov  >:
>
> > Hi, Igniters.
> >
> > We all know that the current QueryEntity API is not convenient and needs
> to
> > be reworked.
> > So, I'm glad to share PR [1] with schema configuration public API for
> > Ignite 3.0.
> >
> > New schema configuration uses Builder pattern, which looks more
> comfortable
> > to use.
> >
> > In the PR you will find a 'schema' package with the API itself, and a
> draft
> > implementation in 'internal' sub-package,
> > and a test that demonstrates how the API could be used.
> >
> > Please note:
> >
> > * Entrypoint is 'SchemaBuilders' class with static factory methods.
> > * The implementation is decoupled and can be easily extracted to separate
> > module if we decide to do so.
> > * Some columns types (e.g. Date/Time) are missed, they will be added
> lately
> > in separate tickes.
> > * Index configuration extends marker interface that makes possible to
> > implement indexes of new types in plugins.
> > Hopfully, we could add a persistent geo-indices support in future.
> > * Supposedly, current table schema can be changed via builder-like
> > structure as it is done if JOOQ project. See 'TableModificationBuilder'
> for
> > details.
> > I'm not sure 'SchemaTable' should have 'toBuilder()' converter for that
> > purpose as it is a Schema Manager responsibility to create mutator
> objects
> > from the current schema,
> > but implementing the Schema manager is out of scope and will be designed
> > within the next task.
> > * Interfaces implementations are out of scope. I did not intend to merge
> > them right now, but for test/demostration purposes.
> >
> > It is NOT the final version and some may be changed before the first
> > release of course.
> > For now, we have to agree if we can proceed with this approach or some
> > issues should be resolved at first.
> >
> > Any thoughts or objections?
> > Are interfaces good enough to be merged within the current ticket?
> >
> >
> > https://issues.apache.org/jira/browse/IGNITE-13748
> >
> > On Thu, Nov 26, 2020 at 2:33 

Re: IEP-54: Schema-first approach for 3.0

2020-12-29 Thread Andrey Mashenkov
Michael, thanks for feedback.

1. There is no decision approved by the community yet, but I have often
heard that it would be nice to have one table/schema per cache.
Actually, we already have CacheGroup, a very useful feature that allows us
to share memory region/persistence files between caches and resolves many
performance issues.
AFAIK, the CacheGroup idea will be present in Ignite 3.0, but may be slightly
reworked to fit the new design.

Cache with arbitrary objects was always a headache and this is totally
unusable for SQL.
You ask a good question about pure KV-case. I think we could allow users to
create a schemaless table for arbitrary objects, but it will NOT be
possible to use it in SQL.
There is a big question how to serialize arbitrary objects correctly and
effectively. I suggest storing arbitrary objects as byte[] or kind of
blobs.

Looking forward, users will want to access arbitrary fields of such objects
and will need a BinaryObject interface, but on this step we have to deal
with some schema...
So, I think we can go one of the following ways:
* treating KV objects as blob/byte[]
* strict schema approach: the schema is created and propagated to the grid on
the first object put and can never be changed until the table is destroyed.
* use some kind of BinaryObject with a schema in the footer, which may have
huge overhead.

2. One can use SQL or SchemaBuilder or generate schema from Class using
annotations.
I thought it should be possible to pass the result of the schema
builder/generator to initial configuration or to createTable(schema)
method.
What kind of script do you mean?

3. Generally, no. With a strict schema mode it is not possible to do it.
With live-schema mode we could make some trivial conversions e.g. int->long
transparently, but change int->date looks impossible for automatic
conversion in run-time.
Offline conversion utils for persistence files look possible for any
changes.

I think the following sequence of user actions can be applied to change a
field type:
* Add new_field with the new type. This bumps the schema version.
* Change the mapping on all nodes to write to new_field, but read from
old_field and convert if needed.
This allows the user to read/convert values saved before the schema version
was raised.
* Start a scan query to copy/convert old_field data to the new one. No node
will write with the old schema version at this point.
* Old_field can be safely dropped. This bumps the schema version again.
* Rename new_field to old_field by changing the mapping on the nodes. Yes,
once again.
Obviously, users may need to update Java classes for this. In fact, they
will do it twice.
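
The sequence above can be sketched on plain maps standing in for rows. The field names old_field and new_field come from the steps above; the class and method names are hypothetical, and an int -> long change is assumed as the example conversion.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical walk-through of changing a field type from int to long. */
class FieldTypeMigration {
    /** Read mapping used during migration: prefer new_field, else convert old_field. */
    static Long read(Map<String, Object> row) {
        Object v = row.get("new_field");
        if (v != null)
            return (Long) v;

        Object old = row.get("old_field");
        return old == null ? null : ((Integer) old).longValue();
    }

    /** Write mapping used during migration: new values go to new_field only. */
    static void write(Map<String, Object> row, long val) {
        row.put("new_field", val);
    }

    /** Stand-in for the scan query that copies/converts old_field data to new_field. */
    static void backfill(List<Map<String, Object>> rows) {
        for (Map<String, Object> row : rows) {
            if (row.get("new_field") == null && row.get("old_field") != null)
                row.put("new_field", ((Integer) row.get("old_field")).longValue());
        }
    }

    /** Drops old_field, then renames new_field to old_field. */
    static void dropAndRename(List<Map<String, Object>> rows) {
        for (Map<String, Object> row : rows) {
            row.remove("old_field");
            Object v = row.remove("new_field");
            if (v != null)
                row.put("old_field", v);
        }
    }
}
```

Readers see a consistent long value at every point of the sequence, whether or not the backfill has reached a given row yet.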

Ok, a rename field command should be added to the schema modification API.

4. We don't need BinaryMarshaller with a new approach. What purpose do you
need a BinaryMarshaller for?
5. Yes, strict schema mode is what you are looking for.


On Tue, Dec 29, 2020 at 10:22 AM Michael Cherkasov <
michael.cherka...@gmail.com> wrote:

> Hi all,
>
> I reviewed the mail thread and proposal page and I still don't fully
> understand what is going to be changed, I would really appreciate it if you
> will answer a few questions:
>
> 1. Are you going to leave only one schema per cache? if so, will be there
> an option to have a table with arbitrary objects(pure KV case)?
> 2. What options will Apache Ignite 3.0 have to define schema? SchemaBuilder
> and SQL only? Is there an option to put the schema definition to the
> configuration?(I really don't like this, I would prefer to have
> separate scripts to create schemas)
> 3. Is there a way to change field type? if yes, can it be done in runtime?
> 4. Looks like BinaryMarshaller is going to be re-worked too, is there any
> IEP for this?
> 5. I don't like automatic schema evaluation when a new field is added
> automatically on record put, so is there a way to prohibit this behavior?
>  I think all schema changes should be done only explicitly except initial
> schema creation.
>
> Thanks,
> Mike.
>
> пн, 21 дек. 2020 г. в 06:40, Andrey Mashenkov  >:
>
> > Hi, Igniters.
> >
> > We all know that the current QueryEntity API is not convenient and needs
> to
> > be reworked.
> > So, I'm glad to share PR [1] with schema configuration public API for
> > Ignite 3.0.
> >
> > New schema configuration uses Builder pattern, which looks more
> comfortable
> > to use.
> >
> > In the PR you will find a 'schema' package with the API itself, and a
> draft
> > implementation in 'internal' sub-package,
> > and a test that demonstrates how the API could be used.
> >
> > Please note:
> >
> > * Entrypoint is 'SchemaBuilders' class with static factory methods.
> > * The implementation is decoupled and can be easily extracted to separate
> > module if we decide to do so.
> > * Some columns types (e.g. Date/Time) are missed, they will be added
> lately
> > in separate tickes.
> > * Index configuration extends marker interface that makes possible to
> > implement indexes of new types in plugins.
> > Hopfully, we could add a persistent geo-indices 

Re: IEP-54: Schema-first approach for 3.0

2020-12-29 Thread Andrey Mashenkov
Ilya,

Thanks for feedback.

1. It is a Nested Builder pattern [1]. With a nested builder the user has
a single entry point to build a schema.
When you have a number of builders and one builder needs the result of
another builder, users can get confused and waste time looking for the
proper implementation.
So, I thought it would be polite to either use a nested builder or have a
factory/helper class for all related builders.
Ok, I've got it.
Mixing nested with chained builders may be unintuitive, and your
suggestion (to go the same way with PK as with secondary indices) looks
good.

2. I don't like passing a builder to another builder because it looks
like the passed builder will be used implicitly.
Actually, you can't do anything other than call the "build()" method (of
the passed builder), as the method (alretColumn) signature expects some
interface and you can't rely on any particular internal implementation ever
being passed.
I'd think a Command object should be passed here:
TableModificationBuilder.alretColumn(AlterColumnCommand acc)

Your approach requires an additional interface for Command(s) and some
entry point class so that users can find the builder easily.
Thus, I'd proceed with a nested builder approach in that particular case.
JOOQ goes the same way and it looks ok.
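
For readers who have not seen the two styles side by side, here is a minimal sketch. All names (TableDef, TableDefBuilder, PkBuilder, done()) are hypothetical and are not the API from the PR; it only illustrates a nested builder that returns control to its parent versus a flat method that takes an already built result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical immutable result of the builder. */
class TableDef {
    final String name;
    final List<String> columns;
    final List<String> pkColumns;

    TableDef(String name, List<String> columns, List<String> pkColumns) {
        this.name = name;
        this.columns = new ArrayList<>(columns);
        this.pkColumns = new ArrayList<>(pkColumns);
    }
}

/** Hypothetical table builder offering both the nested and the flat style. */
class TableDefBuilder {
    private final String name;
    private final List<String> columns = new ArrayList<>();
    private List<String> pkColumns = new ArrayList<>();

    TableDefBuilder(String name) {
        this.name = name;
    }

    TableDefBuilder column(String col) {
        columns.add(col);
        return this;
    }

    /** Nested style: the chain switches to a child builder until done() is called. */
    PkBuilder pk() {
        return new PkBuilder(this);
    }

    /** Flat style: the key is prepared elsewhere and passed in, the chain type never changes. */
    TableDefBuilder pk(List<String> cols) {
        pkColumns = new ArrayList<>(cols);
        return this;
    }

    TableDef build() {
        return new TableDef(name, columns, pkColumns);
    }

    /** Child builder that knows its parent and hands control back to it. */
    static class PkBuilder {
        private final TableDefBuilder parent;
        private final List<String> cols = new ArrayList<>();

        PkBuilder(TableDefBuilder parent) {
            this.parent = parent;
        }

        PkBuilder withColumns(String... names) {
            cols.addAll(Arrays.asList(names));
            return this;
        }

        TableDefBuilder done() {
            parent.pkColumns = cols;
            return parent;
        }
    }
}
```

With the nested style a call chain reads `new TableDefBuilder("t").column("id").pk().withColumns("id").done().build()`, while the flat style reads `new TableDefBuilder("t").column("id").pk(Arrays.asList("id")).build()`; both produce the same TableDef.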

3. I think an additional explanation is needed here.
We thought that in Ignite 3.0 it should be possible to create schema via
SQL and via pure Java code.
Current QueryEntity has a bad and confusing API, so we decided to rework it
using builders.
Also, we thought about initial schema (as a part of initial configuration)
that can be applied only once on the first grid start and then it can only
be modified.

SchemaTable builder is the way users can create the initial
schema/configuration programmatically. It creates some standalone objects
representing a schema.

Modification builder is the way users can modify the schema like an "ALTER"
SQL command. It creates and applies some sequence of internal commands,
which can be useful (in "live-schema" mode) to create automatic converters
from one schema version to another.
Thus Modification builder produces objects that are wired and interact with
the node internal components.
Modification builders can be requested from Schema manager components that
are out of scope now.

So, you may note these builders are very different, but I think we could
have a similar style in both parts of the API and even reuse some
interfaces/objects. As you can see, I reuse IndexBuilders.


[1] https://dzone.com/articles/nested-builder



On Tue, Dec 29, 2020 at 11:54 AM Ilya Kazakov 
wrote:

> Hello, Andrei!
>
> I have read this thread and PR. The builder idea and API looks good. But I
> have some questions.
>
> 1.
> In SchemaTableBuilder:
> When I use a builder in chain style, I expect in every step the same result
> type. And as I see this idea is implemented anywhere except pk(), where you
> return PrimryKeyBuilder. As for me, it will be more intuitive if we will
> use something like this:
>
> SchemaTableBuilder.pk(PrimryKeyBuilder pkb)
>
> 2.
> In TableModificationBuilder:
> I have the same question. Some methods returns TableModificationBuilder,
> but alretColumn returns AlterColumnBuilder. As for me, it will be more
> intuitive if we will use something like
>
> TableModificationBuilder.alretColumn(AlterColumnBuilder acb)
>
> But in general, these two points are only a matter of taste.
>
> 3.
> In your examples, I can't understand how we can alter some table, which we
> have already created previously (how we can get its SchemaTable object).
> But maybe this question is out of scope?
>
> --
> Thanks
> Ilya Kazakov
>
> вт, 29 дек. 2020 г. в 15:22, Michael Cherkasov <
> michael.cherka...@gmail.com
> >:
>
> > Hi all,
> >
> > I reviewed the mail thread and proposal page and I still don't fully
> > understand what is going to be changed, I would really appreciate it if
> you
> > will answer a few questions:
> >
> > 1. Are you going to leave only one schema per cache? if so, will be there
> > an option to have a table with arbitrary objects(pure KV case)?
> > 2. What options will Apache Ignite 3.0 have to define schema?
> SchemaBuilder
> > and SQL only? Is there an option to put the schema definition to the
> > configuration?(I really don't like this, I would prefer to have
> > separate scripts to create schemas)
> > 3. Is there a way to change field type? if yes, can it be done in
> runtime?
> > 4. Looks like BinaryMarshaller is going to be re-worked too, is there any
> > IEP for this?
> > 5. I don't like automatic schema evaluation when a new field is added
> > automatically on record put, so is there a way to prohibit this behavior?
> >  I think all schema changes should be done only explicitly except initial
> > schema creation.
> >
> > Thanks,
> > Mike.
> >
> > пн, 21 дек. 2020 г. в 06:40, Andrey Mashenkov <
> andrey.mashen...@gmail.com
> > >:
> >
> > > Hi, Igniters.
> > >
> > > We all know that the 

Re: IEP-54: Schema-first approach for 3.0

2020-12-29 Thread Ilya Kasnacheev
Hello!

Is there a way to define a schema via the ignitectl utility and its
hierarchical properties? What would it look like?

Regards,
-- 
Ilya Kasnacheev


пн, 21 дек. 2020 г. в 17:40, Andrey Mashenkov :

> Hi, Igniters.
>
> We all know that the current QueryEntity API is not convenient and needs to
> be reworked.
> So, I'm glad to share PR [1] with schema configuration public API for
> Ignite 3.0.
>
> New schema configuration uses Builder pattern, which looks more comfortable
> to use.
>
> In the PR you will find a 'schema' package with the API itself, and a draft
> implementation in 'internal' sub-package,
> and a test that demonstrates how the API could be used.
>
> Please note:
>
> * Entrypoint is 'SchemaBuilders' class with static factory methods.
> * The implementation is decoupled and can be easily extracted to separate
> module if we decide to do so.
> * Some columns types (e.g. Date/Time) are missed, they will be added lately
> in separate tickes.
> * Index configuration extends marker interface that makes possible to
> implement indexes of new types in plugins.
> Hopfully, we could add a persistent geo-indices support in future.
> * Supposedly, current table schema can be changed via builder-like
> structure as it is done if JOOQ project. See 'TableModificationBuilder' for
> details.
> I'm not sure 'SchemaTable' should have 'toBuilder()' converter for that
> purpose as it is a Schema Manager responsibility to create mutator objects
> from the current schema,
> but implementing the Schema manager is out of scope and will be designed
> within the next task.
> * Interfaces implementations are out of scope. I did not intend to merge
> them right now, but for test/demostration purposes.
>
> It is NOT the final version and some may be changed before the first
> release of course.
> For now, we have to agree if we can proceed with this approach or some
> issues should be resolved at first.
>
> Any thoughts or objections?
> Are interfaces good enough to be merged within the current ticket?
>
>
> https://issues.apache.org/jira/browse/IGNITE-13748
>
> On Thu, Nov 26, 2020 at 2:33 PM Юрий  wrote:
>
> > A little bit my thoughts about unsigned types:
> >
> > 1. Seems we may support unsign types
> > 2. It requires adding new types to the internal representation, protocol,
> > e.t.c.
> > 3. internal representation should be the same as we keep sign types. So
> it
> > will not requires more memory
> > 4. User should be aware of specifics such types for platforms which not
> > support unsigned types. For example, a user could derive -6 value in Java
> > for 250 unsigned byte value (from bits perspective will be right). I
> think
> > We shouldn't use more wide type for such cases, especially it will be bad
> > for unsigned long when we require returns BigInteger type.
> > 5. Possible it requires some suffix/preffix for new types like a '250u' -
> > it means that 250 is an unsigned value type.
> > 6. It requires a little bit more expensive comparison logic for indexes
> > 7. It requires new comparison logic for expressions. I think it not
> > possible for the current H2 engine and probably possible for the new
> > Calcite engine. Need clarification from anybody who involved in this part
> >
> > WDYT?
> >
> > вт, 24 нояб. 2020 г. в 18:36, Alexey Goncharuk <
> alexey.goncha...@gmail.com
> > >:
> >
> > > Actually, we can support comparisons in 3.0: once we the actual type
> > > information, we can make proper runtime adjustments and conversions to
> > > treat those values as unsigned - it will be just a bit more expensive.
> > >
> > > вт, 24 нояб. 2020 г. в 18:32, Pavel Tupitsyn :
> > >
> > > > > SQL range queries it will break
> > > > > WHERE x > y may return wrong results
> > > >
> > > > Yes, range queries, inequality comparisons and so on are broken
> > > > for unsigned data types, I think I mentioned this somewhere above.
> > > >
> > > > Again, in my opinion, we can document that SQL is not supported on
> > those
> > > > types,
> > > > end of story.
> > > >
> > > > On Tue, Nov 24, 2020 at 6:25 PM Alexey Goncharuk <
> > > > alexey.goncha...@gmail.com>
> > > > wrote:
> > > >
> > > > > Folks, I think this is a reasonable request. I thought about this
> > when
> > > I
> > > > > was drafting the IEP, but hesitated to add these types right away.
> > > > >
> > > > > > That is how it works in Ignite since the beginning with .NET and
> > C++
> > > :)
> > > > > I have some doubts that it actually works as expected, it needs
> some
> > > > > checking (will be glad if my concerns are false):
> > > > >
> > > > >- It's true that equality check works properly, but for SQL
> range
> > > > >queries it will break unless some special care is taken on Java
> > > side:
> > > > > for
> > > > >u8 255 > 10, but in Java (byte)255 will be converted to -1,
> which
> > > will
> > > > >break the comparison. Since we don't have unsigned types now, I
> > > doubt
> > > > it
> > > > >works.
> > > > >- There is an obvious 

Re: IEP-54: Schema-first approach for 3.0

2020-12-29 Thread Ilya Kazakov
Hello, Andrei!

I have read this thread and PR. The builder idea and API look good. But I
have some questions.

1.
In SchemaTableBuilder:
When I use a builder in chain style, I expect the same result type at every
step. And as I see, this idea is implemented everywhere except pk(), where
you return PrimryKeyBuilder. As for me, it would be more intuitive if we
used something like this:

SchemaTableBuilder.pk(PrimryKeyBuilder pkb)

2.
In TableModificationBuilder:
I have the same question. Some methods return TableModificationBuilder,
but alretColumn returns AlterColumnBuilder. As for me, it would be more
intuitive if we used something like

TableModificationBuilder.alretColumn(AlterColumnBuilder acb)

But in general, these two points are only a matter of taste.

3.
In your examples, I can't understand how we can alter some table, which we
have already created previously (how we can get its SchemaTable object).
But maybe this question is out of scope?

--
Thanks
Ilya Kazakov

вт, 29 дек. 2020 г. в 15:22, Michael Cherkasov :

> Hi all,
>
> I reviewed the mail thread and proposal page and I still don't fully
> understand what is going to be changed, I would really appreciate it if you
> will answer a few questions:
>
> 1. Are you going to leave only one schema per cache? if so, will be there
> an option to have a table with arbitrary objects(pure KV case)?
> 2. What options will Apache Ignite 3.0 have to define schema? SchemaBuilder
> and SQL only? Is there an option to put the schema definition to the
> configuration?(I really don't like this, I would prefer to have
> separate scripts to create schemas)
> 3. Is there a way to change field type? if yes, can it be done in runtime?
> 4. Looks like BinaryMarshaller is going to be re-worked too, is there any
> IEP for this?
> 5. I don't like automatic schema evaluation when a new field is added
> automatically on record put, so is there a way to prohibit this behavior?
>  I think all schema changes should be done only explicitly except initial
> schema creation.
>
> Thanks,
> Mike.
>
> пн, 21 дек. 2020 г. в 06:40, Andrey Mashenkov  >:
>
> > Hi, Igniters.
> >
> > We all know that the current QueryEntity API is not convenient and needs
> to
> > be reworked.
> > So, I'm glad to share PR [1] with schema configuration public API for
> > Ignite 3.0.
> >
> > New schema configuration uses Builder pattern, which looks more
> comfortable
> > to use.
> >
> > In the PR you will find a 'schema' package with the API itself, and a
> draft
> > implementation in 'internal' sub-package,
> > and a test that demonstrates how the API could be used.
> >
> > Please note:
> >
> > * Entrypoint is 'SchemaBuilders' class with static factory methods.
> > * The implementation is decoupled and can be easily extracted to separate
> > module if we decide to do so.
> > * Some columns types (e.g. Date/Time) are missed, they will be added
> lately
> > in separate tickes.
> > * Index configuration extends marker interface that makes possible to
> > implement indexes of new types in plugins.
> > Hopfully, we could add a persistent geo-indices support in future.
> > * Supposedly, current table schema can be changed via builder-like
> > structure as it is done if JOOQ project. See 'TableModificationBuilder'
> for
> > details.
> > I'm not sure 'SchemaTable' should have 'toBuilder()' converter for that
> > purpose as it is a Schema Manager responsibility to create mutator
> objects
> > from the current schema,
> > but implementing the Schema manager is out of scope and will be designed
> > within the next task.
> > * Interfaces implementations are out of scope. I did not intend to merge
> > them right now, but for test/demostration purposes.
> >
> > It is NOT the final version and some may be changed before the first
> > release of course.
> > For now, we have to agree if we can proceed with this approach or some
> > issues should be resolved at first.
> >
> > Any thoughts or objections?
> > Are interfaces good enough to be merged within the current ticket?
> >
> >
> > https://issues.apache.org/jira/browse/IGNITE-13748
> >
> > On Thu, Nov 26, 2020 at 2:33 PM Юрий 
> wrote:
> >
> > > A little bit my thoughts about unsigned types:
> > >
> > > 1. Seems we may support unsign types
> > > 2. It requires adding new types to the internal representation,
> protocol,
> > > e.t.c.
> > > 3. internal representation should be the same as we keep sign types. So
> > it
> > > will not requires more memory
> > > 4. User should be aware of specifics such types for platforms which not
> > > support unsigned types. For example, a user could derive -6 value in
> Java
> > > for 250 unsigned byte value (from bits perspective will be right). I
> > think
> > > We shouldn't use more wide type for such cases, especially it will be
> bad
> > > for unsigned long when we require returns BigInteger type.
> > > 5. Possible it requires some suffix/preffix for new types like a
> '250u' -
> > > it means that 250 is an 

Re: IEP-54: Schema-first approach for 3.0

2020-12-28 Thread Michael Cherkasov
Hi all,

I reviewed the mail thread and proposal page and I still don't fully
understand what is going to be changed. I would really appreciate it if you
could answer a few questions:

1. Are you going to leave only one schema per cache? If so, will there be
an option to have a table with arbitrary objects (pure KV case)?
2. What options will Apache Ignite 3.0 have to define schema? SchemaBuilder
and SQL only? Is there an option to put the schema definition to the
configuration?(I really don't like this, I would prefer to have
separate scripts to create schemas)
3. Is there a way to change field type? if yes, can it be done in runtime?
4. Looks like BinaryMarshaller is going to be re-worked too, is there any
IEP for this?
5. I don't like automatic schema evolution when a new field is added
automatically on record put, so is there a way to prohibit this behavior?
I think all schema changes should be done only explicitly, except initial
schema creation.

Thanks,
Mike.

пн, 21 дек. 2020 г. в 06:40, Andrey Mashenkov :

> Hi, Igniters.
>
> We all know that the current QueryEntity API is not convenient and needs to
> be reworked.
> So, I'm glad to share PR [1] with schema configuration public API for
> Ignite 3.0.
>
> New schema configuration uses Builder pattern, which looks more comfortable
> to use.
>
> In the PR you will find a 'schema' package with the API itself, and a draft
> implementation in 'internal' sub-package,
> and a test that demonstrates how the API could be used.
>
> Please note:
>
> * Entrypoint is 'SchemaBuilders' class with static factory methods.
> * The implementation is decoupled and can be easily extracted to separate
> module if we decide to do so.
> * Some columns types (e.g. Date/Time) are missed, they will be added lately
> in separate tickes.
> * Index configuration extends marker interface that makes possible to
> implement indexes of new types in plugins.
> Hopfully, we could add a persistent geo-indices support in future.
> * Supposedly, current table schema can be changed via builder-like
> structure as it is done if JOOQ project. See 'TableModificationBuilder' for
> details.
> I'm not sure 'SchemaTable' should have 'toBuilder()' converter for that
> purpose as it is a Schema Manager responsibility to create mutator objects
> from the current schema,
> but implementing the Schema manager is out of scope and will be designed
> within the next task.
> * Interfaces implementations are out of scope. I did not intend to merge
> them right now, but for test/demostration purposes.
>
> It is NOT the final version and some may be changed before the first
> release of course.
> For now, we have to agree if we can proceed with this approach or some
> issues should be resolved at first.
>
> Any thoughts or objections?
> Are interfaces good enough to be merged within the current ticket?
>
>
> https://issues.apache.org/jira/browse/IGNITE-13748
>
> On Thu, Nov 26, 2020 at 2:33 PM Юрий  wrote:
>
> > A little bit my thoughts about unsigned types:
> >
> > 1. Seems we may support unsign types
> > 2. It requires adding new types to the internal representation, protocol,
> > e.t.c.
> > 3. internal representation should be the same as we keep sign types. So
> it
> > will not requires more memory
> > 4. User should be aware of specifics such types for platforms which not
> > support unsigned types. For example, a user could derive -6 value in Java
> > for 250 unsigned byte value (from bits perspective will be right). I
> think
> > We shouldn't use more wide type for such cases, especially it will be bad
> > for unsigned long when we require returns BigInteger type.
> > 5. Possible it requires some suffix/preffix for new types like a '250u' -
> > it means that 250 is an unsigned value type.
> > 6. It requires a little bit more expensive comparison logic for indexes
> > 7. It requires new comparison logic for expressions. I think it not
> > possible for the current H2 engine and probably possible for the new
> > Calcite engine. Need clarification from anybody who involved in this part
> >
> > WDYT?
> >
> > вт, 24 нояб. 2020 г. в 18:36, Alexey Goncharuk <
> alexey.goncha...@gmail.com
> > >:
> >
> > > Actually, we can support comparisons in 3.0: once we the actual type
> > > information, we can make proper runtime adjustments and conversions to
> > > treat those values as unsigned - it will be just a bit more expensive.
> > >
> > > вт, 24 нояб. 2020 г. в 18:32, Pavel Tupitsyn :
> > >
> > > > > SQL range queries it will break
> > > > > WHERE x > y may return wrong results
> > > >
> > > > Yes, range queries, inequality comparisons and so on are broken
> > > > for unsigned data types, I think I mentioned this somewhere above.
> > > >
> > > > Again, in my opinion, we can document that SQL is not supported on
> > those
> > > > types,
> > > > end of story.
> > > >
> > > > On Tue, Nov 24, 2020 at 6:25 PM Alexey Goncharuk <
> > > > alexey.goncha...@gmail.com>
> > > > wrote:
> > > >
> > > > > Folks, I think this 

Re: IEP-54: Schema-first approach for 3.0

2020-12-21 Thread Andrey Mashenkov
Hi, Igniters.

We all know that the current QueryEntity API is not convenient and needs to
be reworked.
So, I'm glad to share PR [1] with schema configuration public API for
Ignite 3.0.

The new schema configuration uses the Builder pattern, which looks more
convenient to use.

In the PR you will find a 'schema' package with the API itself, and a draft
implementation in 'internal' sub-package,
and a test that demonstrates how the API could be used.

Please note:

* The entry point is the 'SchemaBuilders' class with static factory methods.
* The implementation is decoupled and can be easily extracted to a separate
module if we decide to do so.
* Some column types (e.g. Date/Time) are missing; they will be added later
in separate tickets.
* Index configuration extends a marker interface, which makes it possible to
implement indexes of new types in plugins.
Hopefully, we could add persistent geo-index support in the future.
* Supposedly, the current table schema can be changed via a builder-like
structure, as is done in the JOOQ project. See 'TableModificationBuilder' for
details.
I'm not sure 'SchemaTable' should have a 'toBuilder()' converter for that
purpose, as it is a Schema Manager responsibility to create mutator objects
from the current schema,
but implementing the Schema Manager is out of scope and will be designed
within the next task.
* Interface implementations are out of scope. I do not intend to merge
them right now; they are there only for test/demonstration purposes.

It is NOT the final version and some things may change before the first
release, of course.
For now, we have to agree whether we can proceed with this approach or
whether some issues should be resolved first.

Any thoughts or objections?
Are interfaces good enough to be merged within the current ticket?


[1] https://issues.apache.org/jira/browse/IGNITE-13748

On Thu, Nov 26, 2020 at 2:33 PM Юрий  wrote:

> A little bit my thoughts about unsigned types:
>
> 1. Seems we may support unsign types
> 2. It requires adding new types to the internal representation, protocol,
> e.t.c.
> 3. internal representation should be the same as we keep sign types. So it
> will not requires more memory
> 4. User should be aware of specifics such types for platforms which not
> support unsigned types. For example, a user could derive -6 value in Java
> for 250 unsigned byte value (from bits perspective will be right). I think
> We shouldn't use more wide type for such cases, especially it will be bad
> for unsigned long when we require returns BigInteger type.
> 5. Possible it requires some suffix/preffix for new types like a '250u' -
> it means that 250 is an unsigned value type.
> 6. It requires a little bit more expensive comparison logic for indexes
> 7. It requires new comparison logic for expressions. I think it not
> possible for the current H2 engine and probably possible for the new
> Calcite engine. Need clarification from anybody who involved in this part
>
> WDYT?
>
> вт, 24 нояб. 2020 г. в 18:36, Alexey Goncharuk  >:
>
> > Actually, we can support comparisons in 3.0: once we the actual type
> > information, we can make proper runtime adjustments and conversions to
> > treat those values as unsigned - it will be just a bit more expensive.
> >
> > вт, 24 нояб. 2020 г. в 18:32, Pavel Tupitsyn :
> >
> > > > SQL range queries it will break
> > > > WHERE x > y may return wrong results
> > >
> > > Yes, range queries, inequality comparisons and so on are broken
> > > for unsigned data types, I think I mentioned this somewhere above.
> > >
> > > Again, in my opinion, we can document that SQL is not supported on
> those
> > > types,
> > > end of story.
> > >
> > > On Tue, Nov 24, 2020 at 6:25 PM Alexey Goncharuk <
> > > alexey.goncha...@gmail.com>
> > > wrote:
> > >
> > > > Folks, I think this is a reasonable request. I thought about this
> when
> > I
> > > > was drafting the IEP, but hesitated to add these types right away.
> > > >
> > > > > That is how it works in Ignite since the beginning with .NET and
> C++
> > :)
> > > > I have some doubts that it actually works as expected, it needs some
> > > > checking (will be glad if my concerns are false):
> > > >
> > > >- It's true that equality check works properly, but for SQL range
> > > >queries it will break unless some special care is taken on Java
> > side:
> > > > for
> > > >u8 255 > 10, but in Java (byte)255 will be converted to -1, which
> > will
> > > >break the comparison. Since we don't have unsigned types now, I
> > doubt
> > > it
> > > >works.
> > > >- There is an obvious cross-platform data loss when "intuitive"
> type
> > > >mapping is used by a user (u8 corresponds to byte type in .NET,
> but
> > to
> > > >avoid values loss, a user will have to use short type in Java, and
> > > > Ignite
> > > >will also need to take care of the range check during
> > serialization).
> > > I
> > > >think we can even allow to try to deserialize a value into
> arbitrary
> > > > type,
> > > >but throw an exception if the range 

Re: IEP-54: Schema-first approach for 3.0

2020-11-26 Thread Юрий
A few thoughts on unsigned types:

1. It seems we can support unsigned types.
2. This requires adding new types to the internal representation, the protocol,
etc.
3. The internal representation should be the same as for signed types, so it
will not require more memory.
4. Users should be aware of the specifics of such types on platforms that do not
support unsigned types. For example, a user would get -6 in Java for the
unsigned byte value 250 (which is correct from the bit perspective). I think
we shouldn't use a wider type in such cases; it would be especially bad
for unsigned long, where we would have to return BigInteger.
5. It may require some suffix/prefix for the new types, like '250u',
meaning that 250 is an unsigned value.
6. It requires slightly more expensive comparison logic for indexes.
7. It requires new comparison logic for expressions. I think this is not
possible with the current H2 engine and is probably possible with the new
Calcite engine. We need clarification from someone involved in this part.

WDYT?
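
To illustrate point 4, here is a minimal Java sketch of how the same 8 bits read as -6 when treated as a signed byte and as 250 when reinterpreted as unsigned. The class name UnsignedByteView and its helper are hypothetical and only serve the illustration; Byte.toUnsignedInt is a standard JDK method.

```java
final class UnsignedByteView {
    /** Reinterprets the stored bits as an unsigned value in the range 0..255. */
    static int asUnsigned(byte stored) {
        return Byte.toUnsignedInt(stored);
    }

    public static void main(String[] args) {
        byte stored = (byte) 250;                // same 8 bits as the unsigned value 250
        System.out.println(stored);              // prints -6
        System.out.println(asUnsigned(stored));  // prints 250
    }
}
```

No extra memory is needed for storage; only the reading side decides how to interpret the bits.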

On Tue, Nov 24, 2020 at 6:36 PM Alexey Goncharuk wrote:

> Actually, we can support comparisons in 3.0: once we the actual type
> information, we can make proper runtime adjustments and conversions to
> treat those values as unsigned - it will be just a bit more expensive.
>
> On Tue, Nov 24, 2020 at 6:32 PM Pavel Tupitsyn wrote:
>
> > > SQL range queries it will break
> > > WHERE x > y may return wrong results
> >
> > Yes, range queries, inequality comparisons and so on are broken
> > for unsigned data types, I think I mentioned this somewhere above.
> >
> > Again, in my opinion, we can document that SQL is not supported on those
> > types,
> > end of story.
> >
> > On Tue, Nov 24, 2020 at 6:25 PM Alexey Goncharuk <
> > alexey.goncha...@gmail.com>
> > wrote:
> >
> > > Folks, I think this is a reasonable request. I thought about this when
> I
> > > was drafting the IEP, but hesitated to add these types right away.
> > >
> > > > That is how it works in Ignite since the beginning with .NET and C++
> :)
> > > I have some doubts that it actually works as expected, it needs some
> > > checking (will be glad if my concerns are false):
> > >
> > >- It's true that equality check works properly, but for SQL range
> > >queries it will break unless some special care is taken on Java
> side:
> > > for
> > >u8 255 > 10, but in Java (byte)255 will be converted to -1, which
> will
> > >break the comparison. Since we don't have unsigned types now, I
> doubt
> > it
> > >works.
> > >- There is an obvious cross-platform data loss when "intuitive" type
> > >mapping is used by a user (u8 corresponds to byte type in .NET, but
> to
> > >avoid values loss, a user will have to use short type in Java, and
> > > Ignite
> > >will also need to take care of the range check during
> serialization).
> > I
> > >think we can even allow to try to deserialize a value into arbitrary
> > > type,
> > >but throw an exception if the range is out of bounds.
> > >
> > > Overall, I agree with Andrey's comments.
> > > Andrey, do you mind updating the IEP once all the details are settled
> > here?
> > >
> > > On Tue, Nov 24, 2020 at 6:19 PM Andrey Mashenkov <andrey.mashen...@gmail.com> wrote:
> > >
> > > > Pavel,
> > > >
> > > > I believe uLong values beyond 2^63 can't be treated correctly for now
> > > > (WHERE x > y may return wrong results)
> > > >
> > > > I think we could make "true" support for unsigned types, but they
> will
> > > have
> > > > limitations on the Java side.
> > > > Thus, the one will not be able to map uint64 to Java long primitive,
> > but
> > > to
> > > > BigInteger only.
> > > > As for indices, we could read uint64 to Java long, but treat negative
> > > > values in a different way to preserve correct ordering.
> > > >
> > > > These limitations will affect only mixed environments when .Net and
> > Java
> > > > used to access the data.
> > > > Will this solution address your issues?
> > > >
> > > >
> > > > On Tue, Nov 24, 2020 at 5:45 PM Pavel Tupitsyn  >
> > > > wrote:
> > > >
> > > > > > That way is impossible.
> > > > >
> > > > > That is how it works in Ignite since the beginning with .NET and
> C++
> > :)
> > > > > You can use unsigned primitives as cache keys and values, as fields
> > and
> > > > > properties,
> > > > > and in SQL queries (even in WHERE x=y clauses) - it works
> > transparently
> > > > for
> > > > > the users.
> > > > > Java side knows nothing and treats those values as corresponding
> > signed
> > > > > types.
> > > > >
> > > > > However, this abstraction leaks in some cases only because there
> are
> > no
> > > > > corresponding type ids.
> > > > > That is why I'm proposing a very simple change to the protocol -
> add
> > > type
> > > > > ids, but handle them the same way as signed counterparts.
> > > > >
> > > > >
> > > > > On Tue, Nov 24, 2020 at 5:00 PM Andrey Mashenkov <
> > > > > andrey.mashen...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > 

Re: IEP-54: Schema-first approach for 3.0

2020-11-24 Thread Denis Magda
Andrey,

Understood, thanks. I agree with the idea of creating a task for GraalVM
support. I just want to be sure there won't be any fundamental limitations in
the new serialization protocol that would make it hard or impossible to
generate native images. We should probably specify the requirements: at
least, native image support should exist for thin clients. Thick clients and
servers are optional. What do you think?

-
Denis


On Tue, Nov 24, 2020 at 1:28 AM Andrey Mashenkov 
wrote:

> Denis,
>
> Good point. Both serializers use reflection API.
> However, we will allow users to configure static schema along with 'strict'
> schema mode, we still need to validate user classes on client nodes against
> the latest schema in the grid  and reflection API is the only way to do it.
> One can find a few articles on the internet on how to enable reflection in
> GraalVM.
>
> I'll create a task for supporting GraalVM, and maybe someone who is
> familiar with GraalVM will suggest a solution or a proper workaround. Or
> I'll do it a bit later.
> If no workaround is found, we could allow users to write their own
> serializer, but I don't think it is a good idea to expose any internal
> classes to the public.
>
> On Tue, Nov 24, 2020 at 2:55 AM Denis Magda  wrote:
>
> > Andrey, thanks for the update,
> >
> > Does any of the serializers take into consideration the
> > native-image-generation feature of GraalVM?
> > https://www.graalvm.org/reference-manual/native-image/
> >
> > With the current binary marshaller, we can't even generate a native image
> > for the code using our thin client APIs.
> >
> > -
> > Denis
> >
> >
> > On Mon, Nov 23, 2020 at 4:39 AM Andrey Mashenkov <
> > andrey.mashen...@gmail.com>
> > wrote:
> >
> > > Hi Igniters,
> > >
> > > I'd like to continue discussion of IEP-54 (Schema-first approach).
> > >
> > > Hope everyone who is interested had a chance to get familiar with the
> > > proposal [1].
> > > Please, do not hesitate to ask questions and share your ideas.
> > >
> > > I've prepared a prototype of serializer [2] for the data layout
> described
> > > in the proposal.
> > > In the prototype, I compared 2 approaches to (de)serialize objects, the first
> > one
> > > uses java reflection/unsafe API and similar to one we already use in
> > Ignite
> > > and the second one generates serializer for particular user class and
> > uses
> > > Janino library for compilation.
> > > Second one shows better results in benchmarks.
> > > I think we can go with it as default serializer and have
> reflection-based
> > > implementation as a fallback if someone will have issues with the first
> > > one.
> > > WDYT?
> > >
> > > There are a number of tasks under the umbrella ticket [3] waiting for
> the
> > > assignee.
> > >
> > > BTW, I'm going to create more tickets for schema manager modes
> > > implementation, but would like to clarify some details.
> > >
> > > I thought schemaManager on each node should hold:
> > >   1. Local mapping of "schema version" <--> validated local key/value
> > > classes pair.
> > >   2. Cluster-wide schema changes history.
> > > On the client side. Before any key-value API operation we should
> > validate a
> > > schema for a given key-value pair.
> > > If there is no local-mapping exists for a given key-value pair or if a
> > > cluster wide schema has a more recent version then the key-value pair
> > > should be validated against the latest version and local mapping should
> > be
> > > updated/actualized.
> > > If an object doesn't fit to the latest schema then it depends on schema
> > > mode: either fail the operation ('strict' mode) or a new mapping should
> > be
> > > created and a new schema version should be propagated to the cluster.
> > >
> > > On the server side we usually have no key-value classes and we operate
> > with
> > > tuples.
> > > As schema change history is available and a tuple has schema version,
> > then
> > > it is possible to upgrade any received tuple to the last version
> without
> > > deserialization.
> > > Thus we could allow nodes to send key-value pairs of previous versions
> > (if
> > > they didn't receive a schema update yet) without reverting schema
> changes
> > > made by a node with newer classes.
> > >
> > > Alex, Val, Ivan did you mean the same?
> > >
> > >
> > > [1]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
> > > [2] https://github.com/apache/ignite/tree/ignite-13618/modules/commons
> > > [3] https://issues.apache.org/jira/browse/IGNITE-13616
> > >
> > > On Thu, Sep 17, 2020 at 9:21 AM Ivan Pavlukhin 
> > > wrote:
> > >
> > > > Folks,
> > > >
> > > > Please do not ignore history. We had a thread [1] with many bright
> > > > ideas. We can resume it.
> > > >
> > > > [1]
> > > >
> > >
> >
> http://apache-ignite-developers.2346864.n4.nabble.com/Applicability-of-term-cache-to-Apache-Ignite-td36541.html
> > > >
> > > > 2020-09-10 0:08 GMT+03:00, Denis Magda :
> > > > > Val, makes sense, thanks for explaining.

Re: IEP-54: Schema-first approach for 3.0

2020-11-24 Thread Alexey Goncharuk
Actually, we can support comparisons in 3.0: once we have the actual type
information, we can make proper runtime adjustments and conversions to
treat those values as unsigned - it will just be a bit more expensive.
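
As a rough sketch of what such a runtime adjustment could look like on the Java side (an assumption on my part, not the actual Ignite implementation), a comparator can reinterpret the stored bits as unsigned before comparing. The class and method names below are hypothetical; Byte.toUnsignedInt and Long.compareUnsigned are standard JDK methods.

```java
final class UnsignedCompare {
    /** Compares two u8 values stored in Java bytes by widening them to 0..255 first. */
    static int compareU8(byte a, byte b) {
        return Integer.compare(Byte.toUnsignedInt(a), Byte.toUnsignedInt(b));
    }

    /** Compares two u64 values stored in Java longs without allocating a BigInteger. */
    static int compareU64(long a, long b) {
        return Long.compareUnsigned(a, b);
    }

    public static void main(String[] args) {
        byte a = (byte) 255;
        byte b = (byte) 10;
        System.out.println(Byte.compare(a, b) < 0);      // prints true: signed view, -1 < 10
        System.out.println(compareU8(a, b) > 0);         // prints true: unsigned view, 255 > 10
        System.out.println(compareU64(-1L, 1L) > 0);     // prints true: all-ones bits are the largest u64
    }
}
```

The extra cost is a widening conversion per comparison, which matches the "a bit more expensive" expectation.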

On Tue, Nov 24, 2020 at 6:32 PM Pavel Tupitsyn wrote:

> > SQL range queries it will break
> > WHERE x > y may return wrong results
>
> Yes, range queries, inequality comparisons and so on are broken
> for unsigned data types, I think I mentioned this somewhere above.
>
> Again, in my opinion, we can document that SQL is not supported on those
> types,
> end of story.
>
> On Tue, Nov 24, 2020 at 6:25 PM Alexey Goncharuk <
> alexey.goncha...@gmail.com>
> wrote:
>
> > Folks, I think this is a reasonable request. I thought about this when I
> > was drafting the IEP, but hesitated to add these types right away.
> >
> > > That is how it works in Ignite since the beginning with .NET and C++ :)
> > I have some doubts that it actually works as expected, it needs some
> > checking (will be glad if my concerns are false):
> >
> >- It's true that equality check works properly, but for SQL range
> >queries it will break unless some special care is taken on Java side:
> > for
> >u8 255 > 10, but in Java (byte)255 will be converted to -1, which will
> >break the comparison. Since we don't have unsigned types now, I doubt
> it
> >works.
> >- There is an obvious cross-platform data loss when "intuitive" type
> >mapping is used by a user (u8 corresponds to byte type in .NET, but to
> >avoid values loss, a user will have to use short type in Java, and
> > Ignite
> >will also need to take care of the range check during serialization).
> I
> >think we can even allow to try to deserialize a value into arbitrary
> > type,
> >but throw an exception if the range is out of bounds.
> >
> > Overall, I agree with Andrey's comments.
> > Andrey, do you mind updating the IEP once all the details are settled
> here?
> >
> > On Tue, Nov 24, 2020 at 6:19 PM Andrey Mashenkov <andrey.mashen...@gmail.com> wrote:
> >
> > > Pavel,
> > >
> > > I believe uLong values beyond 2^63 can't be treated correctly for now
> > > (WHERE x > y may return wrong results)
> > >
> > > I think we could make "true" support for unsigned types, but they will
> > have
> > > limitations on the Java side.
> > > Thus, the one will not be able to map uint64 to Java long primitive,
> but
> > to
> > > BigInteger only.
> > > As for indices, we could read uint64 to Java long, but treat negative
> > > values in a different way to preserve correct ordering.
> > >
> > > These limitations will affect only mixed environments when .Net and
> Java
> > > used to access the data.
> > > Will this solution address your issues?
> > >
> > >
> > > On Tue, Nov 24, 2020 at 5:45 PM Pavel Tupitsyn 
> > > wrote:
> > >
> > > > > That way is impossible.
> > > >
> > > > That is how it works in Ignite since the beginning with .NET and C++
> :)
> > > > You can use unsigned primitives as cache keys and values, as fields
> and
> > > > properties,
> > > > and in SQL queries (even in WHERE x=y clauses) - it works
> transparently
> > > for
> > > > the users.
> > > > Java side knows nothing and treats those values as corresponding
> signed
> > > > types.
> > > >
> > > > However, this abstraction leaks in some cases only because there are
> no
> > > > corresponding type ids.
> > > > That is why I'm proposing a very simple change to the protocol - add
> > type
> > > > ids, but handle them the same way as signed counterparts.
> > > >
> > > >
> > > > On Tue, Nov 24, 2020 at 5:00 PM Andrey Mashenkov <
> > > > andrey.mashen...@gmail.com>
> > > > wrote:
> > > >
> > > > > Pavel,
> > > > >
> > > > > - Treat uLong as long in Java (bitwise representation is the same)
> > > > >
> > > > > That way is impossible.
> > > > >
> > > > > Assume, you have a .NET class with a uByte field and map it to
> > 'uint8'
> > > > > column.
> > > > > Then you set the field value to "250" and put the object into a
> > table,
> > > > > field value perfectly fits to a single byte 'int8' column.
> > > > > But in Java you can't deserialize it to directly the Java object
> > field
> > > of
> > > > > 'byte' type, so we should map uint8 type to Java 'short' type
> > > > > because the one expected to see "250" as a value which doesn't fit
> to
> > > the
> > > > > signed type.
> > > > > For uLong the one will need a BigInteger field in Java.
> > > > >
> > > > > SQL index either can't treat column value as Java 'byte' as is,
> > because
> > > > > after reading you will get a negative value, so it should be cast
> to
> > > > short
> > > > > at first. (converted to BigInteger for uint64)
> > > > > So, index on signed type will require a different comparator.
> > > > >
> > > > > That way doesn't look simpler.
> > > > >
> > > > > On Tue, Nov 24, 2020 at 4:23 PM Pavel Tupitsyn <
> ptupit...@apache.org
> > >
> > > > > wrote:
> > > > >
> > > > > > Andrey,
> > > > > >
> > > > > > I don't think range 

Re: IEP-54: Schema-first approach for 3.0

2020-11-24 Thread Pavel Tupitsyn
> SQL range queries it will break
> WHERE x > y may return wrong results

Yes, range queries, inequality comparisons and so on are broken
for unsigned data types, I think I mentioned this somewhere above.

Again, in my opinion, we can document that SQL is not supported on those
types, end of story.

On Tue, Nov 24, 2020 at 6:25 PM Alexey Goncharuk 
wrote:

> Folks, I think this is a reasonable request. I thought about this when I
> was drafting the IEP, but hesitated to add these types right away.
>
> > That is how it works in Ignite since the beginning with .NET and C++ :)
> I have some doubts that it actually works as expected, it needs some
> checking (will be glad if my concerns are false):
>
>- It's true that equality check works properly, but for SQL range
>queries it will break unless some special care is taken on Java side:
> for
>u8 255 > 10, but in Java (byte)255 will be converted to -1, which will
>break the comparison. Since we don't have unsigned types now, I doubt it
>works.
>- There is an obvious cross-platform data loss when "intuitive" type
>mapping is used by a user (u8 corresponds to byte type in .NET, but to
>avoid values loss, a user will have to use short type in Java, and
> Ignite
>will also need to take care of the range check during serialization). I
>think we can even allow to try to deserialize a value into arbitrary
> type,
>but throw an exception if the range is out of bounds.
>
> Overall, I agree with Andrey's comments.
> Andrey, do you mind updating the IEP once all the details are settled here?
>
> On Tue, Nov 24, 2020 at 6:19 PM Andrey Mashenkov wrote:
>
> > Pavel,
> >
> > I believe uLong values beyond 2^63 can't be treated correctly for now
> > (WHERE x > y may return wrong results)
> >
> > I think we could make "true" support for unsigned types, but they will
> have
> > limitations on the Java side.
> > Thus, the one will not be able to map uint64 to Java long primitive, but
> to
> > BigInteger only.
> > As for indices, we could read uint64 to Java long, but treat negative
> > values in a different way to preserve correct ordering.
> >
> > These limitations will affect only mixed environments when .Net and Java
> > used to access the data.
> > Will this solution address your issues?
> >
> >
> > On Tue, Nov 24, 2020 at 5:45 PM Pavel Tupitsyn 
> > wrote:
> >
> > > > That way is impossible.
> > >
> > > That is how it works in Ignite since the beginning with .NET and C++ :)
> > > You can use unsigned primitives as cache keys and values, as fields and
> > > properties,
> > > and in SQL queries (even in WHERE x=y clauses) - it works transparently
> > for
> > > the users.
> > > Java side knows nothing and treats those values as corresponding signed
> > > types.
> > >
> > > However, this abstraction leaks in some cases only because there are no
> > > corresponding type ids.
> > > That is why I'm proposing a very simple change to the protocol - add
> type
> > > ids, but handle them the same way as signed counterparts.
> > >
> > >
> > > On Tue, Nov 24, 2020 at 5:00 PM Andrey Mashenkov <
> > > andrey.mashen...@gmail.com>
> > > wrote:
> > >
> > > > Pavel,
> > > >
> > > > - Treat uLong as long in Java (bitwise representation is the same)
> > > >
> > > > That way is impossible.
> > > >
> > > > Assume, you have a .NET class with a uByte field and map it to
> 'uint8'
> > > > column.
> > > > Then you set the field value to "250" and put the object into a
> table,
> > > > field value perfectly fits to a single byte 'int8' column.
> > > > But in Java you can't deserialize it to directly the Java object
> field
> > of
> > > > 'byte' type, so we should map uint8 type to Java 'short' type
> > > > because the one expected to see "250" as a value which doesn't fit to
> > the
> > > > signed type.
> > > > For uLong the one will need a BigInteger field in Java.
> > > >
> > > > SQL index either can't treat column value as Java 'byte' as is,
> because
> > > > after reading you will get a negative value, so it should be cast to
> > > short
> > > > at first. (converted to BigInteger for uint64)
> > > > So, index on signed type will require a different comparator.
> > > >
> > > > That way doesn't look simpler.
> > > >
> > > > On Tue, Nov 24, 2020 at 4:23 PM Pavel Tupitsyn  >
> > > > wrote:
> > > >
> > > > > Andrey,
> > > > >
> > > > > I don't think range narrowing is a good idea.
> > > > > Do you see any problems with the simple approach I described?
> > > > >
> > > > >
> > > > > On Tue, Nov 24, 2020 at 4:01 PM Andrey Mashenkov <
> > > > > andrey.mashen...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Pavel,
> > > > > >
> > > > > > If you are ok with narrowing range for unsigned types then we
> could
> > > > add a
> > > > > > constraint for unsigned types on schema level (like nullability
> > flag)
> > > > > > and treat them as signed types in storage.
> > > > > >
> > > > > > We are going with a separate storage type-system and binary
> > protocol
> > > > > > 

Re: IEP-54: Schema-first approach for 3.0

2020-11-24 Thread Alexey Goncharuk
Folks, I think this is a reasonable request. I thought about this when I
was drafting the IEP, but hesitated to add these types right away.

> That is how it works in Ignite since the beginning with .NET and C++ :)
I have some doubts that it actually works as expected; it needs some
checking (I will be glad if my concerns prove unfounded):

   - It's true that the equality check works properly, but SQL range
   queries will break unless some special care is taken on the Java side: for
   u8, 255 > 10, but in Java (byte)255 is converted to -1, which
   breaks the comparison. Since we don't have unsigned types now, I doubt it
   works.
   - There is an obvious cross-platform data loss when the "intuitive" type
   mapping is used (u8 corresponds to the byte type in .NET, but to
   avoid losing values, a user will have to use the short type in Java, and
   Ignite will also need to take care of the range check during serialization).
   I think we can even allow deserializing a value into an arbitrary type,
   but throw an exception if the value is out of range.
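
The range checks mentioned in the second point could look roughly like the sketch below. This is only an illustration under my own assumptions: the class name, method names and the choice of ArithmeticException are hypothetical and not part of the IEP.

```java
final class RangeChecked {
    /** Serializes a Java short into a u8 column; only 0..255 is representable. */
    static byte toU8(short value) {
        if (value < 0 || value > 0xFF) {
            throw new ArithmeticException("Value out of u8 range: " + value);
        }
        return (byte) value;
    }

    /** Reads a u8 column into a Java short; this widening always fits. */
    static short u8ToShort(byte stored) {
        return (short) Byte.toUnsignedInt(stored);
    }

    /** Reads a u8 column into a signed Java byte; fails for 128..255. */
    static byte u8ToSignedByte(byte stored) {
        int unsigned = Byte.toUnsignedInt(stored);
        if (unsigned > Byte.MAX_VALUE) {
            throw new ArithmeticException("u8 value does not fit into byte: " + unsigned);
        }
        return stored;
    }

    public static void main(String[] args) {
        byte stored = toU8((short) 250);
        System.out.println(u8ToShort(stored)); // prints 250
        try {
            u8ToSignedByte(stored);
        } catch (ArithmeticException e) {
            System.out.println(e.getMessage()); // prints "u8 value does not fit into byte: 250"
        }
    }
}
```

With this approach the "intuitive" byte mapping still works for small values and fails loudly instead of silently producing -6.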

Overall, I agree with Andrey's comments.
Andrey, do you mind updating the IEP once all the details are settled here?

On Tue, Nov 24, 2020 at 6:19 PM Andrey Mashenkov wrote:

> Pavel,
>
> I believe uLong values beyond 2^63 can't be treated correctly for now
> (WHERE x > y may return wrong results)
>
> I think we could make "true" support for unsigned types, but they will have
> limitations on the Java side.
> Thus, the one will not be able to map uint64 to Java long primitive, but to
> BigInteger only.
> As for indices, we could read uint64 to Java long, but treat negative
> values in a different way to preserve correct ordering.
>
> These limitations will affect only mixed environments when .Net and Java
> used to access the data.
> Will this solution address your issues?
>
>
> On Tue, Nov 24, 2020 at 5:45 PM Pavel Tupitsyn 
> wrote:
>
> > > That way is impossible.
> >
> > That is how it works in Ignite since the beginning with .NET and C++ :)
> > You can use unsigned primitives as cache keys and values, as fields and
> > properties,
> > and in SQL queries (even in WHERE x=y clauses) - it works transparently
> for
> > the users.
> > Java side knows nothing and treats those values as corresponding signed
> > types.
> >
> > However, this abstraction leaks in some cases only because there are no
> > corresponding type ids.
> > That is why I'm proposing a very simple change to the protocol - add type
> > ids, but handle them the same way as signed counterparts.
> >
> >
> > On Tue, Nov 24, 2020 at 5:00 PM Andrey Mashenkov <
> > andrey.mashen...@gmail.com>
> > wrote:
> >
> > > Pavel,
> > >
> > > - Treat uLong as long in Java (bitwise representation is the same)
> > >
> > > That way is impossible.
> > >
> > > Assume, you have a .NET class with a uByte field and map it to 'uint8'
> > > column.
> > > Then you set the field value to "250" and put the object into a table,
> > > field value perfectly fits to a single byte 'int8' column.
> > > But in Java you can't deserialize it to directly the Java object field
> of
> > > 'byte' type, so we should map uint8 type to Java 'short' type
> > > because the one expected to see "250" as a value which doesn't fit to
> the
> > > signed type.
> > > For uLong the one will need a BigInteger field in Java.
> > >
> > > SQL index either can't treat column value as Java 'byte' as is, because
> > > after reading you will get a negative value, so it should be cast to
> > short
> > > at first. (converted to BigInteger for uint64)
> > > So, index on signed type will require a different comparator.
> > >
> > > That way doesn't look simpler.
> > >
> > > On Tue, Nov 24, 2020 at 4:23 PM Pavel Tupitsyn 
> > > wrote:
> > >
> > > > Andrey,
> > > >
> > > > I don't think range narrowing is a good idea.
> > > > Do you see any problems with the simple approach I described?
> > > >
> > > >
> > > > On Tue, Nov 24, 2020 at 4:01 PM Andrey Mashenkov <
> > > > andrey.mashen...@gmail.com>
> > > > wrote:
> > > >
> > > > > Pavel,
> > > > >
> > > > > If you are ok with narrowing range for unsigned types then we could
> > > add a
> > > > > constraint for unsigned types on schema level (like nullability
> flag)
> > > > > and treat them as signed types in storage.
> > > > >
> > > > > We are going with a separate storage type-system and binary
> protocol
> > > > > type-system, however most of type will match 1 to 1 with storage
> > > (native)
> > > > > type.
> > > > > On .Net side you will either have a separate type id or treat
> > > serialized
> > > > > value regarding a schema (signed or unsigned flag).
> > > > >
> > > > > Igor,
> > > > >
> > > > > I'm not sure users can ever foresee the consequences of using
> > unsigned
> > > > > types.
> > > > >
> > > > > Assume, a user used to unsigned types perfectly works with some
> > > database,
> > > > > then he turns into Ignite successor confession with our "native"
> > > > > unsigned-types support.
> > > > > But 

Re: IEP-54: Schema-first approach for 3.0

2020-11-24 Thread Andrey Mashenkov
Pavel,

I believe uLong values of 2^63 and above can't be treated correctly for now
(WHERE x > y may return wrong results).

I think we could provide "true" support for unsigned types, but they will have
limitations on the Java side.
Thus, one will not be able to map uint64 to the Java long primitive, only to
BigInteger.
As for indices, we could read uint64 into a Java long, but treat negative
values differently to preserve the correct ordering.

These limitations will affect only mixed environments where both .NET and Java
are used to access the data.
Will this solution address your issues?
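
A minimal sketch of both ideas, assuming the uint64 bits arrive in a Java long: converting to BigInteger for user-facing fields, and flipping the sign bit so that plain signed comparison yields unsigned ordering for indices. The class and method names are hypothetical, and the sign-bit flip is one common technique, not necessarily what will be implemented.

```java
import java.math.BigInteger;

final class UInt64 {
    /** Converts raw uint64 bits held in a long to a non-negative BigInteger. */
    static BigInteger toBigInteger(long bits) {
        BigInteger v = BigInteger.valueOf(bits);
        // Negative longs represent values in [2^63, 2^64), so shift them up by 2^64.
        return bits >= 0 ? v : v.add(BigInteger.ONE.shiftLeft(64));
    }

    /** Flips the sign bit so that signed ordering of the keys equals unsigned ordering of the bits. */
    static long orderPreservingKey(long bits) {
        return bits ^ Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(toBigInteger(-1L)); // prints 18446744073709551615
        // The largest uint64 must sort after 1 once the keys are compared as signed longs.
        System.out.println(Long.compare(orderPreservingKey(-1L), orderPreservingKey(1L)) > 0); // prints true
    }
}
```

The index path stays allocation-free, while BigInteger is needed only when a value is handed to user code.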


On Tue, Nov 24, 2020 at 5:45 PM Pavel Tupitsyn  wrote:

> > That way is impossible.
>
> That is how it works in Ignite since the beginning with .NET and C++ :)
> You can use unsigned primitives as cache keys and values, as fields and
> properties,
> and in SQL queries (even in WHERE x=y clauses) - it works transparently for
> the users.
> Java side knows nothing and treats those values as corresponding signed
> types.
>
> However, this abstraction leaks in some cases only because there are no
> corresponding type ids.
> That is why I'm proposing a very simple change to the protocol - add type
> ids, but handle them the same way as signed counterparts.
>
>
> On Tue, Nov 24, 2020 at 5:00 PM Andrey Mashenkov <
> andrey.mashen...@gmail.com>
> wrote:
>
> > Pavel,
> >
> > - Treat uLong as long in Java (bitwise representation is the same)
> >
> > That way is impossible.
> >
> > Assume, you have a .NET class with a uByte field and map it to 'uint8'
> > column.
> > Then you set the field value to "250" and put the object into a table,
> > field value perfectly fits to a single byte 'int8' column.
> > But in Java you can't deserialize it to directly the Java object field of
> > 'byte' type, so we should map uint8 type to Java 'short' type
> > because the one expected to see "250" as a value which doesn't fit to the
> > signed type.
> > For uLong the one will need a BigInteger field in Java.
> >
> > SQL index either can't treat column value as Java 'byte' as is, because
> > after reading you will get a negative value, so it should be cast to
> short
> > at first. (converted to BigInteger for uint64)
> > So, index on signed type will require a different comparator.
> >
> > That way doesn't look simpler.
> >
> > On Tue, Nov 24, 2020 at 4:23 PM Pavel Tupitsyn 
> > wrote:
> >
> > > Andrey,
> > >
> > > I don't think range narrowing is a good idea.
> > > Do you see any problems with the simple approach I described?
> > >
> > >
> > > On Tue, Nov 24, 2020 at 4:01 PM Andrey Mashenkov <
> > > andrey.mashen...@gmail.com>
> > > wrote:
> > >
> > > > Pavel,
> > > >
> > > > If you are ok with narrowing range for unsigned types then we could
> > add a
> > > > constraint for unsigned types on schema level (like nullability flag)
> > > > and treat them as signed types in storage.
> > > >
> > > > We are going with a separate storage type-system and binary protocol
> > > > type-system, however most of type will match 1 to 1 with storage
> > (native)
> > > > type.
> > > > On .Net side you will either have a separate type id or treat
> > serialized
> > > > value regarding a schema (signed or unsigned flag).
> > > >
> > > > Igor,
> > > >
> > > > I'm not sure users can ever foresee the consequences of using
> unsigned
> > > > types.
> > > >
> > > > Assume, a user used to unsigned types perfectly works with some
> > database,
> > > > then he turns into Ignite successor confession with our "native"
> > > > unsigned-types support.
> > > > But later, he finds that he can use the power of Ignite Compute on
> Java
> > > for
> > > > some tasks or a new app.
> > > > Finally, the user will either fail to use his unsigned data on Java
> due
> > > or
> > > > face performance issues due to natural Java type system limitations
> > e.g.
> > > > conversion uLong to BigInteger.
> > > >
> > > > I believe that natively supported types with possible value ranges
> and
> > > > limitations should be known.
> > > > So, the only question is what trade-off we found acceptable:
> narrowing
> > > > unsigned type range or use types of wider range on systems like Java.
> > > >
> > > > On Tue, Nov 24, 2020 at 3:25 PM Igor Sapego 
> > wrote:
> > > >
> > > > > Actually, I think it is not so hard to implement comparison of
> > unsigned
> > > > > numbers in
> > > > > SQL even in Java, so it does not seem to be a big issue from my
> > > > > perspective.
> > > > >
> > > > > Now to the usage of unsigned types from Java - I think, if a user
> > uses
> > > > > unsigned type
> > > > > in a schema and is going to interact with it from Java he knows
> what
> > he
> > > > is
> > > > > doing.
> > > > >
> > > > > Mostly they are for use from platforms where they have native
> support
> > > and
> > > > > widely
> > > > > used, like in C++ or .NET, where users currently have to make a
> > manual
> > > > type
> > > > > casting
> > > > > or even just stop using unsigned types when they use Ignite.
> > > > >
> > > 

Re: IEP-54: Schema-first approach for 3.0

2020-11-24 Thread Pavel Tupitsyn
> That way is impossible.

That is how it has worked in Ignite since the beginning with .NET and C++ :)
You can use unsigned primitives as cache keys and values, as fields and
properties, and in SQL queries (even in WHERE x=y clauses) - it works
transparently for the users.
The Java side knows nothing and treats those values as the corresponding signed
types.

However, this abstraction leaks in some cases, only because there are no
corresponding type ids.
That is why I'm proposing a very simple change to the protocol - add type
ids, but handle them the same way as their signed counterparts.
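
To make the proposal concrete, here is a purely hypothetical sketch of what "new type ids handled like their signed counterparts" might look like. The enum name, the numeric ids and the storageType() helper are all invented for illustration and do not come from the Ignite protocol.

```java
enum NativeType {
    // The numeric ids below are made up for this sketch.
    INT8(1, 1, false),
    UINT8(2, 1, true),
    INT64(3, 8, false),
    UINT64(4, 8, true);

    final int id;
    final int sizeInBytes;
    final boolean unsigned;

    NativeType(int id, int sizeInBytes, boolean unsigned) {
        this.id = id;
        this.sizeInBytes = sizeInBytes;
        this.unsigned = unsigned;
    }

    /** Returns the signed type whose storage and handling the unsigned type reuses. */
    NativeType storageType() {
        switch (this) {
            case UINT8:
                return INT8;
            case UINT64:
                return INT64;
            default:
                return this;
        }
    }

    public static void main(String[] args) {
        System.out.println(UINT8.storageType());           // prints INT8
        System.out.println(UINT64.storageType().sizeInBytes); // prints 8
    }
}
```

The distinct id lets .NET and C++ clients recover the unsigned type, while storage and serialization code paths stay shared.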



Re: IEP-54: Schema-first approach for 3.0

2020-11-24 Thread Andrey Mashenkov
Pavel,

> - Treat uLong as long in Java (bitwise representation is the same)

That way is impossible.

Assume you have a .NET class with a uByte field and map it to a 'uint8'
column.
Then you set the field value to "250" and put the object into a table; the
value fits perfectly into a single-byte 'int8' column.
But in Java you can't deserialize it directly into an object field of
'byte' type, because the user expects to see "250", which doesn't fit into
the signed type. So we would have to map the uint8 type to the Java 'short'
type.
For uLong, the user will need a BigInteger field in Java.

An SQL index can't treat the column value as a Java 'byte' as is either,
because after reading you will get a negative value, so it should be cast
to short first (or converted to BigInteger for uint64).
So, an index on a column stored as a signed type will require a different
comparator.

That way doesn't look simpler.
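
To make the "250" example concrete, here is a minimal Java sketch. The class
and method names are made up for illustration and are not part of Ignite;
only the standard Byte, Long and BigInteger calls are real. It shows what
Java sees when the stored byte is read as a signed 'byte', and the widening
that would be needed to get the original value back.

```java
import java.math.BigInteger;

final class UnsignedWidening {
    /** Widens a stored uint8 byte to a Java short holding the original 0..255 value. */
    static short uint8ToShort(byte stored) {
        return (short) Byte.toUnsignedInt(stored);
    }

    /** Converts a uint64 bit pattern (held in a Java long) to a BigInteger. */
    static BigInteger uint64ToBigInteger(long stored) {
        return new BigInteger(Long.toUnsignedString(stored));
    }

    public static void main(String[] args) {
        byte stored = (byte) 250; // the single byte written for uByte value 250

        System.out.println(stored);                   // -6: the signed view of the same bits
        System.out.println(uint8ToShort(stored));     // 250: the value after widening
        System.out.println(uint64ToBigInteger(-1L));  // 18446744073709551615
    }
}
```

The widening itself is cheap for uint8, uint16 and uint32; only uint64 has
no wider primitive and ends up as an object.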


Re: IEP-54: Schema-first approach for 3.0

2020-11-24 Thread Pavel Tupitsyn
Andrey,

I don't think range narrowing is a good idea.
Do you see any problems with the simple approach I described?



Re: IEP-54: Schema-first approach for 3.0

2020-11-24 Thread Andrey Mashenkov
Pavel,

If you are OK with narrowing the range for unsigned types, then we could add
a constraint for unsigned types at the schema level (like the nullability
flag) and treat them as signed types in storage.

We are going with a separate storage type system and binary protocol type
system; however, most types will match 1 to 1 with a storage (native) type.
On the .NET side you will either have a separate type id or interpret the
serialized value according to the schema (signed or unsigned flag).
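
A rough Java sketch of such a schema-level constraint, under my own
assumptions: ColumnSketch, its fields and validate() are hypothetical names
and not an existing Ignite API. The unsigned flag sits next to the
nullability flag, the value is stored as a plain signed long, and the
constraint rejects anything outside the narrowed range.

```java
final class ColumnSketch {
    final String name;
    final boolean nullable;
    final boolean unsigned; // hypothetical schema-level flag, next to nullability

    ColumnSketch(String name, boolean nullable, boolean unsigned) {
        this.name = name;
        this.nullable = nullable;
        this.unsigned = unsigned;
    }

    /** Checks the narrowed-range constraint before the value goes to signed int64 storage. */
    long validate(long value) {
        if (unsigned && value < 0) {
            throw new IllegalArgumentException(
                "Column '" + name + "' accepts only values in 0.." + Long.MAX_VALUE);
        }
        return value;
    }

    public static void main(String[] args) {
        ColumnSketch id = new ColumnSketch("id", false, true);
        System.out.println(id.validate(42L)); // 42
        try {
            // The uLong maximum has the sign bit set, so Java sees it as -1.
            id.validate(Long.parseUnsignedLong("18446744073709551615"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this approach the upper half of the unsigned range is simply not
available, which is the narrowing being discussed.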

Igor,

I'm not sure users can ever foresee the consequences of using unsigned
types.

Assume a user who is used to unsigned types works perfectly well with some
database, then converts to Ignite with our "native" unsigned-types support.
But later, he finds that he can use the power of Ignite Compute on Java for
some tasks or a new app.
In the end, the user will either fail to use his unsigned data on Java or
face performance issues due to natural Java type system limitations, e.g.
conversion of uLong to BigInteger.

I believe that natively supported types, with their possible value ranges
and limitations, should be known.
So, the only question is which trade-off we find acceptable: narrowing the
unsigned type range or using types of a wider range on systems like Java.


Re: IEP-54: Schema-first approach for 3.0

2020-11-24 Thread Igor Sapego
Actually, I think it is not so hard to implement comparison of unsigned
numbers in SQL, even in Java, so it does not seem to be a big issue from my
perspective.
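
For illustration, a small Java sketch of unsigned ordering over values that
are held as signed longs. The class and method names are invented for this
example; Long.compareUnsigned is a standard JDK method (Java 8+).

```java
import java.util.Arrays;

final class UnsignedOrdering {
    /** Returns a copy of the values sorted as if each long were an unsigned 64-bit number. */
    static Long[] sortUnsigned(Long... values) {
        Long[] copy = values.clone();
        Arrays.sort(copy, Long::compareUnsigned);
        return copy;
    }

    public static void main(String[] args) {
        // -1L carries the bit pattern of the largest uLong value.
        System.out.println(Arrays.toString(sortUnsigned(-1L, 1L, 0L))); // [0, 1, -1]
    }
}
```

The printed -1 is only the signed rendering of the largest unsigned value;
Long.toUnsignedString would print it as 18446744073709551615.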

Now to the usage of unsigned types from Java - I think that if a user uses
an unsigned type in a schema and is going to interact with it from Java, he
knows what he is doing.

Mostly they are for use from platforms where they have native support and
are widely used, like C++ or .NET, where users currently have to cast types
manually or even just stop using unsigned types when they use Ignite.

Best Regards,
Igor



Re: IEP-54: Schema-first approach for 3.0

2020-11-24 Thread Pavel Tupitsyn
Andrey,

I think it is much simpler:
- Add protocol support for those types (basically, just add more type ids)
- Treat uLong as long in Java (bitwise representation is the same)

ANSI SQL does not have unsigned integers, so we can simply say that
unsigned value relative comparison is not supported in SQL (equality will
work).
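
A short Java sketch of what this means in practice; the class and method
names are illustrative only. When the uLong bits are treated as a signed
long, equality still gives the right answer, while a relative comparison
does not.

```java
final class SignedViewOfUnsigned {
    /** Equality on the raw 64-bit pattern gives the same answer for long and uLong. */
    static boolean sameValue(long a, long b) {
        return a == b;
    }

    /** Relative comparison as Java sees it when the bits are treated as a signed long. */
    static boolean signedLess(long a, long b) {
        return a < b;
    }

    public static void main(String[] args) {
        long max = Long.parseUnsignedLong("18446744073709551615"); // largest uLong
        long one = 1L;

        System.out.println(max);                  // -1: same bits, signed view
        System.out.println(sameValue(max, -1L));  // true: equality is preserved
        System.out.println(signedLess(max, one)); // true, although max > 1 as unsigned
    }
}
```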



Re: IEP-54: Schema-first approach for 3.0

2020-11-24 Thread Andrey Mashenkov
Thanks, Pavel and Igor.

I like your ideas to have i8 or int8 instead of Integer.
But the naming doesn't address the issue.

I agree internal types should be portable across different systems with and
without unsigned type support.
The only issue here is that unsigned types cover different ranges.

Let's assume we want to introduce a uLong.
It doesn't look like a big deal to add uLong type support at the storage
level, fit it into 8 bytes, and then use it in e.g. .NET only.
But how could we support it in e.g. Java?

Let's keep in mind that the Long range is about (-2^63 .. 2^63) and the
uLong range is (0 .. 2^64).
1. The first option is to restrict the range to (0 .. 2^63). This allows
using a signed type in e.g. Java with no conversion, but doesn't look like
'real' unsigned uLong support. Things get worse when the user uses uByte, as
the limitation can make uByte totally unusable.

2. The second one is to map unsigned types to a wider type and add a
constraint for negative values, e.g. uLong to BigInteger.
So, we can't use a primitive Java type for uLong here. However, it is still
possible to store uLong in 8 bytes, but have a special comparator for
unsigned types to avoid unwanted deserialization.
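
As a sketch of the comparator from option 2: if the 8 bytes are stored
big-endian (my assumption, the layout is not fixed here), two stored uLong
values can be ordered directly on their bytes, with no BigInteger created.
The class and method names are hypothetical; Arrays.compareUnsigned needs
Java 9 or later.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class StoredULongComparator {
    /** Encodes the 64-bit pattern big-endian, which is ByteBuffer's default byte order. */
    static byte[] encode(long bits) {
        return ByteBuffer.allocate(Long.BYTES).putLong(bits).array();
    }

    /** Compares two 8-byte stored values in unsigned order without decoding them. */
    static int compareStored(byte[] left, byte[] right) {
        return Arrays.compareUnsigned(left, right);
    }

    public static void main(String[] args) {
        byte[] max = encode(Long.parseUnsignedLong("18446744073709551615"));
        byte[] one = encode(1L);

        System.out.println(compareStored(max, one) > 0); // true: unsigned order on raw bytes
        System.out.println(Long.compare(-1L, 1L) > 0);   // false: signed order on the same bits
    }
}
```

With a little-endian layout the bytes would have to be compared from the
last one to the first instead.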

WDYT?






On Tue, Nov 24, 2020 at 2:04 PM Pavel Tupitsyn  wrote:

> Agree, let's get rid of "long, short, byte" in the protocol definition.
>
> We can use Rust style, which is concise and unambiguous:
> i8, u8, i16, u16, etc
>
> On Tue, Nov 24, 2020 at 1:58 PM Igor Sapego  wrote:
>
> > Pavel,
> >
> > I totally support that. Also, if we are aiming for stronger
> > platform independence, in our schemas we may want to support
> > bit-notation (int32, uint64)? For example, "long" can mean a different
> > type on different platforms and it's easy to confuse them (happens
> > often when using ODBC, for example).
> >
> > Best Regards,
> > Igor
> >
> >
> > On Tue, Nov 24, 2020 at 1:34 PM Pavel Tupitsyn 
> > wrote:
> >
> > > Igniters,
> > >
> > > I think we should support unsigned data types:
> > > uByte, uShort, uInt, uLong
> > >
> > > Java does not have them, but many other languages do,
> > > and with the growing number of thin clients this is important.
> > >
> > > For example, in the current Ignite.NET implementation we store unsigned
> > > values as signed internally, but this is a huge pain when it comes to
> > > metadata, binary objects, etc.
> > > (it is easy to deserialize int as uint when you have a class, but not
> > > with BinaryObject.GetField)
> > >
> > > Any objections?
> > >
> > > On Tue, Nov 24, 2020 at 12:28 PM Andrey Mashenkov <
> > > andrey.mashen...@gmail.com> wrote:
> > >
> > > > Denis,
> > > >
> > > > Good point. Both serializers use the reflection API.
> > > > Although we will allow users to configure a static schema along with
> > > > 'strict' schema mode, we still need to validate user classes on client
> > > > nodes against the latest schema in the grid, and the reflection API is
> > > > the only way to do it.
> > > > One can find a few articles on the internet on how to enable
> > > > reflection in GraalVM.
> > > >
> > > > I'll create a task for supporting GraalVM, and maybe someone who is
> > > > familiar with GraalVM will suggest a solution or a proper workaround.
> > > > Or I'll do it a bit later.
> > > > If no workaround is found, we could allow users to write their own
> > > > serializer, but I don't think it is a good idea to expose any internal
> > > > classes to the public.
> > > >
> > > > On Tue, Nov 24, 2020 at 2:55 AM Denis Magda 
> wrote:
> > > >
> > > > > Andrey, thanks for the update,
> > > > >
> > > > > Does any of the serializers take into consideration the
> > > > > native-image-generation feature of GraalVM?
> > > > > https://www.graalvm.org/reference-manual/native-image/
> > > > >
> > > > > With the current binary marshaller, we can't even generate a native
> > > image
> > > > > for the code using our thin client APIs.
> > > > >
> > > > > -
> > > > > Denis
> > > > >
> > > > >
> > > > > On Mon, Nov 23, 2020 at 4:39 AM Andrey Mashenkov <
> > > > > andrey.mashen...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Igniters,
> > > > > >
> > > > > > I'd like to continue discussion of IEP-54 (Schema-first
> approach).
> > > > > >
> > > > > > Hope everyone who is interested had a chance to get familiar with
> > the
> > > > > > proposal [1].
> > > > > > Please, do not hesitate to ask questions and share your ideas.
> > > > > >
> > > > > > I've prepared a prototype of serializer [2] for the data layout
> > > > described
> > > > > > in the proposal.
> > > > > > In prototy, I compared 2 approaches to (de)serialize objects, the
> > > first
> > > > > one
> > > > > > uses java reflection/unsafe API and similar to one we already use
> > in
> > > > > Ignite
> > > > > > and the second one generates serializer for particular user class
> > and
> > > > > uses
> > > > > > Janino library for compilation.
> > > > > > Second one shows better results in benchmarks.
> > > > > > I think 

Re: IEP-54: Schema-first approach for 3.0

2020-11-24 Thread Pavel Tupitsyn
Agree, let's get rid of "long, short, byte" in the protocol definition.

We can use Rust style, which is concise and unambiguous:
i8, u8, i16, u16, etc

On Tue, Nov 24, 2020 at 1:58 PM Igor Sapego  wrote:

> Pavel,
>
> I totally support that. Also, if we are aiming for stronger platform
> independence, we may want to support bit-notation (int32, uint64) in our
> schemas. For example, "long" can mean a different type on different
> platforms and it's easy to confuse them (happens often when using ODBC,
> for example).
>
> Best Regards,
> Igor

Re: IEP-54: Schema-first approach for 3.0

2020-11-24 Thread Igor Sapego
Pavel,

I totally support that. Also, if we are aiming for stronger platform
independence, we may want to support bit-notation (int32, uint64) in our
schemas. For example, "long" can mean a different type on different
platforms and it's easy to confuse them (happens often when using ODBC,
for example).
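
As a rough illustration of bit-notation, a type catalogue could carry the width
and signedness in the name itself. The enum below is a hypothetical sketch (the
name FixedWidthType and its methods are made up for this example and are not
taken from the IEP):

```java
import java.math.BigInteger;
import java.util.Locale;

/** Hypothetical catalogue of fixed-width integer types; illustrative only. */
enum FixedWidthType {
    INT8(8, true), UINT8(8, false),
    INT16(16, true), UINT16(16, false),
    INT32(32, true), UINT32(32, false),
    INT64(64, true), UINT64(64, false);

    private final int bits;
    private final boolean signed;

    FixedWidthType(int bits, boolean signed) {
        this.bits = bits;
        this.signed = signed;
    }

    /** Name as it could appear in a schema, e.g. "int32" or "uint64". */
    String schemaName() {
        return name().toLowerCase(Locale.ROOT);
    }

    int byteSize() {
        return bits / 8;
    }

    boolean isSigned() {
        return signed;
    }

    /** Largest representable value: 2^(bits-1) - 1 if signed, 2^bits - 1 otherwise. */
    BigInteger maxValue() {
        return BigInteger.ONE.shiftLeft(signed ? bits - 1 : bits).subtract(BigInteger.ONE);
    }
}
```

With names like these, the size of a column is the same on every platform,
whatever "long" happens to mean in the client language.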

Best Regards,
Igor


On Tue, Nov 24, 2020 at 1:34 PM Pavel Tupitsyn  wrote:

> Igniters,
>
> I think we should support unsigned data types:
> uByte, uShort, uInt, uLong
>
> Java does not have them, but many other languages do,
> and with the growing number of thin clients this is important.
>
> For example, in the current Ignite.NET implementation we store unsigned
> values as signed internally,
> but this is a huge pain when it comes to metadata, binary objects, etc.
> (it is easy to deserialize int as uint when you have a class, but not with
> BinaryObject.GetField)
>
> Any objections?

Re: IEP-54: Schema-first approach for 3.0

2020-11-24 Thread Pavel Tupitsyn
Igniters,

I think we should support unsigned data types:
uByte, uShort, uInt, uLong

Java does not have them, but many other languages do,
and with the growing number of thin clients this is important.

For example, in the current Ignite.NET implementation we store unsigned
values as signed internally,
but this is a huge pain when it comes to metadata, binary objects, etc.
(it is easy to deserialize int as uint when you have a class, but not with
BinaryObject.GetField)
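
The problem can be shown with a few lines of Java. This is only a sketch: the
Int32Field class and its "unsigned" flag are hypothetical stand-ins for field
metadata and are not the Ignite or Ignite.NET API. The same 32 bits decode to
two different numbers, so a reader without a class needs the signedness
recorded somewhere.

```java
/** Hypothetical illustration; not the Ignite or Ignite.NET API. */
final class SignednessSketch {
    /** A 32-bit field whose metadata records whether the writer meant uint. */
    static final class Int32Field {
        final int rawBits;
        final boolean unsigned;

        Int32Field(int rawBits, boolean unsigned) {
            this.rawBits = rawBits;
            this.unsigned = unsigned;
        }

        /** Decodes to long so that uint values above Integer.MAX_VALUE fit. */
        long value() {
            return unsigned ? Integer.toUnsignedLong(rawBits) : rawBits;
        }
    }

    public static void main(String[] args) {
        // The bits a client would write for the uint value 4000000000.
        int raw = (int) 4_000_000_000L;
        System.out.println(new Int32Field(raw, false).value()); // read as signed
        System.out.println(new Int32Field(raw, true).value());  // read as unsigned
    }
}
```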

Any objections?

On Tue, Nov 24, 2020 at 12:28 PM Andrey Mashenkov <
andrey.mashen...@gmail.com> wrote:

> Denis,
>
> Good point. Both serializers use the reflection API.
> Even if we allow users to configure a static schema along with 'strict'
> schema mode, we still need to validate user classes on client nodes against
> the latest schema in the grid, and the reflection API is the only way to do
> it.
> One can find a few articles on the internet on how to enable reflection in
> GraalVM.
>
> I'll create a task for supporting GraalVM, and maybe someone who is
> familiar with GraalVM will suggest a solution or a proper workaround. Or
> I'll do it a bit later.
> If no workaround is found, we could allow users to write their own
> serializer, but I don't think it is a good idea to expose any internal
> classes to the public.

Re: IEP-54: Schema-first approach for 3.0

2020-11-24 Thread Andrey Mashenkov
Denis,

Good point. Both serializers use the reflection API.
Even if we allow users to configure a static schema along with 'strict'
schema mode, we still need to validate user classes on client nodes against
the latest schema in the grid, and the reflection API is the only way to do
it.
One can find a few articles on the internet on how to enable reflection in
GraalVM.

I'll create a task for supporting GraalVM, and maybe someone who is
familiar with GraalVM will suggest a solution or a proper workaround. Or
I'll do it a bit later.
If no workaround is found, we could allow users to write their own
serializer, but I don't think it is a good idea to expose any internal
classes to the public.

On Tue, Nov 24, 2020 at 2:55 AM Denis Magda  wrote:

> Andrey, thanks for the update,
>
> Does any of the serializers take into consideration the
> native-image-generation feature of GraalVM?
> https://www.graalvm.org/reference-manual/native-image/
>
> With the current binary marshaller, we can't even generate a native image
> for the code using our thin client APIs.
>
> -
> Denis

Re: IEP-54: Schema-first approach for 3.0

2020-11-23 Thread Denis Magda
Andrey, thanks for the update,

Does any of the serializers take into consideration the
native-image-generation feature of GraalVM?
https://www.graalvm.org/reference-manual/native-image/

With the current binary marshaller, we can't even generate a native image
for the code using our thin client APIs.

-
Denis


On Mon, Nov 23, 2020 at 4:39 AM Andrey Mashenkov 
wrote:

> Hi Igniters,
>
> I'd like to continue discussion of IEP-54 (Schema-first approach).
>
> Hope everyone who is interested had a chance to get familiar with the
> proposal [1].
> Please, do not hesitate to ask questions and share your ideas.
>
> I've prepared a prototype of a serializer [2] for the data layout described
> in the proposal.
> In the prototype, I compared 2 approaches to (de)serializing objects: the
> first one uses the Java reflection/Unsafe API and is similar to the one we
> already use in Ignite, and the second one generates a serializer for a
> particular user class and uses the Janino library for compilation.
> The second one shows better results in benchmarks.
> I think we can go with it as the default serializer and have the
> reflection-based implementation as a fallback if someone has issues with
> the generated one.
> WDYT?
>
> There are a number of tasks under the umbrella ticket [3] waiting for an
> assignee.
>
> BTW, I'm going to create more tickets for schema manager modes
> implementation, but would like to clarify some details.
>
> I thought schemaManager on each node should hold:
>   1. A local mapping of "schema version" <--> validated local key/value
> class pair.
>   2. The cluster-wide schema change history.
> On the client side, before any key-value API operation we should validate
> the schema for a given key-value pair.
> If no local mapping exists for a given key-value pair, or if the
> cluster-wide schema has a more recent version, then the key-value pair
> should be validated against the latest version and the local mapping should
> be updated.
> If an object doesn't fit the latest schema, then it depends on the schema
> mode: either fail the operation ('strict' mode), or create a new mapping
> and propagate a new schema version to the cluster.
>
> On the server side we usually have no key-value classes and we operate with
> tuples.
> As the schema change history is available and a tuple has a schema version,
> it is possible to upgrade any received tuple to the latest version without
> deserialization.
> Thus we could allow nodes to send key-value pairs of previous versions (if
> they didn't receive a schema update yet) without reverting schema changes
> made by a node with newer classes.
>
> Alex, Val, Ivan did you mean the same?
>
>
> [1]
>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
> [2] https://github.com/apache/ignite/tree/ignite-13618/modules/commons
> [3] https://issues.apache.org/jira/browse/IGNITE-13616

Re: IEP-54: Schema-first approach for 3.0

2020-11-23 Thread Andrey Mashenkov
Hi Igniters,

I'd like to continue discussion of IEP-54 (Schema-first approach).

Hope everyone who is interested had a chance to get familiar with the
proposal [1].
Please, do not hesitate to ask questions and share your ideas.

I've prepared a prototype of a serializer [2] for the data layout described
in the proposal.
In the prototype, I compared 2 approaches to (de)serializing objects: the
first one uses the Java reflection/Unsafe API and is similar to the one we
already use in Ignite, and the second one generates a serializer for a
particular user class and uses the Janino library for compilation.
The second one shows better results in benchmarks.
I think we can go with it as the default serializer and have the
reflection-based implementation as a fallback if someone has issues with
the generated one.
WDYT?
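
For readers who have not opened the prototype, the reflection-based approach
looks roughly like the sketch below. This is not the prototype's code: the class
name, the field ordering by name, and the length-prefixed string encoding are
assumptions made for the example, and the real row layout from the proposal is
not reproduced. The generated (Janino) variant is not shown here.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical reflection-based serializer; not the IEP-54 prototype. */
final class ReflectionSerializerSketch {
    /** Example user class with int, long and non-null String fields. */
    static final class Person {
        long id;
        int age;
        String name;
    }

    /** Instance fields in a stable (name) order, since reflection guarantees none. */
    static Field[] columns(Class<?> cls) {
        Field[] fields = Arrays.stream(cls.getDeclaredFields())
            .filter(f -> !Modifier.isStatic(f.getModifiers()) && !f.isSynthetic())
            .sorted(Comparator.comparing(Field::getName))
            .toArray(Field[]::new);
        for (Field f : fields)
            f.setAccessible(true);
        return fields;
    }

    static byte[] serialize(Object obj) {
        ByteBuffer buf = ByteBuffer.allocate(4096); // fixed capacity, for brevity only
        try {
            for (Field f : columns(obj.getClass())) {
                if (f.getType() == int.class)
                    buf.putInt(f.getInt(obj));
                else if (f.getType() == long.class)
                    buf.putLong(f.getLong(obj));
                else if (f.getType() == String.class) {
                    byte[] utf8 = ((String) f.get(obj)).getBytes(StandardCharsets.UTF_8);
                    buf.putInt(utf8.length).put(utf8); // length-prefixed varlen column
                } else
                    throw new IllegalArgumentException("Unsupported field: " + f);
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return Arrays.copyOf(buf.array(), buf.position());
    }

    static <T> T deserialize(byte[] bytes, Class<T> cls) {
        try {
            Constructor<T> ctor = cls.getDeclaredConstructor();
            ctor.setAccessible(true);
            T obj = ctor.newInstance();
            ByteBuffer buf = ByteBuffer.wrap(bytes);
            for (Field f : columns(cls)) {
                if (f.getType() == int.class)
                    f.setInt(obj, buf.getInt());
                else if (f.getType() == long.class)
                    f.setLong(obj, buf.getLong());
                else if (f.getType() == String.class) {
                    byte[] utf8 = new byte[buf.getInt()];
                    buf.get(utf8);
                    f.set(obj, new String(utf8, StandardCharsets.UTF_8));
                } else
                    throw new IllegalArgumentException("Unsupported field: " + f);
            }
            return obj;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A generated serializer would replace the per-field reflective calls with plain
field reads and writes compiled for one specific class, which is presumably
where the benchmark difference comes from.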

There are a number of tasks under the umbrella ticket [3] waiting for an
assignee.

BTW, I'm going to create more tickets for schema manager modes
implementation, but would like to clarify some details.

I thought schemaManager on each node should hold:
  1. A local mapping of "schema version" <--> validated local key/value
class pair.
  2. The cluster-wide schema change history.
On the client side, before any key-value API operation we should validate
the schema for a given key-value pair.
If no local mapping exists for a given key-value pair, or if the
cluster-wide schema has a more recent version, then the key-value pair
should be validated against the latest version and the local mapping should
be updated.
If an object doesn't fit the latest schema, then it depends on the schema
mode: either fail the operation ('strict' mode), or create a new mapping
and propagate a new schema version to the cluster.
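
The client-side flow above can be sketched as follows. Everything here is an
assumption made for illustration: the class and method names, the "LIVE" name
for the non-strict mode, reducing a schema to a set of column names, and
treating "fits the schema" as "every field has a column". Propagation to the
cluster is simulated by appending to a local history list.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical client-side schema manager; names and rules are illustrative. */
final class ClientSchemaManagerSketch {
    enum Mode { STRICT, LIVE } // LIVE is a made-up name for the non-strict mode

    static final class Schema {
        final int version;
        final Set<String> columns;

        Schema(int version, Set<String> columns) {
            this.version = version;
            this.columns = columns;
        }
    }

    // Example user classes.
    static final class PersonKey { long id; }
    static final class Person { String name; }
    static final class PersonV2 { String name; int age; }

    private final Mode mode;
    /** Item 2: schema change history, oldest first (local stand-in for the cluster). */
    private final List<Schema> history = new ArrayList<>();
    /** Item 1: validated key/value class pair -> schema version. */
    private final Map<List<Class<?>>, Integer> validated = new HashMap<>();

    ClientSchemaManagerSketch(Mode mode, String... initialColumns) {
        this.mode = mode;
        history.add(new Schema(1, new TreeSet<>(Arrays.asList(initialColumns))));
    }

    /** Called before a key-value operation; returns the schema version to use. */
    int validate(Class<?> keyCls, Class<?> valCls) {
        List<Class<?>> pair = Arrays.asList(keyCls, valCls);
        Schema latest = history.get(history.size() - 1);
        Integer known = validated.get(pair);
        if (known != null && known == latest.version)
            return known; // mapping exists and is up to date

        Set<String> fields = fieldNames(keyCls, valCls);
        if (!latest.columns.containsAll(fields)) {
            if (mode == Mode.STRICT)
                throw new IllegalStateException("Classes do not fit schema v" + latest.version);
            Set<String> merged = new TreeSet<>(latest.columns);
            merged.addAll(fields);
            latest = new Schema(latest.version + 1, merged);
            history.add(latest); // stands in for propagating the new version
        }
        validated.put(pair, latest.version);
        return latest.version;
    }

    private static Set<String> fieldNames(Class<?>... classes) {
        Set<String> names = new TreeSet<>();
        for (Class<?> cls : classes)
            for (Field f : cls.getDeclaredFields())
                if (!Modifier.isStatic(f.getModifiers()) && !f.isSynthetic())
                    names.add(f.getName());
        return names;
    }
}
```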

On the server side we usually have no key-value classes and we operate with
tuples.
As the schema change history is available and a tuple has a schema version,
it is possible to upgrade any received tuple to the latest version without
deserialization.
Thus we could allow nodes to send key-value pairs of previous versions (if
they didn't receive a schema update yet) without reverting schema changes
made by a node with newer classes.
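
A rough sketch of such an upgrade is below. The Tuple and Change classes are
hypothetical, a map of column values stands in for the binary row, and the only
schema changes modelled are "add column with default" and "drop column"; the
point is just that the tuple is rewritten column by column, with no user class
involved.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical server-side tuple upgrade; illustrative only. */
final class TupleUpgradeSketch {
    /** One history entry: what changed when the schema moved to 'version'. */
    static final class Change {
        final int version;
        final Map<String, Object> added; // new column -> default value
        final Set<String> dropped;

        Change(int version, Map<String, Object> added, Set<String> dropped) {
            this.version = version;
            this.added = added;
            this.dropped = dropped;
        }
    }

    /** Stand-in for a binary row: schema version plus column values. */
    static final class Tuple {
        final int schemaVersion;
        final Map<String, Object> columns;

        Tuple(int schemaVersion, Map<String, Object> columns) {
            this.schemaVersion = schemaVersion;
            this.columns = columns;
        }
    }

    /** Applies every change newer than the tuple's version; history is oldest first. */
    static Tuple upgrade(Tuple tuple, List<Change> history) {
        Map<String, Object> cols = new LinkedHashMap<>(tuple.columns);
        int version = tuple.schemaVersion;
        for (Change change : history) {
            if (change.version <= version)
                continue; // already reflected in the tuple
            cols.keySet().removeAll(change.dropped);
            change.added.forEach(cols::putIfAbsent);
            version = change.version;
        }
        return new Tuple(version, cols);
    }
}
```

The input tuple is left untouched, so a node that still runs older classes is
not affected by the upgrade of the rows it sends.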

Alex, Val, Ivan did you mean the same?


[1]
https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
[2] https://github.com/apache/ignite/tree/ignite-13618/modules/commons
[3] https://issues.apache.org/jira/browse/IGNITE-13616

On Thu, Sep 17, 2020 at 9:21 AM Ivan Pavlukhin  wrote:

> Folks,
>
> Please do not ignore history. We had a thread [1] with many bright
> ideas. We can resume it.
>
> [1]
> http://apache-ignite-developers.2346864.n4.nabble.com/Applicability-of-term-cache-to-Apache-Ignite-td36541.html
>
> 2020-09-10 0:08 GMT+03:00, Denis Magda :
> > Val, makes sense, thanks for explaining.
> >
> > Agree that we need to have a separate discussion thread for the "table"
> and
> > "cache" terms substitution. I'll appreciate it if you start the thread
> > sharing pointers to any relevant IEPs and reasoning behind the suggested
> > change.
> >
> > -
> > Denis
> >
> >
> > On Tue, Sep 8, 2020 at 6:01 PM Valentin Kulichenko <
> > valentin.kuliche...@gmail.com> wrote:
> >
> >> Hi Denis,
> >>
> >> I guess the wording in the IEP is a little bit confusing. All it means
> is
> >> that you should not create nested POJOs, but rather inline fields into a
> >> single POJO that is mapped to a particular schema. In other words,
> nested
> >> POJOs are not supported.
> >>
> >> Alex, is this correct? Please let me know if I'm missing something.
> >>
> >> As for the "cache" term, I agree that it is outdated, but I'm not sure
> >> what we can replace it with. "Table" is tightly associated with SQL, but
> >> SQL is optional in our case. Do you want to create a separate discussion
> >> about this?
> >>
> >> -Val
> >>
> >> On Tue, Sep 8, 2020 at 4:37 PM Denis Magda  wrote:
> >>
> >>> Val,
> >>>
> >>> I've checked the IEP again and have a few questions.
> >>>
> >>> Arbitrary nested objects and collections are not allowed as column
> >>> values.
> >>> > Nested POJOs should either be inlined into schema, or stored as BLOBs
> >>>
> >>>
> >>> Could you provide a DDL code snippet showing how the inlining of POJOs
> >>> is
> >>> supposed to work?
> >>>
> >>> Also, we keep using the terms "cache" and "table" throughout the IEP.
> Is
> >>> it
> >>> the right time to discuss an alternate name that would replace those
> >>> too?
> >>> Personally, the "table" should stay and the "cache" should go
> >>> considering
> >>> that SQL is one of the primary APIs in Ignite and that DDL is supported
> >>> out-of-the-box.
> >>>
> >>>
> >>> -
> >>> Denis
> >>>
> >>>
> >>> On Mon, Sep 7, 2020 at 12:26 PM Valentin Kulichenko <
> >>> valentin.kuliche...@gmail.com> wrote:
> >>>
> >>> > Ivan,
> >>> >
> >>> > I see your point. I agree that with the automatic updates we step
> into
> >>> the
> >>> > schema-last territory.
> >>> >
> >>> > Actually, if we support automatic evolution, we can as well support
> 

Re: IEP-54: Schema-first approach for 3.0

2020-09-17 Thread Ivan Pavlukhin
Folks,

Please do not ignore history. We had a thread [1] with many bright
ideas. We can resume it.

[1] 
http://apache-ignite-developers.2346864.n4.nabble.com/Applicability-of-term-cache-to-Apache-Ignite-td36541.html

2020-09-10 0:08 GMT+03:00, Denis Magda:
> Val, makes sense, thanks for explaining.
>
> Agree that we need to have a separate discussion thread for the "table" and
> "cache" terms substitution. I'll appreciate it if you start the thread
> sharing pointers to any relevant IEPs and reasoning behind the suggested
> change.
>
> -
> Denis
>
>
> On Tue, Sep 8, 2020 at 6:01 PM Valentin Kulichenko <
> valentin.kuliche...@gmail.com> wrote:
>
>> Hi Denis,
>>
>> I guess the wording in the IEP is a little bit confusing. All it means is
>> that you should not create nested POJOs, but rather inline fields into a
>> single POJO that is mapped to a particular schema. In other words, nested
>> POJOs are not supported.
>>
>> Alex, is this correct? Please let me know if I'm missing something.
>>
>> As for the "cache" term, I agree that it is outdated, but I'm not sure
>> what we can replace it with. "Table" is tightly associated with SQL, but
>> SQL is optional in our case. Do you want to create a separate discussion
>> about this?
>>
>> -Val
>>
>> On Tue, Sep 8, 2020 at 4:37 PM Denis Magda  wrote:
>>
>>> Val,
>>>
>>> I've checked the IEP again and have a few questions.
>>>
>>> Arbitrary nested objects and collections are not allowed as column
>>> values.
>>> > Nested POJOs should either be inlined into schema, or stored as BLOBs
>>>
>>>
>>> Could you provide a DDL code snippet showing how the inlining of POJOs
>>> is
>>> supposed to work?
>>>
>>> Also, we keep using the terms "cache" and "table" throughout the IEP. Is
>>> it
>>> the right time to discuss an alternate name that would replace those
>>> too?
>>> Personally, the "table" should stay and the "cache" should go
>>> considering
>>> that SQL is one of the primary APIs in Ignite and that DDL is supported
>>> out-of-the-box.
>>>
>>>
>>> -
>>> Denis
>>>
>>>
>>> On Mon, Sep 7, 2020 at 12:26 PM Valentin Kulichenko <
>>> valentin.kuliche...@gmail.com> wrote:
>>>
>>> > Ivan,
>>> >
>>> > I see your point. I agree that with the automatic updates we step into
>>> the
>>> > schema-last territory.
>>> >
>>> > Actually, if we support automatic evolution, we can as well support
>>> > creating a cache without schema and inferring it from the first
>>> > insert.
>>> In
>>> > other words, we can have both "schema-first" and "schema-last" modes.
>>> >
>>> > Alexey, what do you think?
>>> >
>>> > -Val
>>> >
>>> > On Mon, Sep 7, 2020 at 5:59 AM Alexey Goncharuk <
>>> > alexey.goncha...@gmail.com>
>>> > wrote:
>>> >
>>> > > Ivan,
>>> > >
>>> > > Thank you, I got your concern now. As it is mostly regarding the
>>> > > terminology, I am absolutely fine with changing the name to whatever
>>> fits
>>> > > the approach best. Dynamic or evolving schema sounds great. I will
>>> make
>>> > > corresponding changes to the IEP once we settle on the name.
>>> > >
>>> > > On Mon, Sep 7, 2020 at 11:33, Ivan Pavlukhin wrote:
>>> > >
>>> > > > Hi Val,
>>> > > >
>>> > > > Thank you for your answer!
>>> > > >
>>> > > > My understanding is a little bit different. Yes, schema evolution
>>> > > > definitely should be possible. But I see a main difference in "how
>>> > > > schema is updated". I treat a common SQL approach schema-first.
>>> Schema
>>> > > > and data manipulation operations are clearly separated and it
>>> enables
>>> > > > interesting capabilities, e.g. preventing unintended schema changes
>>> > > > by
>>> > > > mistaken data operations, restricting user permissions to change
>>> > > > schema.
>>> > > >
>>> > > > > Schema-first means that schema exists in advance and all the
>>> stored
>>> > > data
>>> > > > is compliant with it - that's exactly what is proposed.
>>> > > >
>>> > > > A schema-last approach mentioned in [1] also assumes that schema
>>> > > > exists, but it is inferred from data. Is not it more similar to
>>> > > > the
>>> > > > proposing approach?
>>> > > >
>>> > > > And I would like to say, that my main concern so far is mostly
>>> > > > about
>>> > > > terminology. And I suppose if it confuses me then others might be
>>> > > > confused as well. My feeling is closer to "dynamic or liquid or
>>> > > > may
>>> be
>>> > > > evolving schema".
>>> > > >
>>> > > > [1]
>>> > > >
>>> > https://people.cs.umass.edu/~yanlei/courses/CS691LL-f06/papers/SH05.pdf
>>> > > >
>>> > > > 2020-09-07 0:47 GMT+03:00, Valentin Kulichenko <
>>> > > > valentin.kuliche...@gmail.com>:
>>> > > > > Hi Ivan,
>>> > > > >
>>> > > > > I don't see an issue with that. Schema-first means that schema
>>> exists
>>> > > in
>>> > > > > advance and all the stored data is compliant with it - that's
>>> exactly
>>> > > > what
>>> > > > > is proposed. There are no restrictions prohibiting changes to
>>> > > > > the
>>> > > schema.
>>> > > > >
>>> > > > > -Val
>>> > > > >
>>> > > > > On Sat, Sep 5, 2020 at 9:52 PM 

Re: IEP-54: Schema-first approach for 3.0

2020-09-09 Thread Denis Magda
Val, makes sense, thanks for explaining.

Agree that we need to have a separate discussion thread for the "table" and
"cache" terms substitution. I'll appreciate it if you start the thread
sharing pointers to any relevant IEPs and reasoning behind the suggested
change.

-
Denis


On Tue, Sep 8, 2020 at 6:01 PM Valentin Kulichenko <
valentin.kuliche...@gmail.com> wrote:

> Hi Denis,
>
> I guess the wording in the IEP is a little bit confusing. All it means is
> that you should not create nested POJOs, but rather inline fields into a
> single POJO that is mapped to a particular schema. In other words, nested
> POJOs are not supported.
>
> Alex, is this correct? Please let me know if I'm missing something.
>
> As for the "cache" term, I agree that it is outdated, but I'm not sure
> what we can replace it with. "Table" is tightly associated with SQL, but
> SQL is optional in our case. Do you want to create a separate discussion
> about this?
>
> -Val
>
> On Tue, Sep 8, 2020 at 4:37 PM Denis Magda  wrote:
>
>> Val,
>>
>> I've checked the IEP again and have a few questions.
>>
>> Arbitrary nested objects and collections are not allowed as column values.
>> > Nested POJOs should either be inlined into schema, or stored as BLOBs
>>
>>
>> Could you provide a DDL code snippet showing how the inlining of POJOs is
>> supposed to work?
>>
>> Also, we keep using the terms "cache" and "table" throughout the IEP. Is
>> it
>> the right time to discuss an alternate name that would replace those too?
>> Personally, the "table" should stay and the "cache" should go considering
>> that SQL is one of the primary APIs in Ignite and that DDL is supported
>> out-of-the-box.
>>
>>
>> -
>> Denis
>>
>>
>> On Mon, Sep 7, 2020 at 12:26 PM Valentin Kulichenko <
>> valentin.kuliche...@gmail.com> wrote:
>>
>> > Ivan,
>> >
>> > I see your point. I agree that with the automatic updates we step into
>> the
>> > schema-last territory.
>> >
>> > Actually, if we support automatic evolution, we can as well support
>> > creating a cache without schema and inferring it from the first insert.
>> In
>> > other words, we can have both "schema-first" and "schema-last" modes.
>> >
>> > Alexey, what do you think?
>> >
>> > -Val
>> >
>> > On Mon, Sep 7, 2020 at 5:59 AM Alexey Goncharuk <
>> > alexey.goncha...@gmail.com>
>> > wrote:
>> >
>> > > Ivan,
>> > >
>> > > Thank you, I got your concern now. As it is mostly regarding the
>> > > terminology, I am absolutely fine with changing the name to whatever
>> fits
>> > > the approach best. Dynamic or evolving schema sounds great. I will
>> make
>> > > corresponding changes to the IEP once we settle on the name.
>> > >
>> > > On Mon, Sep 7, 2020 at 11:33, Ivan Pavlukhin wrote:
>> > >
>> > > > Hi Val,
>> > > >
>> > > > Thank you for your answer!
>> > > >
>> > > > My understanding is a little bit different. Yes, schema evolution
>> > > > definitely should be possible. But I see a main difference in "how
>> > > > schema is updated". I treat a common SQL approach schema-first.
>> Schema
>> > > > and data manipulation operations are clearly separated and it
>> enables
>> > > > interesting capabilities, e.g. preventing unintended schema changes by
>> > > > mistaken data operations, restricting user permissions to change
>> > > > schema.
>> > > >
>> > > > > Schema-first means that schema exists in advance and all the
>> stored
>> > > data
>> > > > is compliant with it - that's exactly what is proposed.
>> > > >
>> > > > A schema-last approach mentioned in [1] also assumes that schema
>> > > > exists, but it is inferred from data. Is not it more similar to the
>> > > > proposing approach?
>> > > >
>> > > > And I would like to say, that my main concern so far is mostly about
>> > > > terminology. And I suppose if it confuses me then others might be
>> > > > confused as well. My feeling is closer to "dynamic or liquid or may
>> be
>> > > > evolving schema".
>> > > >
>> > > > [1]
>> > > >
>> > https://people.cs.umass.edu/~yanlei/courses/CS691LL-f06/papers/SH05.pdf
>> > > >
>> > > > 2020-09-07 0:47 GMT+03:00, Valentin Kulichenko <
>> > > > valentin.kuliche...@gmail.com>:
>> > > > > Hi Ivan,
>> > > > >
>> > > > > I don't see an issue with that. Schema-first means that schema
>> exists
>> > > in
>> > > > > advance and all the stored data is compliant with it - that's
>> exactly
>> > > > what
>> > > > > is proposed. There are no restrictions prohibiting changes to the
>> > > schema.
>> > > > >
>> > > > > -Val
>> > > > >
>> > > > > On Sat, Sep 5, 2020 at 9:52 PM Ivan Pavlukhin <
>> vololo...@gmail.com>
>> > > > wrote:
>> > > > >
>> > > > >> Alexey,
>> > > > >>
>> > > > >> I am a little bit confused with terminology. My understanding
>> > conforms
>> > > > >> to a survey [1] (see part X Semi Structured Data). Can we really
>> > treat
>> > > > >> a "dynamic schema" approach as a kind of "schema-first"?
>> > > > >>
>> > > > >> [1]
>> > > > >>
>> > >
>> https://people.cs.umass.edu/~yanlei/courses/CS691LL-f06/papers/SH05.pdf
>> > > > >>

Re: IEP-54: Schema-first approach for 3.0

2020-09-08 Thread Valentin Kulichenko
Hi Denis,

I guess the wording in the IEP is a little bit confusing. All it means is
that you should not create nested POJOs, but rather inline fields into a
single POJO that is mapped to a particular schema. In other words, nested
POJOs are not supported.
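
To make the inlining idea concrete, here is a minimal Java sketch. It is an illustration only, not the IEP's mapping API: the Address, NestedPerson, and PersonRow classes and the column names are hypothetical. It shows a nested POJO next to the flattened POJO whose fields each correspond to one column.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical nested model: a person that holds an Address object.
// This is the shape that would not map onto a schema directly.
class Address {
    final String city;
    final String street;

    Address(String city, String street) {
        this.city = city;
        this.street = street;
    }
}

class NestedPerson {
    final long id;
    final String name;
    final Address address;

    NestedPerson(long id, String name, Address address) {
        this.id = id;
        this.name = name;
        this.address = address;
    }
}

// Flattened POJO: every field corresponds to exactly one column.
class PersonRow {
    final long id;
    final String name;
    final String addressCity;
    final String addressStreet;

    PersonRow(long id, String name, String addressCity, String addressStreet) {
        this.id = id;
        this.name = name;
        this.addressCity = addressCity;
        this.addressStreet = addressStreet;
    }

    // Inline the nested Address fields into the top-level row.
    static PersonRow from(NestedPerson p) {
        return new PersonRow(p.id, p.name, p.address.city, p.address.street);
    }

    // Column name -> value view; the column names are illustrative only.
    Map<String, Object> toColumns() {
        Map<String, Object> cols = new LinkedHashMap<>();
        cols.put("id", id);
        cols.put("name", name);
        cols.put("address_city", addressCity);
        cols.put("address_street", addressStreet);
        return cols;
    }
}
```

The alternative mentioned in the IEP, storing the nested object as a BLOB, would keep NestedPerson as is and serialize Address into a single binary column.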

Alex, is this correct? Please let me know if I'm missing something.

As for the "cache" term, I agree that it is outdated, but I'm not sure what
we can replace it with. "Table" is tightly associated with SQL, but SQL is
optional in our case. Do you want to create a separate discussion about
this?

-Val

On Tue, Sep 8, 2020 at 4:37 PM Denis Magda wrote:

> Val,
>
> I've checked the IEP again and have a few questions.
>
> Arbitrary nested objects and collections are not allowed as column values.
> > Nested POJOs should either be inlined into schema, or stored as BLOBs
>
>
> Could you provide a DDL code snippet showing how the inlining of POJOs is
> supposed to work?
>
> Also, we keep using the terms "cache" and "table" throughout the IEP. Is it
> the right time to discuss an alternate name that would replace those too?
> Personally, the "table" should stay and the "cache" should go considering
> that SQL is one of the primary APIs in Ignite and that DDL is supported
> out-of-the-box.
>
>
> -
> Denis
>
>
> On Mon, Sep 7, 2020 at 12:26 PM Valentin Kulichenko <
> valentin.kuliche...@gmail.com> wrote:
>
> > Ivan,
> >
> > I see your point. I agree that with the automatic updates we step into
> the
> > schema-last territory.
> >
> > Actually, if we support automatic evolution, we can as well support
> > creating a cache without schema and inferring it from the first insert.
> In
> > other words, we can have both "schema-first" and "schema-last" modes.
> >
> > Alexey, what do you think?
> >
> > -Val
> >
> > On Mon, Sep 7, 2020 at 5:59 AM Alexey Goncharuk <
> > alexey.goncha...@gmail.com>
> > wrote:
> >
> > > Ivan,
> > >
> > > Thank you, I got your concern now. As it is mostly regarding the
> > > terminology, I am absolutely fine with changing the name to whatever
> fits
> > > the approach best. Dynamic or evolving schema sounds great. I will make
> > > corresponding changes to the IEP once we settle on the name.
> > >
> > > On Mon, Sep 7, 2020 at 11:33, Ivan Pavlukhin wrote:
> > >
> > > > Hi Val,
> > > >
> > > > Thank you for your answer!
> > > >
> > > > My understanding is a little bit different. Yes, schema evolution
> > > > definitely should be possible. But I see a main difference in "how
> > > > schema is updated". I treat a common SQL approach schema-first.
> Schema
> > > > and data manipulation operations are clearly separated and it enables
> > > > > interesting capabilities, e.g. preventing unintended schema changes by
> > > > mistaken data operations, restricting user permissions to change
> > > > schema.
> > > >
> > > > > Schema-first means that schema exists in advance and all the stored
> > > data
> > > > is compliant with it - that's exactly what is proposed.
> > > >
> > > > A schema-last approach mentioned in [1] also assumes that schema
> > > > exists, but it is inferred from data. Is not it more similar to the
> > > > proposing approach?
> > > >
> > > > And I would like to say, that my main concern so far is mostly about
> > > > terminology. And I suppose if it confuses me then others might be
> > > > confused as well. My feeling is closer to "dynamic or liquid or may
> be
> > > > evolving schema".
> > > >
> > > > [1]
> > > >
> > https://people.cs.umass.edu/~yanlei/courses/CS691LL-f06/papers/SH05.pdf
> > > >
> > > > 2020-09-07 0:47 GMT+03:00, Valentin Kulichenko <
> > > > valentin.kuliche...@gmail.com>:
> > > > > Hi Ivan,
> > > > >
> > > > > I don't see an issue with that. Schema-first means that schema
> exists
> > > in
> > > > > advance and all the stored data is compliant with it - that's
> exactly
> > > > what
> > > > > is proposed. There are no restrictions prohibiting changes to the
> > > schema.
> > > > >
> > > > > -Val
> > > > >
> > > > > > On Sat, Sep 5, 2020 at 9:52 PM Ivan Pavlukhin wrote:
> > > > >
> > > > >> Alexey,
> > > > >>
> > > > >> I am a little bit confused with terminology. My understanding
> > conforms
> > > > >> to a survey [1] (see part X Semi Structured Data). Can we really
> > treat
> > > > >> a "dynamic schema" approach as a kind of "schema-first"?
> > > > >>
> > > > >> [1]
> > > > >>
> > >
> https://people.cs.umass.edu/~yanlei/courses/CS691LL-f06/papers/SH05.pdf
> > > > >>
> > > > >> 2020-09-02 1:53 GMT+03:00, Denis Magda :
> > > > >> >>
> > > > >> >> However, could you please elaborate on the relation between
> > Ignite
> > > > and
> > > > >> >> ORM?
> > > > >> >> Is there a use case for Hibernate running on top of Ignite (I
> > > haven't
> > > > >> >> seen
> > > > >> >> one so far)? If so, what is missing exactly on the Ignite side
> to
> > > > >> support
> > > > >> >> this? In my understanding, all you need is SQL API which we
> > already
> > > > >> have.
> > > > >> >> Am I missing something?
> > > > >> >
> > > > 

Re: IEP-54: Schema-first approach for 3.0

2020-09-07 Thread Valentin Kulichenko
Ivan,

I see your point. I agree that with automatic updates we step into
schema-last territory.

Actually, if we support automatic evolution, we might as well support
creating a cache without a schema and inferring it from the first insert. In
other words, we can have both "schema-first" and "schema-last" modes.
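
A rough Java sketch of the two modes, purely as an assumption-laden illustration: InferredSchemaTable and all of its methods are hypothetical and are not part of any Ignite API. Created with a schema, the table validates every row against it (schema-first). Created without one, it infers the schema from the first inserted row and validates later rows against that (schema-last).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical table that supports both modes described above.
class InferredSchemaTable {
    // Column name -> Java type; null until a schema is declared or inferred.
    private Map<String, Class<?>> schema;
    private final List<Map<String, Object>> rows = new ArrayList<>();

    // Schema-last mode: no schema until the first insert.
    InferredSchemaTable() {
    }

    // Schema-first mode: the schema is declared in advance.
    InferredSchemaTable(Map<String, Class<?>> declared) {
        this.schema = new LinkedHashMap<>(declared);
    }

    void insert(Map<String, Object> row) {
        if (schema == null) {
            schema = infer(row);
        } else {
            validate(row);
        }
        rows.add(new LinkedHashMap<>(row));
    }

    // Derive column types from the values of the first row.
    private static Map<String, Class<?>> infer(Map<String, Object> row) {
        Map<String, Class<?>> inferred = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            if (e.getValue() == null) {
                throw new IllegalArgumentException("Cannot infer type of null column: " + e.getKey());
            }
            inferred.put(e.getKey(), e.getValue().getClass());
        }
        return inferred;
    }

    // Every row must carry exactly the schema's columns with matching types.
    private void validate(Map<String, Object> row) {
        if (!schema.keySet().equals(row.keySet())) {
            throw new IllegalArgumentException("Columns do not match schema: " + row.keySet());
        }
        for (Map.Entry<String, Object> e : row.entrySet()) {
            if (!schema.get(e.getKey()).isInstance(e.getValue())) {
                throw new IllegalArgumentException("Wrong type for column: " + e.getKey());
            }
        }
    }

    Map<String, Class<?>> schema() {
        return schema == null ? null : Collections.unmodifiableMap(schema);
    }

    int rowCount() {
        return rows.size();
    }
}
```

In both modes the stored data always complies with some schema; the only difference is where that schema comes from.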

Alexey, what do you think?

-Val

On Mon, Sep 7, 2020 at 5:59 AM Alexey Goncharuk wrote:

> Ivan,
>
> Thank you, I got your concern now. As it is mostly regarding the
> terminology, I am absolutely fine with changing the name to whatever fits
> the approach best. Dynamic or evolving schema sounds great. I will make
> corresponding changes to the IEP once we settle on the name.
>
> On Mon, Sep 7, 2020 at 11:33, Ivan Pavlukhin wrote:
>
> > Hi Val,
> >
> > Thank you for your answer!
> >
> > My understanding is a little bit different. Yes, schema evolution
> > definitely should be possible. But I see a main difference in "how
> > schema is updated". I treat a common SQL approach schema-first. Schema
> > and data manipulation operations are clearly separated and it enables
> > > interesting capabilities, e.g. preventing unintended schema changes by
> > mistaken data operations, restricting user permissions to change
> > schema.
> >
> > > Schema-first means that schema exists in advance and all the stored
> data
> > is compliant with it - that's exactly what is proposed.
> >
> > A schema-last approach mentioned in [1] also assumes that schema
> > exists, but it is inferred from data. Is not it more similar to the
> > proposing approach?
> >
> > And I would like to say, that my main concern so far is mostly about
> > terminology. And I suppose if it confuses me then others might be
> > confused as well. My feeling is closer to "dynamic or liquid or may be
> > evolving schema".
> >
> > [1]
> > https://people.cs.umass.edu/~yanlei/courses/CS691LL-f06/papers/SH05.pdf
> >
> > 2020-09-07 0:47 GMT+03:00, Valentin Kulichenko <
> > valentin.kuliche...@gmail.com>:
> > > Hi Ivan,
> > >
> > > I don't see an issue with that. Schema-first means that schema exists
> in
> > > advance and all the stored data is compliant with it - that's exactly
> > what
> > > is proposed. There are no restrictions prohibiting changes to the
> schema.
> > >
> > > -Val
> > >
> > > On Sat, Sep 5, 2020 at 9:52 PM Ivan Pavlukhin 
> > wrote:
> > >
> > >> Alexey,
> > >>
> > >> I am a little bit confused with terminology. My understanding conforms
> > >> to a survey [1] (see part X Semi Structured Data). Can we really treat
> > >> a "dynamic schema" approach as a kind of "schema-first"?
> > >>
> > >> [1]
> > >>
> https://people.cs.umass.edu/~yanlei/courses/CS691LL-f06/papers/SH05.pdf
> > >>
> > >> 2020-09-02 1:53 GMT+03:00, Denis Magda :
> > >> >>
> > >> >> However, could you please elaborate on the relation between Ignite
> > and
> > >> >> ORM?
> > >> >> Is there a use case for Hibernate running on top of Ignite (I
> haven't
> > >> >> seen
> > >> >> one so far)? If so, what is missing exactly on the Ignite side to
> > >> support
> > >> >> this? In my understanding, all you need is SQL API which we already
> > >> have.
> > >> >> Am I missing something?
> > >> >
> > >> >
> > >> > Good point, yes, if all the ORM integrations use Ignite SQL APIs
> > >> > internally, then they can easily translate an Entity object into an
> > >> > INSERT/UPDATE statement that lists all the object's fields. Luckily,
> > >> > our
> > >> > Spring Data integration is already based on the Ignite SQL APIs and
> > >> > needs
> > >> > to be improved once the schema-first approach is supported. That
> would
> > >> > solve a ton of usability issues.
> > >> >
> > >> > I would revise the Hibernate integration as well during the Ignite
> 3.0
> > >> dev
> > >> > phase. Can't say if it's used a lot but Spring Data is getting
> > traction
> > >> for
> > >> > sure.
> > >> >
> > >> > @Michael Pollind, I'll loop you in as long as you've started working
> > on
> > >> the
> > >> > Ignite support for Micronaut Data
> > >> > 
> > and
> > >> > came across some challenges. Just watch this discussion. That's what
> > is
> > >> > coming in Ignite 3.0.
> > >> >
> > >> >
> > >> > -
> > >> > Denis
> > >> >
> > >> >
> > >> > On Mon, Aug 31, 2020 at 5:11 PM Valentin Kulichenko <
> > >> > valentin.kuliche...@gmail.com> wrote:
> > >> >
> > >> >> Hi Denis,
> > >> >>
> > >> >> Generally speaking, I believe that the schema-first approach
> natively
> > >> >> addresses the issue of duplicate fields in key and value objects,
> > >> because
> > >> >> schema will be created for a cache, not for an object, as it
> happens
> > >> now.
> > >> >> Basically, the schema will define whether there is a primary key or
> > >> >> not,
> > >> >> and which fields are included in case there is one. Any API that we
> > >> would
> > >> >> have must be compliant with this, so it becomes fairly easy to work
> > >> >> with
> > >> >> data as with a set of 

Re: IEP-54: Schema-first approach for 3.0

2020-09-07 Thread Alexey Goncharuk
Ivan,

Thank you, I understand your concern now. Since it is mostly about
terminology, I am absolutely fine with changing the name to whatever fits
the approach best. "Dynamic" or "evolving" schema sounds great. I will make
the corresponding changes to the IEP once we settle on the name.

On Mon, Sep 7, 2020 at 11:33, Ivan Pavlukhin wrote:

> Hi Val,
>
> Thank you for your answer!
>
> My understanding is a little bit different. Yes, schema evolution
> definitely should be possible. But I see a main difference in "how
> schema is updated". I treat a common SQL approach schema-first. Schema
> and data manipulation operations are clearly separated and it enables
> interesting capabilities, e.g. preventing unintended schema changes by
> mistaken data operations, restricting user permissions to change
> schema.
>
> > Schema-first means that schema exists in advance and all the stored data
> is compliant with it - that's exactly what is proposed.
>
> A schema-last approach mentioned in [1] also assumes that schema
> exists, but it is inferred from data. Is not it more similar to the
> proposing approach?
>
> And I would like to say, that my main concern so far is mostly about
> terminology. And I suppose if it confuses me then others might be
> confused as well. My feeling is closer to "dynamic or liquid or may be
> evolving schema".
>
> [1]
> https://people.cs.umass.edu/~yanlei/courses/CS691LL-f06/papers/SH05.pdf
>
> 2020-09-07 0:47 GMT+03:00, Valentin Kulichenko <
> valentin.kuliche...@gmail.com>:
> > Hi Ivan,
> >
> > I don't see an issue with that. Schema-first means that schema exists in
> > advance and all the stored data is compliant with it - that's exactly
> what
> > is proposed. There are no restrictions prohibiting changes to the schema.
> >
> > -Val
> >
> > On Sat, Sep 5, 2020 at 9:52 PM Ivan Pavlukhin 
> wrote:
> >
> >> Alexey,
> >>
> >> I am a little bit confused with terminology. My understanding conforms
> >> to a survey [1] (see part X Semi Structured Data). Can we really treat
> >> a "dynamic schema" approach as a kind of "schema-first"?
> >>
> >> [1]
> >> https://people.cs.umass.edu/~yanlei/courses/CS691LL-f06/papers/SH05.pdf
> >>
> >> 2020-09-02 1:53 GMT+03:00, Denis Magda :
> >> >>
> >> >> However, could you please elaborate on the relation between Ignite
> and
> >> >> ORM?
> >> >> Is there a use case for Hibernate running on top of Ignite (I haven't
> >> >> seen
> >> >> one so far)? If so, what is missing exactly on the Ignite side to
> >> support
> >> >> this? In my understanding, all you need is SQL API which we already
> >> have.
> >> >> Am I missing something?
> >> >
> >> >
> >> > Good point, yes, if all the ORM integrations use Ignite SQL APIs
> >> > internally, then they can easily translate an Entity object into an
> >> > INSERT/UPDATE statement that lists all the object's fields. Luckily,
> >> > our
> >> > Spring Data integration is already based on the Ignite SQL APIs and
> >> > needs
> >> > to be improved once the schema-first approach is supported. That would
> >> > solve a ton of usability issues.
> >> >
> >> > I would revise the Hibernate integration as well during the Ignite 3.0
> >> dev
> >> > phase. Can't say if it's used a lot but Spring Data is getting
> traction
> >> for
> >> > sure.
> >> >
> >> > @Michael Pollind, I'll loop you in as long as you've started working
> on
> >> the
> >> > Ignite support for Micronaut Data
> >> > 
> and
> >> > came across some challenges. Just watch this discussion. That's what
> is
> >> > coming in Ignite 3.0.
> >> >
> >> >
> >> > -
> >> > Denis
> >> >
> >> >
> >> > On Mon, Aug 31, 2020 at 5:11 PM Valentin Kulichenko <
> >> > valentin.kuliche...@gmail.com> wrote:
> >> >
> >> >> Hi Denis,
> >> >>
> >> >> Generally speaking, I believe that the schema-first approach natively
> >> >> addresses the issue of duplicate fields in key and value objects,
> >> because
> >> >> schema will be created for a cache, not for an object, as it happens
> >> now.
> >> >> Basically, the schema will define whether there is a primary key or
> >> >> not,
> >> >> and which fields are included in case there is one. Any API that we
> >> would
> >> >> have must be compliant with this, so it becomes fairly easy to work
> >> >> with
> >> >> data as with a set of records, rather than key-value pairs.
> >> >>
> >> >> However, could you please elaborate on the relation between Ignite
> and
> >> >> ORM?
> >> >> Is there a use case for Hibernate running on top of Ignite (I haven't
> >> >> seen
> >> >> one so far)? If so, what is missing exactly on the Ignite side to
> >> support
> >> >> this? In my understanding, all you need is SQL API which we already
> >> have.
> >> >> Am I missing something?
> >> >>
> >> >> -Val
> >> >>
> >> >> On Mon, Aug 31, 2020 at 2:08 PM Denis Magda 
> wrote:
> >> >>
> >> >> > Val,
> >> >> >
> >> >> > I would propose adding another point to the motivations list which
> >> >> > is
> >> >> > related to 

Re: IEP-54: Schema-first approach for 3.0

2020-09-07 Thread Ivan Pavlukhin
Hi Val,

Thank you for your answer!

My understanding is a little bit different. Yes, schema evolution
should definitely be possible. But I see the main difference in "how
the schema is updated". I treat the common SQL approach as schema-first.
Schema and data manipulation operations are clearly separated, which
enables interesting capabilities, e.g. preventing unintended schema
changes caused by mistaken data operations, and restricting user
permissions to change the schema.
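
As a small illustration of that separation, here is a hedged Java sketch. StrictTable and the boolean permission flag are hypothetical and stand in for whatever DDL and permission model is eventually chosen. Only the DDL-style call can change the column set, so a data operation with a mistyped column name fails instead of silently adding a column.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical table where schema changes and data changes use separate paths.
class StrictTable {
    private final Map<String, Class<?>> columns = new LinkedHashMap<>();
    private final List<Map<String, Object>> rows = new ArrayList<>();

    // DDL path: the only way to change the schema, guarded by a permission flag.
    void addColumn(String name, Class<?> type, boolean canAlterSchema) {
        if (!canAlterSchema) {
            throw new SecurityException("Caller is not allowed to change the schema");
        }
        if (columns.putIfAbsent(name, type) != null) {
            throw new IllegalArgumentException("Column already exists: " + name);
        }
    }

    // DML path: checks each supplied column against the schema, never alters it.
    void insert(Map<String, Object> row) {
        for (Map.Entry<String, Object> e : row.entrySet()) {
            Class<?> type = columns.get(e.getKey());
            if (type == null) {
                throw new IllegalArgumentException("Unknown column: " + e.getKey());
            }
            if (!type.isInstance(e.getValue())) {
                throw new IllegalArgumentException("Wrong type for column: " + e.getKey());
            }
        }
        rows.add(new LinkedHashMap<>(row));
    }

    int columnCount() {
        return columns.size();
    }

    int rowCount() {
        return rows.size();
    }
}
```

With automatic schema updates on insert, the mistyped column would instead become a new column, which is the behaviour I would call closer to schema-last.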

> Schema-first means that schema exists in advance and all the stored data is 
> compliant with it - that's exactly what is proposed.

A schema-last approach mentioned in [1] also assumes that a schema
exists, but it is inferred from data. Isn't that more similar to the
proposed approach?

I would also like to say that my main concern so far is mostly about
terminology. I suppose that if it confuses me, then others might be
confused as well. To me it feels closer to a "dynamic", "liquid", or
maybe "evolving" schema.

[1] https://people.cs.umass.edu/~yanlei/courses/CS691LL-f06/papers/SH05.pdf

2020-09-07 0:47 GMT+03:00, Valentin Kulichenko:
> Hi Ivan,
>
> I don't see an issue with that. Schema-first means that schema exists in
> advance and all the stored data is compliant with it - that's exactly what
> is proposed. There are no restrictions prohibiting changes to the schema.
>
> -Val
>
> On Sat, Sep 5, 2020 at 9:52 PM Ivan Pavlukhin  wrote:
>
>> Alexey,
>>
>> I am a little bit confused with terminology. My understanding conforms
>> to a survey [1] (see part X Semi Structured Data). Can we really treat
>> a "dynamic schema" approach as a kind of "schema-first"?
>>
>> [1]
>> https://people.cs.umass.edu/~yanlei/courses/CS691LL-f06/papers/SH05.pdf
>>
>> 2020-09-02 1:53 GMT+03:00, Denis Magda :
>> >>
>> >> However, could you please elaborate on the relation between Ignite and
>> >> ORM?
>> >> Is there a use case for Hibernate running on top of Ignite (I haven't
>> >> seen
>> >> one so far)? If so, what is missing exactly on the Ignite side to
>> support
>> >> this? In my understanding, all you need is SQL API which we already
>> have.
>> >> Am I missing something?
>> >
>> >
>> > Good point, yes, if all the ORM integrations use Ignite SQL APIs
>> > internally, then they can easily translate an Entity object into an
>> > INSERT/UPDATE statement that lists all the object's fields. Luckily,
>> > our
>> > Spring Data integration is already based on the Ignite SQL APIs and
>> > needs
>> > to be improved once the schema-first approach is supported. That would
>> > solve a ton of usability issues.
>> >
>> > I would revise the Hibernate integration as well during the Ignite 3.0
>> dev
>> > phase. Can't say if it's used a lot but Spring Data is getting traction
>> for
>> > sure.
>> >
>> > @Michael Pollind, I'll loop you in as long as you've started working on
>> the
>> > Ignite support for Micronaut Data
>> >  and
>> > came across some challenges. Just watch this discussion. That's what is
>> > coming in Ignite 3.0.
>> >
>> >
>> > -
>> > Denis
>> >
>> >
>> > On Mon, Aug 31, 2020 at 5:11 PM Valentin Kulichenko <
>> > valentin.kuliche...@gmail.com> wrote:
>> >
>> >> Hi Denis,
>> >>
>> >> Generally speaking, I believe that the schema-first approach natively
>> >> addresses the issue of duplicate fields in key and value objects,
>> because
>> >> schema will be created for a cache, not for an object, as it happens
>> now.
>> >> Basically, the schema will define whether there is a primary key or
>> >> not,
>> >> and which fields are included in case there is one. Any API that we
>> would
>> >> have must be compliant with this, so it becomes fairly easy to work
>> >> with
>> >> data as with a set of records, rather than key-value pairs.
>> >>
>> >> However, could you please elaborate on the relation between Ignite and
>> >> ORM?
>> >> Is there a use case for Hibernate running on top of Ignite (I haven't
>> >> seen
>> >> one so far)? If so, what is missing exactly on the Ignite side to
>> support
>> >> this? In my understanding, all you need is SQL API which we already
>> have.
>> >> Am I missing something?
>> >>
>> >> -Val
>> >>
>> >> On Mon, Aug 31, 2020 at 2:08 PM Denis Magda  wrote:
>> >>
>> >> > Val,
>> >> >
>> >> > I would propose adding another point to the motivations list which
>> >> > is
>> >> > related to the ORM frameworks such as Spring Data, Hibernate,
>> Micronaut
>> >> and
>> >> > many others.
>> >> >
>> >> > Presently, the storage engine requires to distinguish key objects
>> >> > from
>> >> the
>> >> > value ones that complicate the usage of Ignite with those ORM
>> >> > frameworks
>> >> > (especially if a key object comprises several fields). More on this
>> can
>> >> be
>> >> > found here:
>> >> >
>> >> >
>> >>
>> http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-Key-and-Value-fields-with-same-name-and-SQL-DML-td47557.html
>> >> >
>> >> > It will be nice if the new schema-first approach 

Re: IEP-54: Schema-first approach for 3.0

2020-09-06 Thread Valentin Kulichenko
Hi Ivan,

I don't see an issue with that. Schema-first means that a schema exists in
advance and all the stored data is compliant with it - that's exactly what
is proposed. There are no restrictions prohibiting changes to the schema.

-Val

On Sat, Sep 5, 2020 at 9:52 PM Ivan Pavlukhin wrote:

> Alexey,
>
> I am a little bit confused with terminology. My understanding conforms
> to a survey [1] (see part X Semi Structured Data). Can we really treat
> a "dynamic schema" approach as a kind of "schema-first"?
>
> [1]
> https://people.cs.umass.edu/~yanlei/courses/CS691LL-f06/papers/SH05.pdf
>
> 2020-09-02 1:53 GMT+03:00, Denis Magda :
> >>
> >> However, could you please elaborate on the relation between Ignite and
> >> ORM?
> >> Is there a use case for Hibernate running on top of Ignite (I haven't
> >> seen
> >> one so far)? If so, what is missing exactly on the Ignite side to
> support
> >> this? In my understanding, all you need is SQL API which we already
> have.
> >> Am I missing something?
> >
> >
> > Good point, yes, if all the ORM integrations use Ignite SQL APIs
> > internally, then they can easily translate an Entity object into an
> > INSERT/UPDATE statement that lists all the object's fields. Luckily, our
> > Spring Data integration is already based on the Ignite SQL APIs and needs
> > to be improved once the schema-first approach is supported. That would
> > solve a ton of usability issues.
> >
> > I would revise the Hibernate integration as well during the Ignite 3.0
> dev
> > phase. Can't say if it's used a lot but Spring Data is getting traction
> for
> > sure.
> >
> > @Michael Pollind, I'll loop you in since you've started working on the
> > Ignite support for Micronaut Data and came across some challenges. Just
> > watch this discussion. That's what is coming in Ignite 3.0.
> >
> >
> > -
> > Denis
> >
> >
> > On Mon, Aug 31, 2020 at 5:11 PM Valentin Kulichenko <
> > valentin.kuliche...@gmail.com> wrote:
> >
> >> Hi Denis,
> >>
> >> Generally speaking, I believe that the schema-first approach natively
> >> addresses the issue of duplicate fields in key and value objects,
> because
> >> schema will be created for a cache, not for an object, as it happens
> now.
> >> Basically, the schema will define whether there is a primary key or not,
> >> and which fields are included in case there is one. Any API that we
> would
> >> have must be compliant with this, so it becomes fairly easy to work with
> >> data as with a set of records, rather than key-value pairs.
> >>
> >> However, could you please elaborate on the relation between Ignite and
> >> ORM?
> >> Is there a use case for Hibernate running on top of Ignite (I haven't
> >> seen
> >> one so far)? If so, what is missing exactly on the Ignite side to
> support
> >> this? In my understanding, all you need is SQL API which we already
> have.
> >> Am I missing something?
> >>
> >> -Val
> >>
> >> On Mon, Aug 31, 2020 at 2:08 PM Denis Magda  wrote:
> >>
> >> > Val,
> >> >
> >> > I would propose adding another point to the motivations list which is
> >> > related to the ORM frameworks such as Spring Data, Hibernate,
> Micronaut
> >> and
> >> > many others.
> >> >
> >> > Presently, the storage engine requires to distinguish key objects from
> >> the
> >> > value ones that complicate the usage of Ignite with those ORM
> >> > frameworks
> >> > (especially if a key object comprises several fields). More on this
> can
> >> be
> >> > found here:
> >> >
> >> >
> >>
> http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-Key-and-Value-fields-with-same-name-and-SQL-DML-td47557.html
> >> >
> >> > It will be nice if the new schema-first approach allows us to work
> with
> >> > a
> >> > single entity object when it comes to the ORMs. With no need to split
> >> > the
> >> > entity into a key and value. Just want to be sure that the Ignite 3.0
> >> > has
> >> > all the essential public APIs that would support the single-entity
> >> > based
> >> > approach.
> >> >
> >> > What do you think?
> >> >
> >> > -
> >> > Denis
> >> >
> >> >
> >> > On Fri, Aug 28, 2020 at 3:50 PM Valentin Kulichenko <
> >> > valentin.kuliche...@gmail.com> wrote:
> >> >
> >> > > Igniters,
> >> > >
> >> > > One of the big changes proposed for Ignite 3.0 is the so-called
> >> > > "schema-first approach". To add more clarity, I've started writing
> >> > > the
> >> > IEP
> >> > > for this change:
> >> > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
> >> > >
> >> > > Please take a look and let me know if there are any immediate
> >> > > thoughts,
> >> > > suggestions, or objections.
> >> > >
> >> > > -Val
> >> > >
> >> >
> >>
> >
>
>
> --
>
> Best regards,
> Ivan Pavlukhin
>


Re: IEP-54: Schema-first approach for 3.0

2020-09-05 Thread Ivan Pavlukhin
Alexey,

I am a little bit confused by the terminology. My understanding conforms
to a survey [1] (see part X Semi Structured Data). Can we really treat
a "dynamic schema" approach as a kind of "schema-first"?

[1] https://people.cs.umass.edu/~yanlei/courses/CS691LL-f06/papers/SH05.pdf

2020-09-02 1:53 GMT+03:00, Denis Magda :
>>
>> However, could you please elaborate on the relation between Ignite and
>> ORM?
>> Is there a use case for Hibernate running on top of Ignite (I haven't
>> seen
>> one so far)? If so, what is missing exactly on the Ignite side to support
>> this? In my understanding, all you need is SQL API which we already have.
>> Am I missing something?
>
>
> Good point, yes, if all the ORM integrations use Ignite SQL APIs
> internally, then they can easily translate an Entity object into an
> INSERT/UPDATE statement that lists all the object's fields. Luckily, our
> Spring Data integration is already based on the Ignite SQL APIs and needs
> to be improved once the schema-first approach is supported. That would
> solve a ton of usability issues.
>
> I would revise the Hibernate integration as well during the Ignite 3.0 dev
> phase. Can't say if it's used a lot but Spring Data is getting traction for
> sure.
>
> @Michael Pollind, I'll loop you in since you've started working on the
> Ignite support for Micronaut Data and came across some challenges. Just
> watch this discussion. That's what is coming in Ignite 3.0.
>
>
> -
> Denis
>
>
> On Mon, Aug 31, 2020 at 5:11 PM Valentin Kulichenko <
> valentin.kuliche...@gmail.com> wrote:
>
>> Hi Denis,
>>
>> Generally speaking, I believe that the schema-first approach natively
>> addresses the issue of duplicate fields in key and value objects, because
>> schema will be created for a cache, not for an object, as it happens now.
>> Basically, the schema will define whether there is a primary key or not,
>> and which fields are included in case there is one. Any API that we would
>> have must be compliant with this, so it becomes fairly easy to work with
>> data as with a set of records, rather than key-value pairs.
>>
>> However, could you please elaborate on the relation between Ignite and
>> ORM?
>> Is there a use case for Hibernate running on top of Ignite (I haven't
>> seen
>> one so far)? If so, what is missing exactly on the Ignite side to support
>> this? In my understanding, all you need is SQL API which we already have.
>> Am I missing something?
>>
>> -Val
>>
>> On Mon, Aug 31, 2020 at 2:08 PM Denis Magda  wrote:
>>
>> > Val,
>> >
>> > I would propose adding another point to the motivations list which is
>> > related to the ORM frameworks such as Spring Data, Hibernate, Micronaut
>> and
>> > many others.
>> >
>> > Presently, the storage engine requires to distinguish key objects from
>> the
>> > value ones that complicate the usage of Ignite with those ORM
>> > frameworks
>> > (especially if a key object comprises several fields). More on this can
>> be
>> > found here:
>> >
>> >
>> http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-Key-and-Value-fields-with-same-name-and-SQL-DML-td47557.html
>> >
>> > It will be nice if the new schema-first approach allows us to work with
>> > a
>> > single entity object when it comes to the ORMs. With no need to split
>> > the
>> > entity into a key and value. Just want to be sure that the Ignite 3.0
>> > has
>> > all the essential public APIs that would support the single-entity
>> > based
>> > approach.
>> >
>> > What do you think?
>> >
>> > -
>> > Denis
>> >
>> >
>> > On Fri, Aug 28, 2020 at 3:50 PM Valentin Kulichenko <
>> > valentin.kuliche...@gmail.com> wrote:
>> >
>> > > Igniters,
>> > >
>> > > One of the big changes proposed for Ignite 3.0 is the so-called
>> > > "schema-first approach". To add more clarity, I've started writing
>> > > the
>> > IEP
>> > > for this change:
>> > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
>> > >
>> > > Please take a look and let me know if there are any immediate
>> > > thoughts,
>> > > suggestions, or objections.
>> > >
>> > > -Val
>> > >
>> >
>>
>


-- 

Best regards,
Ivan Pavlukhin


Re: IEP-54: Schema-first approach for 3.0

2020-09-01 Thread Denis Magda
>
> However, could you please elaborate on the relation between Ignite and ORM?
> Is there a use case for Hibernate running on top of Ignite (I haven't seen
> one so far)? If so, what is missing exactly on the Ignite side to support
> this? In my understanding, all you need is SQL API which we already have.
> Am I missing something?


Good point, yes, if all the ORM integrations use Ignite SQL APIs
internally, then they can easily translate an Entity object into an
INSERT/UPDATE statement that lists all the object's fields. Luckily, our
Spring Data integration is already based on the Ignite SQL APIs and needs
to be improved once the schema-first approach is supported. That would
solve a ton of usability issues.
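
To make that translation concrete, here is a minimal sketch in Python (not
Ignite or Spring Data code; the `Person` entity, the `insert_statement`
helper, and the table name are hypothetical) of how an integration might
turn a single entity object into a parameterized INSERT that lists all of
the object's fields:

```python
from dataclasses import dataclass, fields, astuple


@dataclass
class Person:
    # Hypothetical entity: key and value fields live in one flat object.
    id: int
    name: str
    city: str


def insert_statement(entity, table):
    """Build a parameterized INSERT that lists every field of the entity."""
    columns = [f.name for f in fields(entity)]
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    # The values are passed separately as statement arguments.
    return sql, astuple(entity)


sql, args = insert_statement(Person(1, "Ann", "Oslo"), "Person")
print(sql)
print(args)
```

Note that the entity is never split into a key part and a value part here;
the statement only needs the flat list of columns.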

I would revise the Hibernate integration as well during the Ignite 3.0 dev
phase. Can't say if it's used a lot but Spring Data is getting traction for
sure.

@Michael Pollind, I'll loop you in since you've started working on the
Ignite support for Micronaut Data and came across some challenges. Just
watch this discussion. That's what is coming in Ignite 3.0.


-
Denis


On Mon, Aug 31, 2020 at 5:11 PM Valentin Kulichenko <
valentin.kuliche...@gmail.com> wrote:

> Hi Denis,
>
> Generally speaking, I believe that the schema-first approach natively
> addresses the issue of duplicate fields in key and value objects, because
> schema will be created for a cache, not for an object, as it happens now.
> Basically, the schema will define whether there is a primary key or not,
> and which fields are included in case there is one. Any API that we would
> have must be compliant with this, so it becomes fairly easy to work with
> data as with a set of records, rather than key-value pairs.
>
> However, could you please elaborate on the relation between Ignite and ORM?
> Is there a use case for Hibernate running on top of Ignite (I haven't seen
> one so far)? If so, what is missing exactly on the Ignite side to support
> this? In my understanding, all you need is SQL API which we already have.
> Am I missing something?
>
> -Val
>
> On Mon, Aug 31, 2020 at 2:08 PM Denis Magda  wrote:
>
> > Val,
> >
> > I would propose adding another point to the motivations list which is
> > related to the ORM frameworks such as Spring Data, Hibernate, Micronaut
> and
> > many others.
> >
> > Presently, the storage engine requires to distinguish key objects from
> the
> > value ones that complicate the usage of Ignite with those ORM frameworks
> > (especially if a key object comprises several fields). More on this can
> be
> > found here:
> >
> >
> http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-Key-and-Value-fields-with-same-name-and-SQL-DML-td47557.html
> >
> > It will be nice if the new schema-first approach allows us to work with a
> > single entity object when it comes to the ORMs. With no need to split the
> > entity into a key and value. Just want to be sure that the Ignite 3.0 has
> > all the essential public APIs that would support the single-entity based
> > approach.
> >
> > What do you think?
> >
> > -
> > Denis
> >
> >
> > On Fri, Aug 28, 2020 at 3:50 PM Valentin Kulichenko <
> > valentin.kuliche...@gmail.com> wrote:
> >
> > > Igniters,
> > >
> > > One of the big changes proposed for Ignite 3.0 is the so-called
> > > "schema-first approach". To add more clarity, I've started writing the
> > IEP
> > > for this change:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
> > >
> > > Please take a look and let me know if there are any immediate thoughts,
> > > suggestions, or objections.
> > >
> > > -Val
> > >
> >
>


Re: IEP-54: Schema-first approach for 3.0

2020-09-01 Thread Valentin Kulichenko
Alexey,

Thanks for adding more details to the IEP. I have a question regarding the
following:

*When an object is inserted into a table, we attempt to 'fit' object fields
to the schema columns. If a Java object has some extra fields which are not
present in the current schema, the schema is automatically updated to store
additional extra fields that are present in the object.*

Do you see this happening automatically? If so, it probably should be
optional, and even disabled by default. What do you think?
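
As an illustration only, the opt-in behavior could look roughly like the
Python sketch below. Nothing here comes from the IEP: `fit_to_schema`, the
`auto_extend` flag, and `SchemaMismatchError` are hypothetical names, and a
schema is reduced to an ordered list of column names.

```python
class SchemaMismatchError(Exception):
    """Raised when an object has fields the schema does not know about."""


def fit_to_schema(schema_columns, obj_fields, auto_extend=False):
    """Fit an object's fields to the schema columns.

    schema_columns: ordered list of column names in the current schema.
    obj_fields: dict of field name -> value taken from the inserted object.
    auto_extend: hypothetical opt-in flag, disabled by default.
    Returns (columns, row), where columns is the schema used for the row.
    """
    extra = [name for name in obj_fields if name not in schema_columns]
    if extra and not auto_extend:
        raise SchemaMismatchError(f"fields not in schema: {extra}")
    # A new list stands in for a new schema version; the input is not mutated.
    columns = list(schema_columns) + extra
    # Columns that the object does not provide are stored as None (NULL).
    row = [obj_fields.get(name) for name in columns]
    return columns, row


schema = ["id", "name"]
person = {"id": 1, "name": "Ann", "city": "Oslo"}
columns, row = fit_to_schema(schema, person, auto_extend=True)
```

With the default `auto_extend=False`, the same call raises
`SchemaMismatchError` instead of silently changing the schema.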

-Val

On Tue, Sep 1, 2020 at 10:44 AM Alexey Goncharuk 
wrote:

> Ivan,
>
> Thank you for reminding me about the dynamic schema. I've updated the IEP
> draft with more details on the approach, hopefully now it's more clear. I
> think we will be able to take the best from both fixed-schema and
> schemaless approaches.
>
> вт, 1 сент. 2020 г. в 14:31, Ivan Pavlukhin :
>
> > Hi Val,
> >
> > Thank you for raising a discussion about this significant proposal!
> > The subject looks very significant and can greatly affect product
> > spirit and user experience.
> >
> > While I generally think that schema-first is a good idea, I would love
> > to see a thorough comparison of the approaches. As we know, different
> > databases treat data schema differently, and each way has benefits and
> > drawbacks. In addition to the schemaless and schema-first approaches, I
> > remember talks about "dynamic schema". I believe that we should
> > describe clearly why we prefer one approach over the others.
> >
> > 2020-09-01 3:11 GMT+03:00, Valentin Kulichenko <
> > valentin.kuliche...@gmail.com>:
> > > Hi Denis,
> > >
> > > Generally speaking, I believe that the schema-first approach natively
> > > addresses the issue of duplicate fields in key and value objects,
> because
> > > schema will be created for a cache, not for an object, as it happens
> now.
> > > Basically, the schema will define whether there is a primary key or
> not,
> > > and which fields are included in case there is one. Any API that we
> would
> > > have must be compliant with this, so it becomes fairly easy to work
> with
> > > data as with a set of records, rather than key-value pairs.
> > >
> > > However, could you please elaborate on the relation between Ignite and
> > ORM?
> > > Is there a use case for Hibernate running on top of Ignite (I haven't
> > seen
> > > one so far)? If so, what is missing exactly on the Ignite side to
> support
> > > this? In my understanding, all you need is SQL API which we already
> have.
> > > Am I missing something?
> > >
> > > -Val
> > >
> > > On Mon, Aug 31, 2020 at 2:08 PM Denis Magda  wrote:
> > >
> > >> Val,
> > >>
> > >> I would propose adding another point to the motivations list which is
> > >> related to the ORM frameworks such as Spring Data, Hibernate,
> Micronaut
> > >> and
> > >> many others.
> > >>
> > >> Presently, the storage engine requires to distinguish key objects from
> > >> the
> > >> value ones that complicate the usage of Ignite with those ORM
> frameworks
> > >> (especially if a key object comprises several fields). More on this
> can
> > >> be
> > >> found here:
> > >>
> > >>
> >
> http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-Key-and-Value-fields-with-same-name-and-SQL-DML-td47557.html
> > >>
> > >> It will be nice if the new schema-first approach allows us to work
> with
> > a
> > >> single entity object when it comes to the ORMs. With no need to split
> > the
> > >> entity into a key and value. Just want to be sure that the Ignite 3.0
> > has
> > >> all the essential public APIs that would support the single-entity
> based
> > >> approach.
> > >>
> > >> What do you think?
> > >>
> > >> -
> > >> Denis
> > >>
> > >>
> > >> On Fri, Aug 28, 2020 at 3:50 PM Valentin Kulichenko <
> > >> valentin.kuliche...@gmail.com> wrote:
> > >>
> > >> > Igniters,
> > >> >
> > >> > One of the big changes proposed for Ignite 3.0 is the so-called
> > >> > "schema-first approach". To add more clarity, I've started writing
> the
> > >> IEP
> > >> > for this change:
> > >> >
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
> > >> >
> > >> > Please take a look and let me know if there are any immediate
> > thoughts,
> > >> > suggestions, or objections.
> > >> >
> > >> > -Val
> > >> >
> > >>
> > >
> >
> >
> > --
> >
> > Best regards,
> > Ivan Pavlukhin
> >
>


Re: IEP-54: Schema-first approach for 3.0

2020-09-01 Thread Alexey Goncharuk
Ivan,

Thank you for reminding me about the dynamic schema. I've updated the IEP
draft with more details on the approach, hopefully now it's more clear. I
think we will be able to take the best from both fixed-schema and
schemaless approaches.

вт, 1 сент. 2020 г. в 14:31, Ivan Pavlukhin :

> Hi Val,
>
> Thank you for raising a discussion about this significant proposal!
> The subject looks very significant and can greatly affect product
> spirit and user experience.
>
> While I generally think that schema-first is a good idea, I would love
> to see a thorough comparison of the approaches. As we know, different
> databases treat data schema differently, and each way has benefits and
> drawbacks. In addition to the schemaless and schema-first approaches, I
> remember talks about "dynamic schema". I believe that we should
> describe clearly why we prefer one approach over the others.
>
> 2020-09-01 3:11 GMT+03:00, Valentin Kulichenko <
> valentin.kuliche...@gmail.com>:
> > Hi Denis,
> >
> > Generally speaking, I believe that the schema-first approach natively
> > addresses the issue of duplicate fields in key and value objects, because
> > schema will be created for a cache, not for an object, as it happens now.
> > Basically, the schema will define whether there is a primary key or not,
> > and which fields are included in case there is one. Any API that we would
> > have must be compliant with this, so it becomes fairly easy to work with
> > data as with a set of records, rather than key-value pairs.
> >
> > However, could you please elaborate on the relation between Ignite and
> ORM?
> > Is there a use case for Hibernate running on top of Ignite (I haven't
> seen
> > one so far)? If so, what is missing exactly on the Ignite side to support
> > this? In my understanding, all you need is SQL API which we already have.
> > Am I missing something?
> >
> > -Val
> >
> > On Mon, Aug 31, 2020 at 2:08 PM Denis Magda  wrote:
> >
> >> Val,
> >>
> >> I would propose adding another point to the motivations list which is
> >> related to the ORM frameworks such as Spring Data, Hibernate, Micronaut
> >> and
> >> many others.
> >>
> >> Presently, the storage engine requires to distinguish key objects from
> >> the
> >> value ones that complicate the usage of Ignite with those ORM frameworks
> >> (especially if a key object comprises several fields). More on this can
> >> be
> >> found here:
> >>
> >>
> http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-Key-and-Value-fields-with-same-name-and-SQL-DML-td47557.html
> >>
> >> It will be nice if the new schema-first approach allows us to work with
> a
> >> single entity object when it comes to the ORMs. With no need to split
> the
> >> entity into a key and value. Just want to be sure that the Ignite 3.0
> has
> >> all the essential public APIs that would support the single-entity based
> >> approach.
> >>
> >> What do you think?
> >>
> >> -
> >> Denis
> >>
> >>
> >> On Fri, Aug 28, 2020 at 3:50 PM Valentin Kulichenko <
> >> valentin.kuliche...@gmail.com> wrote:
> >>
> >> > Igniters,
> >> >
> >> > One of the big changes proposed for Ignite 3.0 is the so-called
> >> > "schema-first approach". To add more clarity, I've started writing the
> >> IEP
> >> > for this change:
> >> >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
> >> >
> >> > Please take a look and let me know if there are any immediate
> thoughts,
> >> > suggestions, or objections.
> >> >
> >> > -Val
> >> >
> >>
> >
>
>
> --
>
> Best regards,
> Ivan Pavlukhin
>


Re: IEP-54: Schema-first approach for 3.0

2020-09-01 Thread Ivan Pavlukhin
Hi Val,

Thank you for raising a discussion about this significant proposal!
The subject looks very significant and can greatly affect product
spirit and user experience.

While I generally think that schema-first is a good idea, I would love
to see a thorough comparison of the approaches. As we know, different
databases treat data schema differently, and each way has benefits and
drawbacks. In addition to the schemaless and schema-first approaches, I
remember talks about "dynamic schema". I believe that we should
describe clearly why we prefer one approach over the others.

2020-09-01 3:11 GMT+03:00, Valentin Kulichenko :
> Hi Denis,
>
> Generally speaking, I believe that the schema-first approach natively
> addresses the issue of duplicate fields in key and value objects, because
> schema will be created for a cache, not for an object, as it happens now.
> Basically, the schema will define whether there is a primary key or not,
> and which fields are included in case there is one. Any API that we would
> have must be compliant with this, so it becomes fairly easy to work with
> data as with a set of records, rather than key-value pairs.
>
> However, could you please elaborate on the relation between Ignite and ORM?
> Is there a use case for Hibernate running on top of Ignite (I haven't seen
> one so far)? If so, what is missing exactly on the Ignite side to support
> this? In my understanding, all you need is SQL API which we already have.
> Am I missing something?
>
> -Val
>
> On Mon, Aug 31, 2020 at 2:08 PM Denis Magda  wrote:
>
>> Val,
>>
>> I would propose adding another point to the motivations list which is
>> related to the ORM frameworks such as Spring Data, Hibernate, Micronaut
>> and
>> many others.
>>
>> Presently, the storage engine requires to distinguish key objects from
>> the
>> value ones that complicate the usage of Ignite with those ORM frameworks
>> (especially if a key object comprises several fields). More on this can
>> be
>> found here:
>>
>> http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-Key-and-Value-fields-with-same-name-and-SQL-DML-td47557.html
>>
>> It will be nice if the new schema-first approach allows us to work with a
>> single entity object when it comes to the ORMs. With no need to split the
>> entity into a key and value. Just want to be sure that the Ignite 3.0 has
>> all the essential public APIs that would support the single-entity based
>> approach.
>>
>> What do you think?
>>
>> -
>> Denis
>>
>>
>> On Fri, Aug 28, 2020 at 3:50 PM Valentin Kulichenko <
>> valentin.kuliche...@gmail.com> wrote:
>>
>> > Igniters,
>> >
>> > One of the big changes proposed for Ignite 3.0 is the so-called
>> > "schema-first approach". To add more clarity, I've started writing the
>> IEP
>> > for this change:
>> >
>> >
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
>> >
>> > Please take a look and let me know if there are any immediate thoughts,
>> > suggestions, or objections.
>> >
>> > -Val
>> >
>>
>


-- 

Best regards,
Ivan Pavlukhin


Re: IEP-54: Schema-first approach for 3.0

2020-08-31 Thread Valentin Kulichenko
Hi Denis,

Generally speaking, I believe that the schema-first approach natively
addresses the issue of duplicate fields in key and value objects, because
schema will be created for a cache, not for an object, as it happens now.
Basically, the schema will define whether there is a primary key or not,
and which fields are included in case there is one. Any API that we would
have must be compliant with this, so it becomes fairly easy to work with
data as with a set of records, rather than key-value pairs.
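
A rough sketch of that idea, in Python and under stated assumptions
(`TableSchema`, `key_columns`, and `split` are hypothetical names, not a
proposed API): the schema belongs to the table, it says which columns form
the primary key, if any, and the caller only ever works with flat records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableSchema:
    # Hypothetical descriptor: all columns, plus the subset forming the key.
    columns: tuple
    key_columns: tuple = ()  # empty means the table has no primary key

    def split(self, record):
        """Split a flat record (dict) into (key, value) tuples for storage."""
        key = tuple(record[c] for c in self.key_columns)
        value = tuple(
            record[c] for c in self.columns if c not in self.key_columns
        )
        return key, value


schema = TableSchema(columns=("id", "name", "city"), key_columns=("id",))
key, value = schema.split({"id": 1, "name": "Ann", "city": "Oslo"})
```

Because each column name appears once in the table schema, a field cannot
be duplicated between the key and the value.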

However, could you please elaborate on the relation between Ignite and ORM?
Is there a use case for Hibernate running on top of Ignite (I haven't seen
one so far)? If so, what is missing exactly on the Ignite side to support
this? In my understanding, all you need is SQL API which we already have.
Am I missing something?

-Val

On Mon, Aug 31, 2020 at 2:08 PM Denis Magda  wrote:

> Val,
>
> I would propose adding another point to the motivations list which is
> related to the ORM frameworks such as Spring Data, Hibernate, Micronaut and
> many others.
>
> Presently, the storage engine requires to distinguish key objects from the
> value ones that complicate the usage of Ignite with those ORM frameworks
> (especially if a key object comprises several fields). More on this can be
> found here:
>
> http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-Key-and-Value-fields-with-same-name-and-SQL-DML-td47557.html
>
> It will be nice if the new schema-first approach allows us to work with a
> single entity object when it comes to the ORMs. With no need to split the
> entity into a key and value. Just want to be sure that the Ignite 3.0 has
> all the essential public APIs that would support the single-entity based
> approach.
>
> What do you think?
>
> -
> Denis
>
>
> On Fri, Aug 28, 2020 at 3:50 PM Valentin Kulichenko <
> valentin.kuliche...@gmail.com> wrote:
>
> > Igniters,
> >
> > One of the big changes proposed for Ignite 3.0 is the so-called
> > "schema-first approach". To add more clarity, I've started writing the
> IEP
> > for this change:
> >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
> >
> > Please take a look and let me know if there are any immediate thoughts,
> > suggestions, or objections.
> >
> > -Val
> >
>


Re: IEP-54: Schema-first approach for 3.0

2020-08-31 Thread Denis Magda
Val,

I would propose adding another point to the motivations list which is
related to the ORM frameworks such as Spring Data, Hibernate, Micronaut and
many others.

Presently, the storage engine requires key objects to be distinguished from
value objects, which complicates the usage of Ignite with those ORM frameworks
(especially if a key object comprises several fields). More on this can be
found here:
http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-Key-and-Value-fields-with-same-name-and-SQL-DML-td47557.html

It would be nice if the new schema-first approach allowed us to work with a
single entity object when it comes to the ORMs, with no need to split the
entity into a key and value. I just want to be sure that Ignite 3.0 has
all the essential public APIs that would support the single-entity based
approach.

What do you think?

-
Denis


On Fri, Aug 28, 2020 at 3:50 PM Valentin Kulichenko <
valentin.kuliche...@gmail.com> wrote:

> Igniters,
>
> One of the big changes proposed for Ignite 3.0 is the so-called
> "schema-first approach". To add more clarity, I've started writing the IEP
> for this change:
>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
>
> Please take a look and let me know if there are any immediate thoughts,
> suggestions, or objections.
>
> -Val
>


IEP-54: Schema-first approach for 3.0

2020-08-28 Thread Valentin Kulichenko
Igniters,

One of the big changes proposed for Ignite 3.0 is the so-called
"schema-first approach". To add more clarity, I've started writing the IEP
for this change:
https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach

Please take a look and let me know if there are any immediate thoughts,
suggestions, or objections.

-Val