Vladimir,
> When we finish varlen optimization for string lengths, I am afraid we could
> end up with very messy protocol, should we mix encoded length and encoding.
I agree, we shouldn't mix them.

> I deemed it's unusual to make two different type markers (flags) for
> single datatype. I can't see the source right now
Theoretically, you can combine GridBinaryMarshaller.STRING with BinaryWriteMode.
I agree with Vladimir: adding a new type is the clearest approach for me.
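
To make the "encoded string" layout that Vladimir describes further down
the thread more concrete, here is a rough Java sketch of it. It is only an
illustration, not marshaller code: the class name, the type marker value,
the encoding codes, and the way the optional encoding name is written
(4-byte length followed by UTF-8 bytes) are all my assumptions, and the
byte order is simply ByteBuffer's default.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; none of these names or values exist in Ignite.
final class EncodedStringSketch {
    static final byte ENCODED_STRING = 0x7F;  // assumed type marker
    static final byte CODE_UTF_8 = 0;         // assumed code for a frequent encoding
    static final byte CODE_ISO_8859_1 = 1;    // assumed code for a frequent encoding
    static final byte CODE_BY_NAME = -1;      // special value: encoding name follows

    static byte codeFor(Charset cs) {
        if (cs.equals(StandardCharsets.UTF_8)) return CODE_UTF_8;
        if (cs.equals(StandardCharsets.ISO_8859_1)) return CODE_ISO_8859_1;
        return CODE_BY_NAME;
    }

    static Charset charsetFor(byte code) {
        switch (code) {
            case CODE_UTF_8: return StandardCharsets.UTF_8;
            case CODE_ISO_8859_1: return StandardCharsets.ISO_8859_1;
            default: throw new IllegalArgumentException("Unknown encoding code: " + code);
        }
    }

    // Layout: type (1) | code (1) | [name length (4) + name] | string length (4) | bytes
    static byte[] write(String s, Charset cs) {
        byte code = codeFor(cs);
        byte[] name = code == CODE_BY_NAME
            ? cs.name().getBytes(StandardCharsets.UTF_8) : new byte[0];
        byte[] payload = s.getBytes(cs);
        int nameSize = code == CODE_BY_NAME ? 4 + name.length : 0;

        ByteBuffer out = ByteBuffer.allocate(2 + nameSize + 4 + payload.length);
        out.put(ENCODED_STRING);
        out.put(code);
        if (code == CODE_BY_NAME) {
            out.putInt(name.length);
            out.put(name);
        }
        out.putInt(payload.length);
        out.put(payload);
        return out.array();
    }

    static String read(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        if (in.get() != ENCODED_STRING)
            throw new IllegalArgumentException("Not an encoded string");
        byte code = in.get();
        Charset cs;
        if (code == CODE_BY_NAME) {
            byte[] name = new byte[in.getInt()];
            in.get(name);
            cs = Charset.forName(new String(name, StandardCharsets.UTF_8));
        } else {
            cs = charsetFor(code);
        }
        byte[] payload = new byte[in.getInt()];
        in.get(payload);
        return new String(payload, cs);
    }
}
```

In this sketch a string in a frequently used encoding costs one extra byte
compared to a plain length-prefixed string, while any other encoding pays
for its name as well.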


> Encoding must be set on per field basis. This will give us as most flexible
> solution at the cost of 1-byte overhead.

> Vova, I agree that the encoding should be set on per-field basis, but at
> the table level, not at a cell level.

Dmitriy, Vladimir,
Let's use both approaches :-)
We can add a parameter to CacheConfiguration.
If the parameter specifies cache-level encoding, the marshaller will use
the encoding configured for the cache;
otherwise the marshaller will use per-field encoding.
Of course, only if it doesn't complicate the solution.
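
Roughly, the selection logic I have in mind could look like the Java sketch
below. Everything in it is hypothetical: the class, its constructor, and
the UTF-8 default for unmapped fields are my assumptions for illustration,
and no such parameter exists in CacheConfiguration today.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not part of any existing Ignite API.
final class EncodingResolverSketch {
    // Non-null means "use cache-level encoding"; null means per-field mode.
    private final Charset cacheEncoding;
    private final Map<String, Charset> fieldEncodings = new HashMap<>();

    EncodingResolverSketch(Charset cacheEncoding) {
        this.cacheEncoding = cacheEncoding;
    }

    EncodingResolverSketch withField(String field, Charset cs) {
        fieldEncodings.put(field, cs);
        return this;
    }

    // Cache-level encoding wins when set; otherwise look up the field,
    // falling back to UTF-8 (assumed default) for unmapped fields.
    Charset resolve(String field) {
        if (cacheEncoding != null)
            return cacheEncoding;
        return fieldEncodings.getOrDefault(field, StandardCharsets.UTF_8);
    }
}
```

With a cache-level encoding set, every field resolves to it; with null
passed to the constructor, each field resolves to its own mapping.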


2017-07-25 20:44 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:

> On Tue, Jul 25, 2017 at 12:36 PM, Vladimir Ozerov <voze...@gridgain.com>
> wrote:
>
> > Vyacheslav,
> > When we finish varlen optimization for string lengths, I am afraid we could
> > end up with very messy protocol, should we mix encoded length and encoding.
> >
> > Dima,
> > Encoding must be set on per field basis. This will give us as most flexible
> > solution at the cost of 1-byte overhead.
> >
>
> Vova, I agree that the encoding should be set on per-field basis, but at
> the table level, not at a cell level. I cannot foresee a situation where we
> would have different encodings in the same column. If that ever happens,
> then user can provide already encoded values.
>
>
> >
> > Tue, Jul 25, 2017 at 20:23, Dmitriy Setrakyan <dsetrak...@apache.org>:
> >
> > > I don't understand why this encoding is done on per-object and not on
> > > per-cache level. Shouldn't the column-to-encoding mapping be defined at
> > > cache level configuration?
> > >
> > > On Tue, Jul 25, 2017 at 12:13 PM, Vladimir Ozerov <voze...@gridgain.com>
> > > wrote:
> > >
> > > > Andrey,
> > > >
> > > > You cannot have optional part in the middle as it will break compatibility
> > > > in dangerous way, probably leading to node crash. Also having INT (4 bytes)
> > > > looks too much for me.
> > > >
> > > > Instead, I would add new type "encoded string":
> > > > 1 byte - type
> > > > 1 byte - encoding code, map frequently used encodings to some byte value;
> > > > also have a special value, meaning that encoding will be written as string
> > > > afterwards, this way we will support any encoding out of the box
> > > > [optional] encoding name
> > > > 4 bytes - string length
> > > > Finally - string bytes
> > > >
> > > > Vladimir.
> > > >
> > > > Tue, Jul 25, 2017 at 18:24, Andrey Kuznetsov <stku...@gmail.com>:
> > > >
> > > > > I apologize for damaged formatting. Below is my message as it should be.
> > > > >
> > > > >
> > > > > Hi Igniters,
> > > > >
> > > > > > I'd like to discuss future changes related to
> > > > > > https://issues.apache.org/jira/browse/IGNITE-5655.
> > > > >
> > > > > > Is it really good idea to introduce new flag (ENCODED_STRING) for existing
> > > > > > String datatype? It's possible to use existing STRING flag at negligible
> > > > > > performance cost.
> > > > >
> > > > > Currently, utf-8-encoded string looks like
> > > > >
> > > > > byteFlag nonNegativeIntStrLen bytes
> > > > >
> > > > > This format can be backward compatibly extended to
> > > > >
> > > > > byteFlag [negativeIntCharsetCode] nonNegativeIntStrLen bytes
> > > > >
> > > > > > Next, I suggest to add new BinaryConfiguration property for encoding to use
> > > > > > instead of using global property. It seems to be more convenient for user.
> > > > >
> > > > > I'll appreciate your feedback.
> > > > >
> > > > > 2017-07-25 16:13 GMT+03:00 Andrey Kuznetsov <stku...@gmail.com>:
> > > > >
> > > > > > Hi Igniters,I'd like to discuss future changes related to
> > > IGNITE-5655
> > > > > > <https://issues.apache.org/jira/browse/IGNITE-5655>  . Is it
> > really
> > > > good
> > > > > > idea to introduce new flag (ENCODED_STRING) for existing String
> > > > datatype?
> > > > > > It's possible to use existing STRING flag at negligible
> performance
> > > > cost.
> > > > > > Currently, utf-8-encoded string looks like
> > > > > > byteFlag nonNegativeIntStrLen bytes
> > > > > > This format can be backward compatibly extended to
> > > > > > byteFlag [negativeIntCharsetCode] nonNegativeIntStrLen bytes
> > > > > > Next, I suggest to add new BinaryConfiguration property for
> > encoding
> > > to
> > > > > use
> > > > > > instead of using global property. It seems to be more convenient
> > for
> > > > > > user.I'll appreciate your feedback.
> > > > > >
> > > > > >
> > > > > >
> > > > > > -----
> > > > > > Best regards,
> > > > > >   Andrey Kuznetsov.
> > > > > > --
> > > > > > View this message in context: http://apache-ignite-
> > > > > > developers.2346864.n4.nabble.com/Non-UTF-8-string-encoding-
> > > > > > support-in-BinaryMarshaller-IGNITE-5655-tp20024.html
> > > > > > Sent from the Apache Ignite Developers mailing list archive at
> > > > > Nabble.com.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best regards,
> > > > >   Andrey Kuznetsov.
> > > > >
> > > >
> > >
> >
>



-- 
Best Regards, Vyacheslav D.
