Vladimir,

It's rather simple to support string encoding by setting it in
BinaryConfiguration, but I'm unsure whether that is the desired change. We
need to state our goal more precisely: should encoding be controlled at the
cache level, the field level, or the binary configuration level? Currently,
BinaryMarshaller is controlled only by BinaryConfiguration, and it's hard
for me to estimate the changes needed to bring string encoding to, say, a
per-cache basis.

2017-07-25 20:17 GMT+03:00 Vladimir Ozerov [via Apache Ignite Developers] <
ml+s2346864n20046...@n4.nabble.com>:

> Vyacheslav,
> When we finish the varlen optimization for string lengths, I am afraid we
> could end up with a very messy protocol, should we mix encoded length and
> encoding.
>
> Dima,
> Encoding must be set on a per-field basis. This will give us the most
> flexible solution at the cost of 1-byte overhead.
>
> Tue, 25 Jul 2017 at 20:23, Dmitriy Setrakyan <[hidden email]>:
>
> > I don't understand why this encoding is done at the per-object and not at
> > the per-cache level. Shouldn't the column-to-encoding mapping be defined in
> > the cache-level configuration?
> >
> > On Tue, Jul 25, 2017 at 12:13 PM, Vladimir Ozerov <[hidden email]> wrote:
> >
> > > Andrey,
> > >
> > > You cannot have an optional part in the middle, as it will break
> > > compatibility in a dangerous way, probably leading to a node crash. Also,
> > > having INT (4 bytes) looks like too much to me.
> > >
> > > Instead, I would add a new type, "encoded string":
> > > 1 byte - type
> > > 1 byte - encoding code; map frequently used encodings to some byte
> > >   value, and also have a special value meaning that the encoding will be
> > >   written as a string afterwards; this way we will support any encoding
> > >   out of the box
> > > [optional] encoding name
> > > 4 bytes - string length
> > > Finally - string bytes
> > >
> > > Vladimir.
> > >
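To make the layout above concrete, here is a minimal sketch in Java. It is not Ignite code: the class name, the type code 0x7E, the encoding codes, the by-name marker 0xFF, the length-prefixed US-ASCII form of the encoding name, and the big-endian byte order (ByteBuffer's default) are all assumptions for illustration; the thread fixes none of them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the "encoded string" layout; not Ignite code. */
final class EncodedStringSketch {
    /** Assumed type code for the new type (the thread gives no value). */
    static final byte TYPE_ENCODED_STRING = 0x7E;

    /** Assumed one-byte codes for frequently used encodings. */
    static final byte ENC_UTF_8 = 0;
    static final byte ENC_ISO_8859_1 = 1;

    /** Assumed special code: the encoding name follows as a string. */
    static final byte ENC_BY_NAME = (byte) 0xFF;

    /** Writes: type, encoding code, [name length + name], string length, bytes. */
    static byte[] write(String s, Charset cs) {
        byte code = codeOf(cs);
        boolean byName = code == ENC_BY_NAME;
        byte[] name = byName ? cs.name().getBytes(StandardCharsets.US_ASCII) : new byte[0];
        byte[] data = s.getBytes(cs);

        ByteBuffer buf = ByteBuffer.allocate(2 + (byName ? 4 + name.length : 0) + 4 + data.length);
        buf.put(TYPE_ENCODED_STRING);
        buf.put(code);
        if (byName) {
            // Optional part: present only when the special code is used.
            buf.putInt(name.length);
            buf.put(name);
        }
        buf.putInt(data.length);
        buf.put(data);
        return buf.array();
    }

    /** Reads a value produced by write(). */
    static String read(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        if (buf.get() != TYPE_ENCODED_STRING)
            throw new IllegalArgumentException("Not an encoded string");

        byte code = buf.get();
        Charset cs;
        if (code == ENC_BY_NAME) {
            byte[] name = new byte[buf.getInt()];
            buf.get(name);
            cs = Charset.forName(new String(name, StandardCharsets.US_ASCII));
        } else {
            cs = charsetOf(code);
        }

        byte[] data = new byte[buf.getInt()];
        buf.get(data);
        return new String(data, cs);
    }

    /** Maps a charset to its assumed code, or to ENC_BY_NAME if it has none. */
    static byte codeOf(Charset cs) {
        if (cs.equals(StandardCharsets.UTF_8)) return ENC_UTF_8;
        if (cs.equals(StandardCharsets.ISO_8859_1)) return ENC_ISO_8859_1;
        return ENC_BY_NAME;
    }

    /** Maps an assumed code back to its charset. */
    static Charset charsetOf(byte code) {
        switch (code) {
            case ENC_UTF_8: return StandardCharsets.UTF_8;
            case ENC_ISO_8859_1: return StandardCharsets.ISO_8859_1;
            default: throw new IllegalArgumentException("Unknown encoding code " + code);
        }
    }
}
```

With these assumed codes, a string in a frequently used encoding costs one extra byte (the encoding code), while a charset without a dedicated code also pays for its name.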
> > > Tue, 25 Jul 2017 at 18:24, Andrey Kuznetsov <[hidden email]>:
> > >
> > > > I apologize for the damaged formatting. Below is my message as it
> > > > should be.
> > > >
> > > >
> > > > Hi Igniters,
> > > >
> > > > I'd like to discuss future changes related to
> > > > https://issues.apache.org/jira/browse/IGNITE-5655.
> > > >
> > > > Is it really a good idea to introduce a new flag (ENCODED_STRING) for
> > > > the existing String datatype? It's possible to use the existing STRING
> > > > flag at negligible performance cost.
> > > >
> > > > Currently, a UTF-8-encoded string looks like
> > > >
> > > > byteFlag nonNegativeIntStrLen bytes
> > > >
> > > > This format can be extended in a backward-compatible way to
> > > >
> > > > byteFlag [negativeIntCharsetCode] nonNegativeIntStrLen bytes
> > > >
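The extended format above could be read and written roughly as follows. This is only a sketch: the class name, the flag value 9, the negative charset codes, and the big-endian byte order (ByteBuffer's default) are assumptions for illustration, not values taken from BinaryMarshaller.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the optional negative charset code; not Ignite code. */
final class OptionalCharsetCodeSketch {
    /** Assumed value of the existing STRING flag. */
    static final byte STRING_FLAG = 9;

    /** Assumed charset codes; they must be negative so they cannot be lengths. */
    static final int CODE_ISO_8859_1 = -1;
    static final int CODE_UTF_16LE = -2;

    /** Writes: flag, [negative charset code], non-negative length, bytes. */
    static byte[] write(String s, Charset cs) {
        byte[] data = s.getBytes(cs);
        boolean utf8 = cs.equals(StandardCharsets.UTF_8);

        ByteBuffer buf = ByteBuffer.allocate(1 + (utf8 ? 0 : 4) + 4 + data.length);
        buf.put(STRING_FLAG);
        if (!utf8)
            buf.putInt(codeOf(cs)); // Omitted for UTF-8, so the current layout is unchanged.
        buf.putInt(data.length);
        buf.put(data);
        return buf.array();
    }

    /** Reads both the current layout and the extended one. */
    static String read(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        if (buf.get() != STRING_FLAG)
            throw new IllegalArgumentException("Not a string");

        int first = buf.getInt();
        Charset cs = StandardCharsets.UTF_8;
        int len = first;
        if (first < 0) {
            // A negative int is a charset code; the real length follows it.
            cs = charsetOf(first);
            len = buf.getInt();
        }

        byte[] data = new byte[len];
        buf.get(data);
        return new String(data, cs);
    }

    /** Maps a non-UTF-8 charset to its assumed negative code. */
    static int codeOf(Charset cs) {
        if (cs.equals(StandardCharsets.ISO_8859_1)) return CODE_ISO_8859_1;
        if (cs.equals(StandardCharsets.UTF_16LE)) return CODE_UTF_16LE;
        throw new IllegalArgumentException("No code assigned to " + cs);
    }

    /** Maps an assumed negative code back to its charset. */
    static Charset charsetOf(int code) {
        switch (code) {
            case CODE_ISO_8859_1: return StandardCharsets.ISO_8859_1;
            case CODE_UTF_16LE: return StandardCharsets.UTF_16LE;
            default: throw new IllegalArgumentException("Unknown charset code " + code);
        }
    }
}
```

The reader tells the two forms apart by the sign of the first int, so UTF-8 strings keep their current layout. A reader that predates the extension would treat a negative code as a length, which seems to be the compatibility concern raised in the reply quoted above.
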
> > > > Next, I suggest adding a new BinaryConfiguration property for the
> > > > encoding to use, instead of a global property. It seems more
> > > > convenient for the user.
> > > >
> > > > I'll appreciate your feedback.
> > > >
> > > > --
> > > > Best regards,
> > > >   Andrey Kuznetsov.
> > > >
> > >
> >
>



-- 
Best regards,
  Andrey Kuznetsov.




--
View this message in context: 
http://apache-ignite-developers.2346864.n4.nabble.com/Non-UTF-8-string-encoding-support-in-BinaryMarshaller-IGNITE-5655-tp20024p20084.html
Sent from the Apache Ignite Developers mailing list archive at Nabble.com.
