Re: Custom string encoding

Valentin Kulichenko Fri, 30 Jun 2017 18:50:02 -0700

Andrey,

Can you elaborate more on this? What is your concern?


-Val

On Fri, Jun 30, 2017 at 6:17 PM Andrey Mashenkov <[email protected]>
wrote:

> Val,
>
> Looks like make sense.
>
> This will not affect FullText index, as Lucene has own format for storing
> data.
>
> But.. would it be compatible with H2 indexing ? I doubt.
>
> 1 июля 2017 г. 2:27 пользователь "Valentin Kulichenko" <
> [email protected]> написал:
>
> > Folks,
> >
> > Currently binary marshaller always encodes strings in UTF-8. However,
> > sometimes it can be useful to customize this. For example, if data
> contains
> > a lot of Cyrillic, Chinese or other symbols, but not so many Latin
> symbols,
> > memory is used very inefficiently. In this case it would be great to
> encode
> > most frequently used symbols in one byte instead of two or three.
> >
> > I propose to introduce BinaryStringEncoder interface that will convert
> > strings to byte arrays and back, and make it pluggable via
> > BinaryConfiguration. This will allow users to plug in any encoding
> > algorithms based on their requirements.
> >
> > Thoughts?
> >
> > https://issues.apache.org/jira/browse/IGNITE-5655
> >
> > -Val
> >
>

Re: Custom string encoding

Reply via email to