Custom string encoding

Valentin Kulichenko Fri, 30 Jun 2017 16:28:08 -0700

Folks,

Currently binary marshaller always encodes strings in UTF-8. However,
sometimes it can be useful to customize this. For example, if data contains
a lot of Cyrillic, Chinese or other symbols, but not so many Latin symbols,
memory is used very inefficiently. In this case it would be great to encode
most frequently used symbols in one byte instead of two or three.


I propose to introduce BinaryStringEncoder interface that will convert
strings to byte arrays and back, and make it pluggable via
BinaryConfiguration. This will allow users to plug in any encoding
algorithms based on their requirements.

Thoughts?

https://issues.apache.org/jira/browse/IGNITE-5655

-Val

Custom string encoding

Reply via email to