Lin,
TL;DR: Do not use Kryo with Ignite.

We tried using Kryo a while back (in pre-Binary Objects times) and it didn't work. To add insult to injury, it failed in the worst possible way: it appeared to work just fine, with no exceptions or anything like that. But then you'd discover that, for example, a join query returned fewer rows than expected. It turned out that a replicated cache used on the right side of the join was actually missing data on some of the nodes of the Ignite cluster where the query ran. After a long and painful investigation, we concluded that Kryo was the culprit.

The reason is that as soon as you configure your own custom marshaller, Ignite starts using it for marshalling everything, including most of its own internal classes. Keep in mind that your data is not stored in the cache or transferred between nodes directly. In all cases it is wrapped in Ignite internal classes that then get serialized and deserialized. Some of these internal classes define specialized readObject/writeObject or readResolve/writeReplace routines. By default, Kryo ignores such methods and simply tries to serialize the fields directly using its FieldSerializer, which of course doesn't always work. To make Kryo work with Ignite you'd have to register a specific Kryo Serializer and, in some cases, an instantiator strategy for each internal serializable/externalizable Ignite class!

We didn't think such an approach was feasible, so we switched to Binary Objects and are pretty happy with it. It is quite compact and sufficiently fast. The best thing about Binary Objects (at least for us) is the ability to access specific fields of the application data objects without going through full deserialization. Overall, I believe Ignite provides sufficient means for keeping the marshalling overhead as small as possible.
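To illustrate the failure mode, here is a minimal sketch that uses only the JDK. It is not Kryo or Ignite code: the Wrapper class is a hypothetical stand-in for an internal class with custom readObject/writeObject hooks, and fieldCopy is a hand-written imitation of a field-based serializer that skips transient fields and never calls those hooks. The point is only that the field-based path loses data without throwing anything.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class SerializationHooksDemo {
    /** Hypothetical stand-in for an internal class that relies on custom hooks. */
    static class Wrapper implements Serializable {
        private static final long serialVersionUID = 1L;
        private int id;
        // Transient: only the writeObject/readObject hooks below carry it across.
        private transient String payload;

        Wrapper() {}
        Wrapper(int id, String payload) { this.id = id; this.payload = payload; }
        int id() { return id; }
        String payload() { return payload; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();   // writes the non-transient field (id)
            out.writeUTF(payload);      // writes the transient field by hand
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            payload = in.readUTF();
        }
    }

    /** Standard Java serialization: the hooks are honored. */
    static Wrapper javaRoundTrip(Wrapper w) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(w);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return (Wrapper) in.readObject();
        }
    }

    /** Imitation of a field-based serializer: copies non-static, non-transient fields only. */
    static Wrapper fieldCopy(Wrapper w) throws Exception {
        Wrapper copy = new Wrapper();
        for (Field f : Wrapper.class.getDeclaredFields()) {
            int mods = f.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isTransient(mods)) {
                continue;
            }
            f.setAccessible(true);
            f.set(copy, f.get(w));
        }
        return copy;
    }

    public static void main(String[] args) throws Exception {
        Wrapper w = new Wrapper(7, "row-42");
        Wrapper viaJava = javaRoundTrip(w);
        Wrapper viaFields = fieldCopy(w);
        System.out.println("java serialization: id=" + viaJava.id() + " payload=" + viaJava.payload());
        System.out.println("field-based copy:   id=" + viaFields.id() + " payload=" + viaFields.payload());
    }
}
```

The field-based copy comes back with the id intact and the payload missing, and no exception is raised, which is roughly how our missing rows showed up.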
Regards,
Andrey

________________________________
From: Lin <[email protected]>
Sent: Monday, July 18, 2016 7:54 PM
To: valentin.kulichenko
Subject: Re: How about adding kryo or protostuff as an optional marshaller?

Hi Val,

I posted the code on GitHub at https://github.com/jackeylu/marshaller-cmp so you can run it and compare. I am glad that you can help me choose the right serializer. I am not sure whether my test cases are fair.

From my tests, I found that:

1. In most cases involving primitive types or jdk.* types, protostuff does not do better than the Ignite binary marshaller, but I think that doesn't matter in the real world.
2. In the case of user-defined objects, protostuff takes on average 40% less space than the Ignite binary marshaller.

Here the user-defined objects are MEDIA_CONTENT_1 and MEDIA_CONTENT_2, which come from https://github.com/eishay/jvm-serializers/blob/master/tpc/data/media.1.cks and https://github.com/eishay/jvm-serializers/blob/master/tpc/data/media.2.cks

------------------ Original ------------------
From: "valentin.kulichenko" <[email protected]>
Date: Tue, Jul 19, 2016 06:01 AM
To: "user" <[email protected]>
Subject: Re: How about adding kryo or protostuff as an optional marshaller?

Hi Lin,

Do you have a GitHub project that I can run to compare these two marshallers? From these snippets it's not very clear what is actually serialized.

Generally, Ignite does add a small overhead in the binary format, mainly to allow field lookups without deserialization, which is crucial for SQL queries, for example. However, even with this overhead, there is not much difference in the numbers. I believe that in most real use cases this difference will be negligible.

However, you can always try to introduce a custom serialization protocol. Simply implement the Marshaller interface and provide the implementation in IgniteConfiguration.
-Val
