Lin,

TL;DR

Do not use Kryo with Ignite.


We tried using Kryo a while back (pre-Binary Objects times) and it didn't work. 
To add insult to injury, it didn't work in the worst possible way: it would 
appear to work just fine, no exceptions or anything like that. But then you'd 
discover that, for example, a join query returns fewer rows than expected. It 
turns out that a replicated cache used on the right side of the join is 
actually missing data on some of the nodes of the Ignite cluster where the 
query runs. After some long and painful investigation, we concluded that Kryo 
was the culprit.


The reason is that as soon as you configure your own custom marshaller, Ignite 
starts using it for marshalling everything, including most of its own internal 
classes. Keep in mind that your data is not stored in the cache or transferred 
between the nodes directly. In all cases it's wrapped in Ignite internal 
classes that then get serialized/deserialized. Some of these internal classes 
in fact define specialized readObject/writeObject or readResolve/writeReplace 
routines. By default, Kryo ignores such methods and simply tries to serialize 
the fields directly using its FieldSerializer, which of course doesn't always 
work. In order to make Kryo work with Ignite you'd have to register a specific 
Kryo Serializer and, in some cases, an instantiator strategy for each internal 
serializable/externalizable Ignite class!
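
To make the failure mode concrete, here is a minimal JDK-only sketch. It 
involves neither Kryo nor Ignite: Wrapper is a hypothetical stand-in for an 
internal class that relies on writeObject/readObject, and fieldCopy is a 
deliberately naive reflective copier standing in for field-based 
serialization. The point is only that skipping the hooks loses data silently, 
with no exception.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

class HookDemo {
    /** Hypothetical stand-in for a class whose state is only persisted by custom hooks. */
    static class Wrapper implements Serializable {
        private static final long serialVersionUID = 1L;

        // transient: default field serialization skips it; only the hooks below persist it.
        private transient String payload;

        Wrapper() { }
        Wrapper(String payload) { this.payload = payload; }
        String payload() { return payload; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            out.writeUTF(payload);
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            payload = in.readUTF();
        }
    }

    /** Round trip through JDK serialization, which calls writeObject/readObject. */
    static Wrapper jdkRoundTrip(Wrapper w) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(w);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Wrapper) in.readObject();
        }
    }

    /** Naive copy of non-transient, non-static fields; the hooks are never called. */
    static Wrapper fieldCopy(Wrapper w) throws Exception {
        Wrapper copy = new Wrapper();
        for (Field f : Wrapper.class.getDeclaredFields()) {
            int mods = f.getModifiers();
            if (Modifier.isTransient(mods) || Modifier.isStatic(mods)) {
                continue;
            }
            f.setAccessible(true);
            f.set(copy, f.get(w));
        }
        return copy;
    }

    public static void main(String[] args) throws Exception {
        Wrapper original = new Wrapper("row-42");
        System.out.println(jdkRoundTrip(original).payload()); // payload survives
        System.out.println(fieldCopy(original).payload());    // payload is lost, no error
    }
}
```

The second copy comes back without its payload and nothing complains, which 
is the same kind of quiet data loss described above.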


We didn't think such an approach was feasible, so we switched to Binary 
Objects and are pretty happy with them. The format is quite compact and 
sufficiently fast. The best thing about Binary Objects (at least for us) is 
the ability to access specific fields of the application data objects without 
going through full deserialization. Overall, I believe Ignite provides 
sufficient means for making marshalling overhead as small as possible.
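
As a rough illustration of what field access without full deserialization 
means, here is a toy JDK-only sketch. This is not Ignite's binary format: the 
layout (a small header of field offsets followed by the field data) and all 
names are made up for the example. In Ignite itself, this kind of access is 
exposed through IgniteCache.withKeepBinary() and BinaryObject.field(...).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class OffsetLookupDemo {
    // Toy layout (an assumption for this sketch, not Ignite's real format):
    // [int offsetOfAge][int offsetOfName][int age][int nameLength][name bytes]
    private static final int HEADER_SIZE = 2 * Integer.BYTES;

    /** Encodes a record with an offset header in front of the field data. */
    static byte[] encode(int age, String name) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(
            HEADER_SIZE + Integer.BYTES + Integer.BYTES + nameBytes.length);
        buf.putInt(HEADER_SIZE);                  // where the age field starts
        buf.putInt(HEADER_SIZE + Integer.BYTES);  // where the name field starts
        buf.putInt(age);
        buf.putInt(nameBytes.length);
        buf.put(nameBytes);
        return buf.array();
    }

    /** Reads only the age field by jumping to its offset. */
    static int readAge(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int offset = buf.getInt(0);
        return buf.getInt(offset);
    }

    /** Reads only the name field; the age bytes are never touched. */
    static String readName(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int offset = buf.getInt(Integer.BYTES);
        int length = buf.getInt(offset);
        return new String(data, offset + Integer.BYTES, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] record = encode(37, "Lin");
        System.out.println(readAge(record));
        System.out.println(readName(record));
    }
}
```

The offset header costs a few extra bytes per object, and in exchange a 
reader can pull out one field directly instead of rebuilding the whole object.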


Regards

Andrey


________________________________
From: Lin <[email protected]>
Sent: Monday, July 18, 2016 7:54 PM
To: valentin.kulichenko
Subject: Re: How about adding kryo or protostuff as an optional marshaller?

Hi Val,

I posted the code on GitHub at https://github.com/jackeylu/marshaller-cmp; you 
can run it and compare.

I am glad that you can help me choose the right serializer. I am not sure 
whether my test cases are fair or not.

From my tests, I found that:
1. for most primitive types and jdk.* types, protostuff does not do better 
than the Ignite binary marshaller, but I think that doesn't matter in the real 
world.
2. for user-defined objects, protostuff saves on average 40% of the space 
compared with the Ignite binary marshaller. The user-defined objects here are 
MEDIA_CONTENT_1 and MEDIA_CONTENT_2, which are from 
https://github.com/eishay/jvm-serializers/blob/master/tpc/data/media.1.cks and 
https://github.com/eishay/jvm-serializers/blob/master/tpc/data/media.2.cks



------------------ Original ------------------
From:  "valentin.kulichenko";<[email protected]>;
Date:  Tue, Jul 19, 2016 06:01 AM
To:  "user"<[email protected]>;
Subject:  Re: How about adding kryo or protostuff as an optional marshaller?

Hi Lin,

Do you have a GitHub project that I can run and compare these two
marshallers? From these snippets it's not very clear what is actually
serialized.

Generally, Ignite does provide minimal overhead in the binary format, mainly
to allow field lookups without deserialization, which is crucial for SQL
queries, for example. However, even with this overhead, there is not much
difference in numbers. I believe that in most real use cases this difference
will be negligible.

However, you can always try to introduce a custom serialization protocol.
Simply implement the Marshaller interface and provide the implementation in
IgniteConfiguration.

-Val



