Re: Serialization in Apex

Vlad Rozov Tue, 17 May 2016 08:17:06 -0700

+1 for type based serialization. Tuples in most cases are flatrecords/pojo and it should be possible programmatically construct acodec that will significantly outperform Kryo. It should also reduceamount of data passed over the wire. I started to look in that directionas well as Kryo serialization is one of bottlenecks that limits Apexthroughput when operators are deployed into different containersincluding NODE_LOCAL case.


Thank you,
Vlad


On 5/17/16 07:13, Sandesh Hegde wrote:

If it is possible to serialize, platform should do it automatically, it
reduces the tribal knowledge requirement to use the platform. Couples of
month back, I also sent out the similar email.

Type based serialization may improve the performance.

On Tue, May 17, 2016, 6:06 AM Munagala Ramanath <[email protected]> wrote:

Traditionally, we've recommended using
"@DefaultSerializer(JavaSerializer.class)" or
"@FieldSerializer.Bind(CustomSerializer.class)" as outlined at

http://docs.datatorrent.com/troubleshooting/#application-throwing-following-kryo-exception

Can you describe why those approaches are not adequate ?

Ram

On Mon, May 16, 2016 at 11:46 PM, Bhupesh Chawda <[email protected]>
wrote:

Hi All,

While working on the integration of Apex with Apache Samoa, I am coming
across some scenarios where I have to add default constructors in some
external classes to make them Kryo serializable. Although this should be
okay, we would like to avoid modifying external classes as far as

possible.

Some other streaming engines have taken different approaches towards
serialization.

I looked at Flink and Storm serialization mechanisms.

Storm has a fall back mechanism on Java serialization. It does use Kryo

for

serialization due to performance. But, if the class is not serializable
using Kryo, then it will try to serialize it using Java serialization. If
even then it cannot serialize, then it throws an error. [1]

Flink has its own serialization stack where it uses a serializer based on
the type information known about the data. [2]

What does the community think about the current state of serialization in
Apex. Is there a need to explore some approaches which could avoid
serialization issues such as the one described above? Are there any other
approaches one could use?

1.

http://storm.apache.org/releases/current/Serialization.html#java-serialization

2.

https://cwiki.apache.org/confluence/display/FLINK/Type+System,+Type+Extraction,+Serialization


~Bhupesh

Re: Serialization in Apex

Reply via email to