[ https://issues.apache.org/jira/browse/BEAM-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15622842#comment-15622842 ]
Amit Sela commented on BEAM-626: -------------------------------- Those objects rely on Java serialization which is outperformed by Kryo serialization. This may not be an issue for other frameworks as they serialize tasks (The DoFn and it's content) to the worker because it happens every time a new worker is deployed/utilized but Spark streaming has to do this every batch interval (sometimes as frequent as 500 msec) so it is better to use Kryo. So far {{AvroCoder}} is the only issue, and the entire Spark community (which is one of the largest Apache communities) deals with Kryo just fine. As discussed before, if this becomes a pain I could expose an API to register serializers (falling back to {{JavaSerialization}} is one of them). What I really don't understand is the stand on this ticket - "AvroCoder not deserializing correctly in Kryo" - as in making the coder work with Kryo.. If the proposed solution was to degrade the quality/performance of the current state of the implementation I would be the first to suggest we simply register with Java but this is not the case. First proposal wasn't threadsafe, which is not different from the current state, and adding synchronization might heart performance (regardless of Kryo). >From my point of view, enabling the coder to work with Kryo while NOT making >it worse in any way, is a good thing. > AvroCoder not deserializing correctly in Kryo > --------------------------------------------- > > Key: BEAM-626 > URL: https://issues.apache.org/jira/browse/BEAM-626 > Project: Beam > Issue Type: Bug > Components: sdk-java-core > Reporter: Aviem Zur > Assignee: Aviem Zur > Priority: Minor > > Unlike with Java serialization, when deserializing AvroCoder using Kryo, the > resulting AvroCoder is missing all of its transient fields. > The reason it works with Java serialization is because of the usage of > writeReplace and readResolve, which Kryo does not adhere to. > In ProtoCoder for example there are also unserializable members, the way it > is solved there is lazy initializing these members via their getters, so they > are initialized in the deserialized object on first call to the member. > It seems AvroCoder is the only class in Beam to use writeReplace convention. -- This message was sent by Atlassian JIRA (v6.3.4#6332)