[
https://issues.apache.org/jira/browse/SPARK-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564219#comment-14564219
]
Josh Rosen commented on SPARK-7708:
-----------------------------------
I invested the time to dig into this because I was worried that this issue
might impact us in 1.4 due to our increased serializer reuse. On closer
analysis, though, I think we're safe. In 1.3.x, it appears that there are some
cases where the old code _would_ re-use the same SerializerInstance and make
multiple `serialize()` calls using the same `Output`. If the bug didn’t
manifest in those older versions and we didn’t introduce any new cases of this
pattern in 1.4.0, then I don’t think we need to take any additional action for
1.4.
It would be good to have someone else confirm this, but my quick read
through the code in IntelliJ suggests that things are okay.
Regarding upgrading Kryo, there may be some considerations due to our use of
Chill. I'm not sure whether Chill supports Kryo 3.x. We also need to be
careful not to introduce bugs / regressions by upgrading to 2.23.0. Definitely
give 2.23.0 a try, though, and let me know if it fixes the problem. If it
does, you can modify your PR to bump to that version and try to copy the code
from my gist into a KryoSerializerSuite regression test.
> Incorrect task serialization with Kryo closure serializer
> ---------------------------------------------------------
>
> Key: SPARK-7708
> URL: https://issues.apache.org/jira/browse/SPARK-7708
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.2.2
> Reporter: Akshat Aranya
>
> I've been investigating the use of Kryo for closure serialization with Spark
> 1.2, and it seems like I've hit upon a bug:
> When a task is serialized before scheduling, the following log message is
> generated:
> [info] o.a.s.s.TaskSetManager - Starting task 124.1 in stage 0.0 (TID 342,
> <host>, PROCESS_LOCAL, 302 bytes)
> This message comes from TaskSetManager which serializes the task using the
> closure serializer. Before the message is sent out, the TaskDescription
> (which includes the original task as a byte array) is serialized again into
> a byte array with the closure serializer. I added a log message for this in
> CoarseGrainedSchedulerBackend, which produces the following output:
> [info] o.a.s.s.c.CoarseGrainedSchedulerBackend - 124.1 size=132
> The serialized size of TaskDescription (132 bytes) turns out to be _smaller_
> than the serialized task that it contains (302 bytes). This implies that
> TaskDescription.buffer is not getting serialized correctly.
> On the executor side, the deserialization produces a null value for
> TaskDescription.buffer.
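
The size argument in the report above can be turned into a small sanity check. The sketch below is illustrative only: it uses Python's pickle rather than Kryo, and the dict is a hypothetical stand-in for TaskDescription, not Spark's actual class. It assumes the serializer does not compress, so a wrapper that embeds a byte array cannot come out smaller than that array.

```python
import pickle


def wrapper_is_plausible(payload: bytes, wrapper: bytes) -> bool:
    """An uncompressed wrapper cannot be smaller than the byte array it embeds."""
    return len(wrapper) >= len(payload)


# Hypothetical stand-in for an already-serialized task (302 bytes, as in the log line).
serialized_task = bytes(302)

# Hypothetical stand-in for TaskDescription: a few small fields plus the task bytes.
description = {"task_id": 342, "name": "task 124.1", "buffer": serialized_task}
serialized_description = pickle.dumps(description)

# A correct serializer keeps the payload, so the wrapper is larger and round-trips.
healthy = wrapper_is_plausible(serialized_task, serialized_description)
round_trips = pickle.loads(serialized_description)["buffer"] == serialized_task

# The sizes from the report (132 bytes wrapping 302 bytes) fail the same check.
reported_ok = 132 >= len(serialized_task)
```

A check along these lines, logging both sizes on the driver, is roughly what the added log message in CoarseGrainedSchedulerBackend made visible.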