[
https://issues.apache.org/jira/browse/BEAM-6021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marek Simunek updated BEAM-6021:
--------------------------------
Description: When using {{KryoSerializer}}, we could improve serialization
performance by registering the internal classes used in SparkRunner. (was: By
default, spark.serializer=org.apache.spark.serializer.JavaSerializer is set.
Because all user objects are serialized with Beam coders, this setting affects
only the internal objects used for Spark translation.
So why not use the more efficient {{org.apache.spark.serializer.KryoSerializer}}
and require Spark runner contributors to register classes in
{{BeamSparkRunnerRegistrator}} by setting
{{spark.kryo.registrationRequired=true}}?
More information about the benefits of [Kryo
serialization|https://spark.apache.org/docs/latest/tuning.html#data-serialization]
over the Java serializer is available in the Spark tuning guide.
)
Summary: Registered more internal classes for Kryo serialization (was:
Use Kryo spark.serializer instead of JavaSerializer)
> Registered more internal classes for Kryo serialization
> -------------------------------------------------------
>
> Key: BEAM-6021
> URL: https://issues.apache.org/jira/browse/BEAM-6021
> Project: Beam
> Issue Type: Improvement
> Components: runner-spark
> Affects Versions: 2.8.0
> Reporter: Marek Simunek
> Assignee: Marek Simunek
> Priority: Major
> Fix For: 2.10.0
>
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> When using {{KryoSerializer}}, we could improve serialization performance by
> registering the internal classes used in SparkRunner.
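
For readers unfamiliar with the settings named in this issue, the fragment below is a minimal sketch of how they might look in a spark-defaults.conf file. It assumes the properties are supplied by hand and that the registrator lives in the package org.apache.beam.runners.spark.coders. Neither assumption comes from this issue, and the Spark runner may configure some or all of these itself.

```
# Assumption: set manually; the runner may already apply its own serializer settings.
spark.serializer                 org.apache.spark.serializer.KryoSerializer

# Fail fast when an unregistered class is serialized, so that missing
# registrations surface during development instead of silently costing space.
spark.kryo.registrationRequired  true

# Assumption: fully qualified name of the registrator mentioned in the description.
spark.kryo.registrator           org.apache.beam.runners.spark.coders.BeamSparkRunnerRegistrator
```

With registration required, Kryo writes a compact numeric ID for each registered class instead of its full class name, which is where most of the expected gain for small internal objects would come from.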