[ 
https://issues.apache.org/jira/browse/BEAM-6021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marek Simunek updated BEAM-6021:
--------------------------------
    Description: When using {{KryoSerializer}}, we could improve serialization 
performance by registering internal classes used in SparkRunner.  (was: By 
default, {{spark.serializer}} is set to {{org.apache.spark.serializer.JavaSerializer}}.

Because all user objects are serialized with Beam coders, this setting affects 
only the internal objects used for Spark translation.

So why not use the more efficient {{org.apache.spark.serializer.KryoSerializer}} and 
require Spark runner contributors to register classes in 
{{BeamSparkRunnerRegistrator}} by setting 
{{spark.kryo.registrationRequired=true}}?

For more information, see the benefits of [Kryo 
serialization|https://spark.apache.org/docs/latest/tuning.html#data-serialization]
 over the Java serializer.
)
        Summary: Register more internal classes for Kryo serialization  (was: 
Use Kryo spark.serializer instead of JavaSerializer)

> Register more internal classes for Kryo serialization
> -----------------------------------------------------
>
>                 Key: BEAM-6021
>                 URL: https://issues.apache.org/jira/browse/BEAM-6021
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-spark
>    Affects Versions: 2.8.0
>            Reporter: Marek Simunek
>            Assignee: Marek Simunek
>            Priority: Major
>             Fix For: 2.10.0
>
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> When using {{KryoSerializer}}, we could improve serialization performance by 
> registering internal classes used in SparkRunner.
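
The Spark properties discussed in this issue can be sketched as below. This is 
only an illustration, not the change made for BEAM-6021. The class 
KryoConfSketch and its helper methods are hypothetical, and the package shown 
for BeamSparkRunnerRegistrator is an assumption that should be checked against 
the Beam version in use. The property keys themselves (spark.serializer, 
spark.kryo.registrator, spark.kryo.registrationRequired) are standard Spark 
configuration keys.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class KryoConfSketch {

    // Assumption: fully qualified name of the registrator; verify against your Beam version.
    static final String ASSUMED_REGISTRATOR =
        "org.apache.beam.runners.spark.coders.BeamSparkRunnerRegistrator";

    /** Builds the Spark properties that select Kryo and require class registration. */
    static Map<String, String> kryoSettings(String registratorClass) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
        conf.put("spark.kryo.registrator", registratorClass);
        // With this flag, Kryo throws on any class that was not registered.
        conf.put("spark.kryo.registrationRequired", "true");
        return conf;
    }

    /** Renders the settings as spark-submit "--conf key=value" arguments. */
    static String toSubmitArgs(Map<String, String> conf) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append("--conf ").append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSubmitArgs(kryoSettings(ASSUMED_REGISTRATOR)));
    }
}
```

With registrationRequired set to true, a missing registration surfaces as an 
exception at serialization time instead of silently falling back to writing 
full class names, which is why the old description proposed it as a way to 
keep the registrator complete.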



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
