[ 
https://issues.apache.org/jira/browse/SPARK-9043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Massie updated SPARK-9043:
-------------------------------
    Description: 
ShuffleManager implementations are currently not given type information 
regarding the key, value and combiner classes. Serialization of shuffle objects 
relies on them being JavaSerializable, with methods defined for reading/writing 
the object or, alternatively, serialization via Kryo which uses reflection.

Serialization systems like Avro, Thrift and Protobuf generate classes with zero 
argument constructors and explicit schema information (e.g. IndexedRecords in 
Avro have get, put and getSchema methods).

By serializing the key, value and combiner class names in ShuffleDependency, 
shuffle implementations will have access to schema information when 
registerShuffle() is called.

  was:
ShuffleManager implementations are currently not given type information 
regarding the key, value and combiner classes. Serialization of shuffle objects 
relies on them being JavaSerializable, with methods defined for reading/writing 
the object or, alternatively, serialization via Kryo which uses reflection.

Serialization systems like Avro, Thrift and Protobuf generate classes with zero 
argument constructors and explicit schema information (e.g. IndexedRecords in 
Avro have get, put and getSchema methods).

By serializing the key, value and combiner class names in {ShuffleDependency}, 
shuffle implementations will have access to schema information when 
{registerShuffle} is called.


> Serialize key, value and combiner classes in ShuffleDependency
> --------------------------------------------------------------
>
>                 Key: SPARK-9043
>                 URL: https://issues.apache.org/jira/browse/SPARK-9043
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>            Reporter: Matt Massie
>
> ShuffleManager implementations are currently not given type information 
> regarding the key, value and combiner classes. Serialization of shuffle 
> objects relies on them being JavaSerializable, with methods defined for 
> reading/writing the object or, alternatively, serialization via Kryo which 
> uses reflection.
> Serialization systems like Avro, Thrift and Protobuf generate classes with 
> zero argument constructors and explicit schema information (e.g. 
> IndexedRecords in Avro have get, put and getSchema methods).
> By serializing the key, value and combiner class names in ShuffleDependency, 
> shuffle implementations will have access to schema information when 
> registerShuffle() is called.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to