Can you please elaborate a bit more?




On Wed, Aug 24, 2016 12:41 AM, Sean Owen so...@cloudera.com wrote:
Byte code, no. It's sufficient to store the information that the RDD represents,
which can include serialized function closures, but that's not quite storing
byte code.
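
To make the distinction above concrete, here is a minimal toy sketch of lineage-based recomputation. It is not Spark's implementation: `ToyRDD`, `transform`, and `lineage_depth` are hypothetical names invented for illustration. The only point is that each node keeps a reference to its parent and to a function object (a closure), and the data is rebuilt by replaying those functions.

```python
from typing import Callable, Iterable, Optional


class ToyRDD:
    """Hypothetical stand-in for an RDD: it records how to compute data, not the data."""

    def __init__(self,
                 source: Optional[Iterable] = None,
                 parent: Optional["ToyRDD"] = None,
                 transform: Optional[Callable[[Iterable], Iterable]] = None):
        self.source = source        # only set on the root of the lineage
        self.parent = parent        # the node this one was derived from
        self.transform = transform  # closure applied to the parent's rows

    def map(self, f):
        # The closure captures f; nothing is executed yet.
        return ToyRDD(parent=self, transform=lambda rows: (f(x) for x in rows))

    def filter(self, pred):
        return ToyRDD(parent=self, transform=lambda rows: (x for x in rows if pred(x)))

    def lineage_depth(self):
        # Number of transformations between this node and the root.
        return 0 if self.parent is None else 1 + self.parent.lineage_depth()

    def compute(self):
        # Walk the lineage back to the root, then replay each transformation.
        if self.parent is None:
            return list(self.source)
        return list(self.transform(self.parent.compute()))


rdd = ToyRDD(source=range(5)).map(lambda x: x * 2).filter(lambda x: x > 2)
first = rdd.compute()
# Pretend the computed rows were lost with a node; the lineage can rebuild them.
recomputed = rdd.compute()
```

In this sketch the lineage is just a chain of objects that reference functions. To move such a chain to another process, the functions would have to be serialized along with whatever they capture, which is the part Sean describes as serialized function closures.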
On Wed, Aug 24, 2016 at 2:00 AM, kant kodali < kanth...@gmail.com > wrote:
Hi Guys,
I have had this question for a very long time, and after diving into the source
code (specifically the links below) I have a feeling that the lineage of an
RDD (the transformations) is converted into byte code and stored in memory or
on disk. Or, to ask a related question: do we ever store JVM byte code or
Python byte code in memory or on disk? This makes sense to me because if we
were to reconstruct an RDD after a node failure, we would need to go through
the lineage and execute the respective transformations, so storing their byte
code does make sense. However, many people seem to disagree with me, so it
would be great if someone could clarify.
https://github.com/apache/spark/blob/6ee40d2cc5f467c78be662c1639fc3d5b7f796cf/python/pyspark/rdd.py#L1452
https://github.com/apache/spark/blob/6ee40d2cc5f467c78be662c1639fc3d5b7f796cf/python/pyspark/rdd.py#L1471
https://github.com/apache/spark/blob/6ee40d2cc5f467c78be662c1639fc3d5b7f796cf/python/pyspark/rdd.py#L229
https://github.com/apache/spark/blob/master/python/pyspark/cloudpickle.py#L241
