Cool, thanks for the link.
Bertrand Dechoux
On Mon, Apr 21, 2014 at 7:31 PM, Nick Pentreath nick.pentre...@gmail.com wrote:
Also see: https://github.com/apache/spark/pull/455
This will add support for reading SequenceFiles and other InputFormats in
PySpark, as long as the Writables are either
Hi,
I have browsed the online documentation, which states that PySpark can only
read text files as sources. Is that still the case?
From what I understand, after this first step the RDD can hold any serialized
Python structure, provided the class definitions are properly distributed.
Is it not possible to read
Hi Bertrand,
We should probably add a SparkContext.pickleFile and RDD.saveAsPickleFile that
will allow saving pickled objects. Unfortunately this is not in yet, but there
is an issue up to track it: https://issues.apache.org/jira/browse/SPARK-1161.
In 1.0, one feature we do have now is the
When this is implemented, can you load/save an RDD of pickled objects to
HDFS?
On Thu, Apr 17, 2014 at 1:51 AM, Matei Zaharia matei.zaha...@gmail.com wrote:
Hi Bertrand,
We should probably add a SparkContext.pickleFile and RDD.saveAsPickleFile
that will allow saving pickled objects.
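
For illustration only, here is a minimal local sketch of the save/load round trip discussed above, using just Python's standard pickle module. The helper names save_as_pickle_file and load_pickle_file, the batch_size parameter, and the part-00000 file name are hypothetical and are not Spark API. The sketch also assumes a local file rather than HDFS, and it says nothing about how the proposed SparkContext.pickleFile or RDD.saveAsPickleFile would actually be implemented.

```python
import os
import pickle
import tempfile


def save_as_pickle_file(records, path, batch_size=2):
    """Write records to `path` as consecutive pickled batches (lists).

    Hypothetical helper, not part of Spark.
    """
    records = list(records)
    with open(path, "wb") as f:
        for start in range(0, len(records), batch_size):
            pickle.dump(records[start:start + batch_size], f)


def load_pickle_file(path):
    """Yield the records back in their original order, batch by batch.

    Hypothetical helper, not part of Spark.
    """
    with open(path, "rb") as f:
        while True:
            try:
                batch = pickle.load(f)
            except EOFError:
                # No more pickled batches in the file.
                return
            for record in batch:
                yield record


if __name__ == "__main__":
    data = [{"id": 1, "tags": ["a", "b"]}, ("tuple", 2), 3.5]
    path = os.path.join(tempfile.mkdtemp(), "part-00000")
    save_as_pickle_file(data, path)
    print(list(load_pickle_file(path)))
```

Note that unpickling instances of custom classes requires the class definitions to be importable on the reading side, which is the same constraint raised earlier in the thread.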