Loading from sequenceFile in PySpark is already in master, and saving is underway in this PR: https://github.com/apache/spark/pull/1338
I hope Kan will have it ready to merge in time for the 1.1 release window (it should be; the PR just needs a final review or two).

In the meantime, you can check out master and test the sequenceFile load support in PySpark. There are examples in the /examples project and in the Python tests, and some documentation in /docs.

On Wed, Jul 23, 2014 at 4:42 PM, Gary Malouf <malouf.g...@gmail.com> wrote:

> I am aware that today PySpark can not load sequence files directly. Are
> there work-arounds people are using (short of duplicating all the data to
> text files) for accessing this data?
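
For reference, here is a minimal sketch of what loading might look like against a build from master. The helper name, the default Text/IntWritable key and value classes, and any path you pass in are assumptions for illustration only; substitute the Writable types your files actually use.

```python
# Sketch only: assumes a Spark build from master with PySpark sequenceFile
# support. load_sequence_file is a hypothetical helper, not part of Spark.

def load_sequence_file(sc, path,
                       key_class="org.apache.hadoop.io.Text",
                       value_class="org.apache.hadoop.io.IntWritable"):
    """Return an RDD of (key, value) pairs read from a Hadoop SequenceFile.

    sc          -- an existing SparkContext
    path        -- location of the SequenceFile (local, HDFS, ...)
    key_class   -- fully qualified Writable class of the keys (assumed default)
    value_class -- fully qualified Writable class of the values (assumed default)
    """
    # SparkContext.sequenceFile converts the Writables to Python objects.
    return sc.sequenceFile(path, keyClass=key_class, valueClass=value_class)
```

From the pyspark shell, where `sc` already exists, you could then try something like `load_sequence_file(sc, "/some/path").take(5)` to inspect the first few records.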