There's an open pull request to add support for additional Hadoop file formats to PySpark: https://github.com/apache/incubator-spark/pull/263
On Thu, Jan 9, 2014 at 8:15 AM, Diana Carroll <[email protected]> wrote:
> Hello! I'm exploring using custom input formats, which it seems I can do
> in Scala using sc.hadoopNewAPIFile or sc.hadoopNewAPIRDD.
>
> My question is: is it possible to do this in Python? The Python API
> doesn't have (AFAICT) the sc.hadoop* functions.
>
> Thanks,
> Diana
