You can use the saveAsNewAPIHadoopFile <http://spark.apache.org/docs/1.1.0/api/python/pyspark.rdd.RDD-class.html#saveAsNewAPIHadoopFile> method, which accepts a Hadoop configuration and so lets you compress your output. Here's some sample code <https://github.com/ScrapCodes/spark-1/blob/master/python/pyspark/tests.py#L1225> that uses the API.
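Roughly, something like the sketch below should work. Treat it as a starting point rather than tested code: the helper name save_text_gzipped, the GZIP_CONF dict and the bucket path are made up for illustration, the property names are the Hadoop 2.x ones, and I'm assuming PySpark turns a None key into a NullWritable so that TextOutputFormat writes only the line itself.

```python
# Hadoop 2.x property names (assumption about your cluster); older Hadoop
# releases use "mapred.output.compress" and
# "mapred.output.compression.codec" instead.
GZIP_CONF = {
    "mapreduce.output.fileoutputformat.compress": "true",
    "mapreduce.output.fileoutputformat.compress.codec":
        "org.apache.hadoop.io.compress.GzipCodec",
}


def save_text_gzipped(rdd, path, conf=GZIP_CONF):
    """Write an RDD of strings as gzip-compressed text (hypothetical helper)."""
    # saveAsNewAPIHadoopFile expects (key, value) pairs. The None key is
    # assumed to be converted to NullWritable, in which case
    # TextOutputFormat emits only the value on each line.
    pairs = rdd.map(lambda line: (None, line))
    pairs.saveAsNewAPIHadoopFile(
        path,
        "org.apache.hadoop.mapreduce.lib.output.TextOutputFormat",
        keyClass="org.apache.hadoop.io.NullWritable",
        valueClass="org.apache.hadoop.io.Text",
        conf=conf,
    )


# Usage, given a running SparkContext `sc` (bucket name is a placeholder):
# save_text_gzipped(sc.parallelize(["a", "b"]), "s3n://my-bucket/output")
```

If the output lines come out with an unexpected key or tab in front, check how your Spark version converts the key before relying on this.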
Thanks
Best Regards

On Thu, Jan 15, 2015 at 5:16 PM, Tom Seddon <mr.tom.sed...@gmail.com> wrote:

> Hi,
>
> I've searched but can't seem to find a PySpark example. How do I write
> compressed text file output to S3 using PySpark saveAsTextFile?
>
> Thanks,
>
> Tom