This blog post (not mine) has some nice examples -

https://hadoopist.wordpress.com/2016/08/19/how-to-create-compressed-output-files-in-spark-2-0/

From the blog -
df.write.mode("overwrite").format("parquet").option("compression",
"none").mode("overwrite").save("/tmp/file_no_compression_parq")
    df.write.mode("overwrite").format("parquet").option("compression",
"gzip").mode("overwrite").save("/tmp/file_with_gzip_parq")
    df.write.mode("overwrite").format("parquet").option("compression",
"snappy").mode("overwrite").save("/tmp/file_with_snappy_parq")
    //lzo - requires a different method in terms of implementation.

    df.write.mode("overwrite").format("orc").option("compression",
"none").mode("overwrite").save("/tmp/file_no_compression_orc")
    df.write.mode("overwrite").format("orc").option("compression",
"snappy").mode("overwrite").save("/tmp/file_with_snappy_orc")
df.write.mode("overwrite").format("orc").option("compression",
"zlib").mode("overwrite").save("/tmp/file_with_zlib_orc")


