Hi,
I have a scenario where I'd like to store an RDD in Parquet format across
many files, one per day, such as 2015/01/01, 2015/01/02, etc.
So far I have used this method
http://stackoverflow.com/questions/23995040/write-to-multiple-outputs-by-key-spark-one-spark-job
to store text files
Spark 1.4 supports dynamic partitioning: you can first convert your RDD
to a DataFrame and then save its contents partitioned by a date column.
Say you have a DataFrame df containing three columns a, b, and c; you
might write something like this:
df.write.partitionBy("a", "b").parquet("/path/to/output")
(the output path here is just a placeholder). This produces a Hive-style
directory layout, one subdirectory per distinct value, e.g.
/path/to/output/a=.../b=.../part-*.parquet.
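To make the resulting layout concrete, here is a minimal sketch using only the Python standard library (this is not Spark code; the write_partitioned helper, the file names, and the records are illustrative). It groups records by a key column and writes one key=value subdirectory per distinct value, which is the same Hive-style layout partitionBy produces, except that Spark writes Parquet part files rather than plain text:

```python
import os
import tempfile

def write_partitioned(records, key, out_dir):
    """Group dict records by `key` and write one file per value
    under out_dir/key=value/, mimicking Hive-style partitioning."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[key], []).append(rec)
    for value, recs in groups.items():
        part_dir = os.path.join(out_dir, f"{key}={value}")
        os.makedirs(part_dir, exist_ok=True)
        with open(os.path.join(part_dir, "part-00000.txt"), "w") as f:
            for rec in recs:
                # The partition column is encoded in the directory name,
                # so it is dropped from the file contents (as Spark does).
                row = ",".join(str(v) for k, v in sorted(rec.items()) if k != key)
                f.write(row + "\n")

records = [
    {"date": "2015-01-01", "a": 1},
    {"date": "2015-01-01", "a": 2},
    {"date": "2015-01-02", "a": 3},
]
out = tempfile.mkdtemp()
write_partitioned(records, "date", out)
print(sorted(os.listdir(out)))  # -> ['date=2015-01-01', 'date=2015-01-02']
```

A reader of the partitioned output (Spark included) can then prune entire date directories when a query filters on the partition column, which is the main payoff of this layout.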