This will be fixed in Spark 1.3.1: https://issues.apache.org/jira/browse/SPARK-6351
and is fixed in master/branch-1.3 if you want to compile from source.

On Wed, Mar 25, 2015 at 11:59 AM, Stuart Layton <stuart.lay...@gmail.com> wrote:

> I'm trying to save a dataframe to s3 as a parquet file but I'm getting
> Wrong FS errors
>
> >>> df.saveAsParquetFile(parquetFile)
> 15/03/25 18:56:10 INFO storage.MemoryStore: ensureFreeSpace(46645) called
> with curMem=82744, maxMem=278302556
> 15/03/25 18:56:10 INFO storage.MemoryStore: Block broadcast_5 stored as
> values in memory (estimated size 45.6 KB, free 265.3 MB)
> 15/03/25 18:56:10 INFO storage.MemoryStore: ensureFreeSpace(7078) called
> with curMem=129389, maxMem=278302556
> 15/03/25 18:56:10 INFO storage.MemoryStore: Block broadcast_5_piece0
> stored as bytes in memory (estimated size 6.9 KB, free 265.3 MB)
> 15/03/25 18:56:10 INFO storage.BlockManagerInfo: Added broadcast_5_piece0
> in memory on ip-172-31-1-219.ec2.internal:58280 (size: 6.9 KB, free: 265.4
> MB)
> 15/03/25 18:56:10 INFO storage.BlockManagerMaster: Updated info of block
> broadcast_5_piece0
> 15/03/25 18:56:10 INFO spark.SparkContext: Created broadcast 5 from
> textFile at JSONRelation.scala:98
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/root/spark/python/pyspark/sql/dataframe.py", line 121, in
> saveAsParquetFile
>     self._jdf.saveAsParquetFile(path)
>   File "/root/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py",
> line 538, in __call__
>   File "/root/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py",
> line 300, in get_return_value
> py4j.protocol.Py4JJavaError: An error occurred while calling
> o22.saveAsParquetFile.
> : java.lang.IllegalArgumentException: Wrong FS:
> s3n://com.my.bucket/spark-testing/, expected: hdfs://
> ec2-52-0-159-113.compute-1.amazonaws.com:9000
>
> Is it possible to save a dataframe to s3 directly using parquet?
>
> --
> Stuart Layton