Until then, you can try sql("SET spark.sql.parquet.useDataSourceApi=false")
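
From the pyspark shell, that workaround might look roughly like the sketch below. It assumes Spark 1.3.0, with `sqlContext` being the shell's SQLContext and `df` the dataframe from the original question. The helper name `save_parquet_without_datasource_api` is made up for illustration and is not part of Spark.

```python
# Hypothetical helper (not part of Spark): applies the setting suggested
# above, then saves the dataframe as Parquet.
PARQUET_FLAG = "spark.sql.parquet.useDataSourceApi"


def save_parquet_without_datasource_api(sql_context, df, path):
    # Turn off the Parquet data source API for this SQLContext.
    sql_context.sql("SET %s=false" % PARQUET_FLAG)
    # Same call as in the original question (Spark 1.3 DataFrame API).
    df.saveAsParquetFile(path)


# In the pyspark shell this would be called as:
# save_parquet_without_datasource_api(sqlContext, df, "s3n://com.my.bucket/spark-testing/")
```

The setting has to be issued before the save call, and it only affects the SQLContext it is run on.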

On Wed, Mar 25, 2015 at 12:15 PM, Michael Armbrust <mich...@databricks.com> wrote:

> This will be fixed in Spark 1.3.1:
> https://issues.apache.org/jira/browse/SPARK-6351
>
> and is fixed in master/branch-1.3 if you want to compile from source
>
> On Wed, Mar 25, 2015 at 11:59 AM, Stuart Layton <stuart.lay...@gmail.com>
> wrote:
>
>> I'm trying to save a dataframe to s3 as a parquet file but I'm getting
>> Wrong FS errors
>>
>> >>> df.saveAsParquetFile(parquetFile)
>> 15/03/25 18:56:10 INFO storage.MemoryStore: ensureFreeSpace(46645) called
>> with curMem=82744, maxMem=278302556
>> 15/03/25 18:56:10 INFO storage.MemoryStore: Block broadcast_5 stored as
>> values in memory (estimated size 45.6 KB, free 265.3 MB)
>> 15/03/25 18:56:10 INFO storage.MemoryStore: ensureFreeSpace(7078) called
>> with curMem=129389, maxMem=278302556
>> 15/03/25 18:56:10 INFO storage.MemoryStore: Block broadcast_5_piece0
>> stored as bytes in memory (estimated size 6.9 KB, free 265.3 MB)
>> 15/03/25 18:56:10 INFO storage.BlockManagerInfo: Added broadcast_5_piece0
>> in memory on ip-172-31-1-219.ec2.internal:58280 (size: 6.9 KB, free: 265.4
>> MB)
>> 15/03/25 18:56:10 INFO storage.BlockManagerMaster: Updated info of block
>> broadcast_5_piece0
>> 15/03/25 18:56:10 INFO spark.SparkContext: Created broadcast 5 from
>> textFile at JSONRelation.scala:98
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "/root/spark/python/pyspark/sql/dataframe.py", line 121, in
>> saveAsParquetFile
>>     self._jdf.saveAsParquetFile(path)
>>   File
>> "/root/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line
>> 538, in __call__
>>   File "/root/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py",
>> line 300, in get_return_value
>> py4j.protocol.Py4JJavaError: An error occurred while calling
>> o22.saveAsParquetFile.
>> : java.lang.IllegalArgumentException: Wrong FS:
>> s3n://com.my.bucket/spark-testing/, expected:
>> hdfs://ec2-52-0-159-113.compute-1.amazonaws.com:9000
>>
>> Is it possible to save a dataframe to s3 directly using parquet?
>>
>> --
>> Stuart Layton