Hi, I'm having problems writing DataFrames with PySpark 1.6.0. If I create a small DataFrame and write it like this:
    sqlContext.createDataFrame(pandas.DataFrame.from_dict([{'x': 1}])).write.orc('test-orc')

only the _SUCCESS file is written to the output directory; there are no part files. The executor log shows the task output being saved under test-orc/_temporary/.

Writing with parquet rather than ORC, I get the same output (a _SUCCESS file, no part files), but also an exception:

    java.lang.NullPointerException
        at org.apache.parquet.hadoop.ParquetFileWriter.mergeFooters(ParquetFileWriter.java:456)

which matches "Writing empty Dataframes doesn't save any _metadata files" (https://issues.apache.org/jira/browse/SPARK-15393).

If I do the equivalent in Scala, things work as expected.

Any suggestions as to what could be happening? Much appreciated,

--Ethan
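P.S. In case it helps, here's the repro as a self-contained script rather than shell input. The SparkContext/SQLContext lines are just the standard setup I'd assume for a standalone job (in the pyspark shell, sc and sqlContext already exist); the write calls are exactly the ones above.

    from pyspark import SparkContext
    from pyspark.sql import SQLContext
    import pandas

    sc = SparkContext(appName='orc-write-repro')  # app name is arbitrary
    sqlContext = SQLContext(sc)

    # One-row Spark DataFrame built from a pandas DataFrame
    df = sqlContext.createDataFrame(pandas.DataFrame.from_dict([{'x': 1}]))

    # Expected: test-orc/ containing _SUCCESS plus part files.
    # Observed: only _SUCCESS; task output stays under test-orc/_temporary/.
    df.write.orc('test-orc')

    # Same symptom with parquet, plus the NullPointerException
    # in ParquetFileWriter.mergeFooters:
    # df.write.parquet('test-parquet')

    sc.stop()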