[ https://issues.apache.org/jira/browse/SPARK-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-15473.
----------------------------------
    Resolution: Cannot Reproduce

Yup, I just double-checked on master too. Let me leave this resolved.

> CSV fails to write and read back empty dataframe
> ------------------------------------------------
>
> Key: SPARK-15473
> URL: https://issues.apache.org/jira/browse/SPARK-15473
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.0.0
> Reporter: Hyukjin Kwon
> Priority: Major
>
> Currently the CSV data source fails to write and read back empty data.
> The code below:
> {code}
> val emptyDf = spark.range(10).filter(_ => false)
> emptyDf.write
>   .format("csv")
>   .save(path.getCanonicalPath)
> val copyEmptyDf = spark.read
>   .format("csv")
>   .load(path.getCanonicalPath)
> copyEmptyDf.show()
> {code}
> throws the exception below:
> {code}
> Can not create a Path from an empty string
> java.lang.IllegalArgumentException: Can not create a Path from an empty string
>   at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:135)
>   at org.apache.hadoop.util.StringUtils.stringToPath(StringUtils.java:241)
>   at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
>   at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:987)
>   at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:987)
>   at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:178)
>   at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:178)
>   at scala.Option.map(Option.scala:146)
> {code}
> Note that this is a different case from the one with the data below:
> {code}
> val emptyDf = spark.createDataFrame(spark.sparkContext.emptyRDD[Row], schema)
> {code}
> In this case, no writer is initialised or created (there are no calls to
> {{WriterContainer.writeRows()}}).
> Maybe it should be able to read/write the header for the schema as well as
> the empty data.
> This works for Parquet and JSON, but not for CSV.