Spark reported a java.lang.IllegalArgumentException with the following message and stack trace:

java.lang.IllegalArgumentException: requirement failed: Found fields with the same name.
        at scala.Predef$.require(Predef.scala:233)
        at org.apache.spark.sql.catalyst.types.StructType.<init>(dataTypes.scala:317)
        at org.apache.spark.sql.catalyst.types.StructType$.fromAttributes(dataTypes.scala:310)
        at org.apache.spark.sql.parquet.ParquetTypesConverter$.convertToString(ParquetTypes.scala:306)
        at org.apache.spark.sql.parquet.ParquetTableScan.execute(ParquetTableOperations.scala:83)
        at org.apache.spark.sql.execution.Filter.execute(basicOperators.scala:57)
        at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:85)
        at org.apache.spark.sql.SchemaRDD.collect(SchemaRDD.scala:433)
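For reference, a stripped-down sketch along these lines should hit the same code path. The table name (events), column name (user_id), and Parquet path are placeholders rather than my real schema, so treat it as an illustration, not the exact job:

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

object DuplicateColumnRepro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local", "duplicate-column-repro")
    val sqlContext = new SQLContext(sc)

    // Load a Parquet file and expose it as a table
    // (use registerAsTable instead on Spark 1.0.x).
    val events = sqlContext.parquetFile("/tmp/events.parquet")
    events.registerTempTable("events")

    // Selecting the same column twice puts two fields with the same
    // name into the output schema of the Parquet scan.
    val result = sqlContext.sql(
      "SELECT user_id, user_id FROM events WHERE user_id IS NOT NULL")

    result.collect().foreach(println)
    sc.stop()
  }
}

Judging from the stack trace, the require() that fails is StructType's check that field names are unique, hit when ParquetTableScan converts its output attributes back into a schema via StructType.fromAttributes.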
After some trial and error, it seems the failure is caused by duplicated columns in my SELECT clause. I duplicated the column on purpose so that my code can parse the result correctly. I think we should allow users to specify duplicated columns in the returned result.

--
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/