[ https://issues.apache.org/jira/browse/SPARK-32621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-32621:
------------------------------------

    Assignee:     (was: Apache Spark)

> "path" option is added again to input paths during infer()
> ----------------------------------------------------------
>
>                 Key: SPARK-32621
>                 URL: https://issues.apache.org/jira/browse/SPARK-32621
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.6, 3.0.0, 3.0.1, 3.1.0
>            Reporter: Terry Kim
>            Priority: Minor
>
> When the "path" option is used to create a DataFrame, it can cause issues
> during infer().
> {code:java}
> import org.apache.hadoop.fs.{Path, PathFilter}
>
> // Rejects every file whose parent directory is "p=2".
> class TestFileFilter extends PathFilter {
>   override def accept(path: Path): Boolean = path.getParent.getName != "p=2"
> }
> val path = "/tmp"
> val df = spark.range(2)
> df.write.json(path + "/p=1")
> df.write.json(path + "/p=2")
> val extraOptions = Map(
>   "mapred.input.pathFilter.class" -> classOf[TestFileFilter].getName,
>   "mapreduce.input.pathFilter.class" -> classOf[TestFileFilter].getName
> )
> // This works fine.
> assert(spark.read.options(extraOptions).json(path).count == 2)
> // The same read using the "path" option fails with:
> // assertion failed: Conflicting directory structures detected. Suspicious paths
> //   file:/tmp
> //   file:/tmp/p=1
> assert(spark.read.options(extraOptions).format("json").option("path", path).load.count() == 2)
> {code}