[ https://issues.apache.org/jira/browse/SPARK-43343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jungtaek Lim reassigned SPARK-43343:
------------------------------------

    Assignee: Siying Dong

> Spark Streaming is not able to read a .txt file whose name has [] special character
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-43343
>                 URL: https://issues.apache.org/jira/browse/SPARK-43343
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 3.4.0
>            Reporter: Siying Dong
>            Assignee: Siying Dong
>            Priority: Minor
>
> * For example, if a directory contains the following file:
> /path/abc[123]
> and a user loads spark.readStream.format("text").load("/path") as stream
> input, it throws an exception saying there is no matching path /path/abc[123].
> Spark treats abc[123] as a glob pattern that matches only files named abc1,
> abc2, and abc3.
> * Upon investigation, this is due to how we
> [getBatch|https://github.com/databricks/runtime/blob/3af402d23620a0952e151d96c3184d2233217c87/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala#L269]
> in the FileStreamSource. `FileStreamSource` already performs file pattern
> matching and finds all matching file names. However, DataSource checks for
> glob characters again and tries to expand them
> [here|https://github.com/databricks/runtime/blob/3af402d23620a0952e151d96c3184d2233217c87/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L274].
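
The matching behaviour described in the report can be reproduced without Spark. The sketch below is only an illustration: it uses Python's standard `fnmatch` and `glob` modules as a stand-in for the glob matching that Spark delegates to, on the assumption that both treat `[123]` as a character class. It is not the code path in `FileStreamSource` or `DataSource`, and the escaping syntax differs between glob implementations.

```python
import fnmatch
import glob

# The literal file name from the report.
file_name = "abc[123]"

# Interpreted as a glob, "[123]" is a character class that matches exactly
# one of '1', '2' or '3', so the pattern does not match its own literal text.
candidates = ["abc1", "abc2", "abc3", "abc[123]"]
matches_as_glob = {
    name: fnmatch.fnmatchcase(name, file_name) for name in candidates
}

# Escaping the glob metacharacters before matching makes the pattern match
# only the literal file name. A path that has already been resolved to a
# concrete file should be handled this way, or not be globbed again at all.
escaped = glob.escape(file_name)
matches_literal = fnmatch.fnmatchcase(file_name, escaped)

print(matches_as_glob)
print(escaped, matches_literal)
```

Here the second round of pattern expansion is what loses the file: the name was already resolved once, and re-reading it as a pattern changes its meaning.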