Siying Dong created SPARK-51492:
-----------------------------------
Summary: FileStreamSource: Avoid expensive file concatenation if
trace level is not enabled.
Key: SPARK-51492
URL: https://issues.apache.org/jira/browse/SPARK-51492
Project: Spark
Issue Type: Task
Components: Structured Streaming
Affects Versions: 4.0.0
Reporter: Siying Dong
In this statement:
[https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala#L402]
files.mkString("\n\t") can be very expensive when there are many files. It
concatenates every listed file, including files that this batch will not
process, and it runs even when the trace log level is not enabled.
We should skip this operation unless the trace log level is enabled.
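
One possible shape for the change is sketched below. This is only an illustration and not the actual Spark patch: SketchLogger, describeFiles, logFiles and concatCount are hypothetical names, and the logger is a stand-in rather than Spark's Logging trait. It shows the concatenation being guarded by an explicit level check, with a counter to confirm that mkString is never called when trace is off.

```scala
object TraceGuardSketch {
  // Hypothetical minimal logger; stands in for a real logging API.
  final class SketchLogger(val isTraceEnabled: Boolean) {
    val lines = scala.collection.mutable.ArrayBuffer.empty[String]
    def trace(msg: String): Unit = if (isTraceEnabled) lines += msg
  }

  // Counts how often the costly concatenation actually runs.
  var concatCount = 0

  // The expensive part: joins every file name into one large string.
  def describeFiles(files: Seq[String]): String = {
    concatCount += 1
    files.mkString("\n\t")
  }

  // Build the message only when the trace level is enabled.
  def logFiles(log: SketchLogger, files: Seq[String]): Unit = {
    if (log.isTraceEnabled) {
      log.trace("Files are:\n\t" + describeFiles(files))
    }
  }

  def main(args: Array[String]): Unit = {
    // Made-up file names, for illustration only.
    val files = (1 to 100000).map(i => s"/data/part-$i.parquet")
    logFiles(new SketchLogger(isTraceEnabled = false), files)
    println(s"concatenations with trace disabled: $concatCount")
    logFiles(new SketchLogger(isTraceEnabled = true), files)
    println(s"concatenations with trace enabled: $concatCount")
  }
}
```

The explicit check keeps the cost visible at the call site. A by-name message parameter (msg: => String) would be another way to defer the work, assuming the logging API in use supports it.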
--
This message was sent by Atlassian Jira
(v8.20.10#820010)