GitHub user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/20437#discussion_r164976661
--- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala ---
@@ -157,7 +157,7 @@ class FileInputDStream[K, V, F <: NewInputFormat[K, V]](
       val metadata = Map(
         "files" -> newFiles.toList,
         StreamInputInfo.METADATA_KEY_DESCRIPTION -> newFiles.mkString("\n"))
-      val inputInfo = StreamInputInfo(id, 0, metadata)
+      val inputInfo = StreamInputInfo(id, rdds.map(_.count).sum, metadata)
--- End diff --
I'm not sure there's a way to fix this here.