Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/21669#discussion_r223513489
--- Diff: examples/src/main/scala/org/apache/spark/examples/HdfsTest.scala ---
@@ -41,6 +41,8 @@ object HdfsTest {
val end = System.currentTimeMillis()
println(s"Iteration $iter took ${end-start} ms")
}
+ println(s"File contents: ${file.map(_.toString).collect().mkString(",")}")
--- End diff --
I actually ran this, and the output is really noisy. You're also
concatenating the contents of the whole input directory into one really
long string in memory.
If you want to print something, I'd limit the amount of data shown here.
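One possible way to do that, as a rough sketch rather than a required change: fetch only the first few records with `RDD.take(n)` instead of `collect()`, and truncate long lines before printing. The `PreviewSketch` object, the `preview` helper, and the limits of 10 lines and 80 characters are all hypothetical names and values chosen for illustration; they are not part of `HdfsTest.scala`.

```scala
// Hypothetical helper, not part of HdfsTest.scala: builds a bounded preview
// of a few lines so the printed output stays short.
object PreviewSketch {
  // Truncates each line to at most maxChars characters (marking truncated
  // lines with "...") and joins the lines with newlines.
  def preview(lines: Seq[String], maxChars: Int = 80): String =
    lines.map { line =>
      if (line.length > maxChars) line.take(maxChars) + "..." else line
    }.mkString("\n")

  def main(args: Array[String]): Unit = {
    // Stand-in data. In HdfsTest this would presumably be something like
    //   file.take(10).map(_.toString)
    // where `file` is the RDD from the example. RDD.take(n) returns only
    // the first n elements to the driver, unlike collect().
    val sample = Seq("first line", "x" * 100, "third line")
    println(s"First ${sample.length} lines:\n${preview(sample, maxChars = 20)}")
  }
}
```

With this shape, the amount of data pulled into the driver and the size of the printed string are both capped, regardless of how large the input directory is.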