Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/22880#discussion_r231243760
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala
---
@@ -202,11 +204,15 @@ private[parquet] class ParquetRowConverter(
   override def start(): Unit = {
     var i = 0
-    while (i < currentRow.numFields) {
+    while (i < fieldConverters.length) {
       fieldConverters(i).updater.start()
       currentRow.setNullAt(i)
--- End diff ---
Thank you both for your feedback.
> Seems it can save some redundant iterations.
That was my motivation for writing the code this way. While the code is not
as clear as it could be, it is very performance-critical.
I'm going to push a new commit that keeps the current code but adds a brief
explanatory comment.
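For illustration, here is a minimal, self-contained sketch of the loop-bound change under discussion. The class and field names below are hypothetical stand-ins, not the actual Spark `ParquetRowConverter` internals: the point is only that when the row can carry more fields than there are converters, bounding the loop by the converter count avoids iterations that have no converter to start.

```scala
// Hypothetical stand-ins for the converter/updater machinery in the diff.
final class Updater {
  var started = 0
  def start(): Unit = started += 1
}
final class FieldConverter {
  val updater = new Updater
}

// Suppose the row has more fields than there are converters
// (e.g. after schema clipping). Bounding the loop by
// fieldConverters.length skips the extra iterations.
val fieldConverters = Array.fill(3)(new FieldConverter)
val rowNumFields = 5 // hypothetical: the row's field count

var i = 0
while (i < fieldConverters.length) { // not rowNumFields
  fieldConverters(i).updater.start()
  i += 1
}

// Every converter was started exactly once; no out-of-bounds access
// and no wasted iterations for the two converter-less fields.
assert(fieldConverters.forall(_.updater.started == 1))
```

This is only a sketch of the loop-bound reasoning, not the real converter code, which also nulls out the row slot on each iteration as shown in the diff.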
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]