Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/22880#discussion_r229451788
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala
---
@@ -202,11 +204,15 @@ private[parquet] class ParquetRowConverter(
     override def start(): Unit = {
       var i = 0
-      while (i < currentRow.numFields) {
+      while (i < fieldConverters.length) {
         fieldConverters(i).updater.start()
         currentRow.setNullAt(i)
--- End diff --
That is correct. Now that we're passing a Parquet schema that's a
(non-strict) subset of the Catalyst schema, we cannot assume that their fields
are in 1:1 correspondence.
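
To make the point concrete, here is a minimal, self-contained sketch of the situation. It is not Spark's code: `Updater`, `Converter`, `Row`, `SubsetSchemaSketch`, and the field names are hypothetical stand-ins, and it assumes the Parquet fields line up with the leading positions of the row, as the loop in the diff does. It only illustrates why the loop bound has to come from the converters once the row can be wider than the Parquet schema.

```scala
// Hypothetical stand-ins for illustration only; not Spark's actual classes.
final class Updater {
  var started = false
  def start(): Unit = { started = true }
}

final class Converter(val updater: Updater)

final class Row(val numFields: Int) {
  private val nulls = Array.fill(numFields)(false)
  def setNullAt(i: Int): Unit = { nulls(i) = true }
  def isNullAt(i: Int): Boolean = nulls(i)
}

object SubsetSchemaSketch {
  // Assumed example schemas: the Parquet schema is a subset of the Catalyst one.
  val catalystFields: Seq[String] = Seq("id", "name", "address")
  val parquetFields: Seq[String] = Seq("id", "name")

  // The row is sized by the Catalyst schema, the converters by the Parquet schema.
  val currentRow = new Row(catalystFields.length)
  val fieldConverters: Array[Converter] =
    parquetFields.map(_ => new Converter(new Updater)).toArray

  def start(): Unit = {
    var i = 0
    // Bounding the loop by currentRow.numFields (3) would index
    // fieldConverters(2) and throw ArrayIndexOutOfBoundsException.
    while (i < fieldConverters.length) {
      fieldConverters(i).updater.start()
      currentRow.setNullAt(i)
      i += 1
    }
  }
}
```

Calling `SubsetSchemaSketch.start()` visits only the two converters that exist, so the third row slot is never touched by this loop.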