LuciferYang opened a new pull request, #37115: URL: https://github.com/apache/spark/pull/37115
### What changes were proposed in this pull request?

This PR adds a call to `vector.setIsConstant()` during `ParquetColumnVector` initialization when a missing column has a `defaultValue` and `vector.appendObjects(capacity, defaultValue).isPresent()` is true.

### Why are the changes needed?

This is a minor improvement. For a missing column with a default value, setting `isConstant` to true prevents the `reset()` method from restoring the internal state of `WritableColumnVector`. `OrcColumnarBatchReader` already does something similar for missing columns:

https://github.com/LuciferYang/spark/blob/57c82ea1e561cbfad4cbc7fd3880036b0bb39ab6/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.java#L176-L191

Without this change there is no bug: a missing column is initialized only once and its corresponding `columnReader` is null, so `reset()` only resets `WritableColumnVector#elementsAppended` to 0, which does not affect anything.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Pass GitHub Actions
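
To illustrate the `reset()` behavior described under "Why are the changes needed?", here is a minimal sketch. `ToyVector` is a hypothetical, heavily simplified stand-in written for this example. It is not Spark's `WritableColumnVector`, and it models only an append counter and a constant flag, on the assumption that `reset()` returns early for a constant vector.

```java
public class ConstantVectorSketch {

    /** Hypothetical toy vector; NOT Spark's WritableColumnVector. */
    static final class ToyVector {
        private final int[] data;
        private int elementsAppended = 0;
        private boolean isConstant = false;

        ToyVector(int capacity) {
            this.data = new int[capacity];
        }

        /** Appends `count` copies of `value`; returns false if they do not fit. */
        boolean appendInts(int count, int value) {
            if (elementsAppended + count > data.length) {
                return false;
            }
            for (int i = 0; i < count; i++) {
                data[elementsAppended++] = value;
            }
            return true;
        }

        /** Marks the vector as constant so that reset() leaves it untouched. */
        void setIsConstant() {
            isConstant = true;
        }

        /** Assumed behavior: a constant vector skips the reset entirely. */
        void reset() {
            if (isConstant) {
                return;
            }
            elementsAppended = 0; // stored values are not cleared, only the counter
        }

        int elementsAppended() {
            return elementsAppended;
        }

        int get(int index) {
            return data[index];
        }
    }

    public static void main(String[] args) {
        int capacity = 4;
        int defaultValue = 42;

        // Missing column filled with its default value and marked constant.
        ToyVector constantVec = new ToyVector(capacity);
        if (constantVec.appendInts(capacity, defaultValue)) {
            constantVec.setIsConstant();
        }

        // Same fill, but without the constant flag.
        ToyVector plainVec = new ToyVector(capacity);
        plainVec.appendInts(capacity, defaultValue);

        constantVec.reset();
        plainVec.reset();

        System.out.println("constant: " + constantVec.elementsAppended());
        System.out.println("plain: " + plainVec.elementsAppended());
        System.out.println("plain value still readable: " + plainVec.get(0));
        // prints:
        // constant: 4
        // plain: 0
        // plain value still readable: 42
    }
}
```

In this toy model the non-constant vector loses only its append counter on `reset()`, while the stored default values stay readable. That matches the PR's statement that the missing call is not a bug, only an avoidable state change.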
