GitHub user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/20619#discussion_r168929352
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -395,16 +395,21 @@ class ParquetFileFormat
ParquetInputFormat.setFilterPredicate(hadoopAttemptContext.getConfiguration,
pushed.get)
}
val taskContext = Option(TaskContext.get())
- val parquetReader = if (enableVectorizedReader) {
+ if (enableVectorizedReader) {
val vectorizedReader = new VectorizedParquetRecordReader(
convertTz.orNull, enableOffHeapColumnVector &&
taskContext.isDefined, capacity)
+ val iter = new RecordReaderIterator(vectorizedReader)
+ // SPARK-23457 Register a task completion listener before `initialization`.
--- End diff --
Now, `SPARK-23457` is added.
---