Yang Jie created SPARK-29454:
--------------------------------
Summary: Reduce one unsafeProjection call when read parquet file
Key: SPARK-29454
URL: https://issues.apache.org/jira/browse/SPARK-29454
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 2.4.4, 2.3.4, 2.2.3
Reporter: Yang Jie
ParquetGroupConverter calls the unsafeProjection function to convert a
SpecificInternalRow to an UnsafeRow every time a Parquet data file is read
with ParquetRecordReader. ParquetFileFormat then calls unsafeProjection again
to convert this UnsafeRow to another UnsafeRow when partitionSchema is not
empty. On the other hand, PartitionReaderWithPartitionValues always performs
this conversion when DataSourceV2 is used.
I think the first conversion in ParquetGroupConverter is redundant, and it is
enough for ParquetRecordReader to return a SpecificInternalRow.
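
To make the counting concrete, here is a rough toy model of the two read
paths. It is not Spark code: GenericRow, PackedRow, CountingProjection,
read_current and read_proposed are hypothetical stand-ins for
SpecificInternalRow, UnsafeRow, the unsafe projection and the reader logic,
and it assumes the outer layer always applies one projection in the proposed
path.

```python
from dataclasses import dataclass


@dataclass
class GenericRow:
    """Hypothetical stand-in for SpecificInternalRow."""
    values: tuple


@dataclass
class PackedRow:
    """Hypothetical stand-in for UnsafeRow."""
    values: tuple


class CountingProjection:
    """Hypothetical stand-in for an unsafe projection; counts its calls."""

    def __init__(self):
        self.calls = 0

    def __call__(self, row):
        self.calls += 1
        return PackedRow(tuple(row.values))


def read_current(records, partition_values, projection):
    """Model of the current path: the converter projects every record,
    then the file format projects again if partition values exist."""
    out = []
    for rec in records:
        row = projection(GenericRow(rec))  # first conversion (converter)
        if partition_values:
            joined = GenericRow(row.values + partition_values)
            row = projection(joined)  # second conversion (file format)
        out.append(row)
    return out


def read_proposed(records, partition_values, projection):
    """Model of the proposal: the reader returns the generic row and
    only the outer layer projects, once per record."""
    out = []
    for rec in records:
        row = GenericRow(rec)  # no conversion in the converter
        out.append(projection(GenericRow(row.values + partition_values)))
    return out
```

With two records and one partition value, the first function runs the
projection four times and the second runs it twice, while both return the
same rows.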