[ https://issues.apache.org/jira/browse/SPARK-29454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan reassigned SPARK-29454:
-----------------------------------

    Assignee: Yang Jie

> Reduce unsafeProjection call times when read parquet file
> ---------------------------------------------------------
>
>                 Key: SPARK-29454
>                 URL: https://issues.apache.org/jira/browse/SPARK-29454
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.2.3, 2.3.4, 2.4.4
>            Reporter: Yang Jie
>            Assignee: Yang Jie
>            Priority: Major
>
> When a Parquet data file is read through ParquetRecordReader,
> ParquetGroupConverter calls unsafeProjection on every row to convert the
> SpecificInternalRow to an UnsafeRow. When partitionSchema is not empty,
> ParquetFileFormat then calls unsafeProjection again to convert that
> UnsafeRow to another UnsafeRow. With DataSourceV2,
> PartitionReaderWithPartitionValues always performs this conversion.
> I think the first conversion in ParquetGroupConverter is redundant, and
> it is enough for ParquetRecordReader to return a SpecificInternalRow.
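
The redundancy described in the issue can be illustrated with a toy model. The sketch below is not Spark code: the class ProjectionSketch, its methods, and the use of Object[] as a row are all hypothetical stand-ins, and it assumes a non-empty partition schema so that the second projection always runs. It only counts how many projection calls each read path makes per row.

```java
import java.util.Arrays;

public class ProjectionSketch {
    /** Counts how many times the toy projection runs (illustrative only). */
    static int projectionCalls = 0;

    /** Toy stand-in for unsafeProjection: copies the row into a new array. */
    static Object[] project(Object[] row) {
        projectionCalls++;
        return Arrays.copyOf(row, row.length);
    }

    /** Toy stand-in for appending partition values and projecting the joined row. */
    static Object[] projectWithPartition(Object[] row, Object[] partitionValues) {
        Object[] joined = Arrays.copyOf(row, row.length + partitionValues.length);
        System.arraycopy(partitionValues, 0, joined, row.length, partitionValues.length);
        return project(joined);
    }

    /** Path described in the issue: the reader projects, then the partition join projects again. */
    static int readWithDoubleProjection(Object[][] rows, Object[] partitionValues) {
        projectionCalls = 0;
        for (Object[] row : rows) {
            Object[] fromReader = project(row);                 // first conversion (in the converter)
            projectWithPartition(fromReader, partitionValues);  // second conversion (partition join)
        }
        return projectionCalls;
    }

    /** Proposed path: the reader hands back the row unchanged, so only one projection runs. */
    static int readWithSingleProjection(Object[][] rows, Object[] partitionValues) {
        projectionCalls = 0;
        for (Object[] row : rows) {
            projectWithPartition(row, partitionValues);
        }
        return projectionCalls;
    }

    public static void main(String[] args) {
        Object[][] rows = { {1, "a"}, {2, "b"}, {3, "c"} };
        Object[] partitionValues = { "2019-10-12" };
        System.out.println(readWithDoubleProjection(rows, partitionValues)); // prints 6
        System.out.println(readWithSingleProjection(rows, partitionValues)); // prints 3
    }
}
```

In this model the proposed path halves the number of projection calls per row; what that saves in a real Spark job would need to be measured.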