GitHub user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21814#discussion_r203638199
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -384,12 +385,10 @@ class ParquetFileFormat
      // *only* if the file was created by something other than "parquet-mr", so check the actual
      // writer here for this file. We have to do this per-file, as each file in the table may
      // have different writers.
-     def isCreatedByParquetMr(): Boolean = {
-       val footer = ParquetFileReader.readFooter(sharedConf, filePath, SKIP_ROW_GROUPS)
-       footer.getFileMetaData().getCreatedBy().startsWith("parquet-mr")
-     }
+     val isCreatedByParquetMr = footerFileMetaData.getCreatedBy().startsWith("parquet-mr")
--- End diff --
Hm? As a plain `val`, `isCreatedByParquetMr` is evaluated eagerly right here, whereas the
`def` it replaces only ran when it was called. We should make `isCreatedByParquetMr` a
`lazy val` too.
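
For illustration, here is a minimal sketch of the evaluation-order difference. `readCreatedBy()` and the `footerReads` counter are hypothetical stand-ins for the footer read, not Spark or Parquet APIs, and the created-by string is made up:

```scala
object LazyEvalSketch {
  // Counts how many times the (hypothetical) footer read is triggered.
  var footerReads = 0

  // Hypothetical stand-in for reading the created-by string from a Parquet footer.
  def readCreatedBy(): String = {
    footerReads += 1
    "parquet-mr version 1.10.0"
  }

  // Returns the read count observed after each step.
  def run(): Seq[Int] = {
    footerReads = 0

    // `lazy val`: declaring it does not trigger a read.
    lazy val lazyCheck = readCreatedBy().startsWith("parquet-mr")
    val afterLazyDecl = footerReads

    // Plain `val`: the right-hand side runs immediately, at the declaration.
    val eagerCheck = readCreatedBy().startsWith("parquet-mr")
    val afterValDecl = footerReads

    // First use forces `lazyCheck` once; the second use reuses the cached value.
    val bothTrue = eagerCheck && lazyCheck && lazyCheck
    require(bothTrue, "both checks should see the parquet-mr prefix")
    val afterUse = footerReads

    Seq(afterLazyDecl, afterValDecl, afterUse)
  }

  def main(args: Array[String]): Unit =
    println(run()) // prints List(0, 1, 2)
}
```

With a `lazy val`, code paths that never consult the flag never pay for the read, which is the behaviour the old `def` had (minus the re-evaluation on each call).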
---