dongjoon-hyun commented on a change in pull request #26804:
URL: https://github.com/apache/spark/pull/26804#discussion_r561433379



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala
##########
@@ -759,7 +759,7 @@ class ParquetSchemaSuite extends ParquetSchemaTest {
         nullable = true))),
     """message root {
       |  optional group f1 (MAP) {
-      |    repeated group map (MAP_KEY_VALUE) {
+      |    repeated group key_value (MAP_KEY_VALUE) {

Review comment:
       So, are you saying that there is no breaking change, @wangyum?
   @srowen's question is asking why we need this change, isn't it?
   

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
##########
@@ -127,6 +127,9 @@ class ParquetFileFormat
       conf.setEnum(ParquetOutputFormat.JOB_SUMMARY_LEVEL, JobSummaryLevel.NONE)
     }
 
+    // PARQUET-1580: Disables page-level CRC checksums by default.

Review comment:
       Could you add a comment explaining why you disable it? It looks
like a workaround to avoid a Parquet-side performance regression.

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
##########
@@ -127,6 +127,9 @@ class ParquetFileFormat
       conf.setEnum(ParquetOutputFormat.JOB_SUMMARY_LEVEL, JobSummaryLevel.NONE)
     }
 
+    // PARQUET-1580: Disables page-level CRC checksums by default.

Review comment:
       Wow. Then it's a real bug. Thanks for the confirmation.



