wgtmac commented on code in PR #49909:
URL: https://github.com/apache/arrow/pull/49909#discussion_r3207472080


##########
cpp/src/parquet/metadata.cc:
##########
@@ -307,6 +307,20 @@ class ColumnChunkMetaData::ColumnChunkMetaDataImpl {
     possible_encoded_stats_ = nullptr;
     possible_geo_stats_ = nullptr;
     InitKeyValueMetadata();
+
+    // A column with max_definition_level == 0 cannot contain null values, so
+    // a non-zero null_count in its statistics is contradictory. Drop the bad
+    // value so the file remains readable.
+    if (descr_->max_definition_level() == 0 && column_metadata_->__isset.statistics &&
+        column_metadata_->statistics.__isset.null_count &&
+        column_metadata_->statistics.null_count > 0) {
+      if (column_metadata_ != &decrypted_metadata_) {

Review Comment:
   Do we really need this check? A copy is always performed in the unencrypted
case. Would it be better to remove it?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.