mapleFU commented on PR #35989:
URL: https://github.com/apache/arrow/pull/35989#issuecomment-1581985853

   Another problem is that:
   
   ```c++
   template <typename DType>
   static std::shared_ptr<Statistics> MakeTypedColumnStats(
       const format::ColumnMetaData& metadata, const ColumnDescriptor* descr) {
     // If ColumnOrder is defined, return max_value and min_value
     if (descr->column_order().get_order() == ColumnOrder::TYPE_DEFINED_ORDER) {
       return MakeStatistics<DType>(
           descr, metadata.statistics.min_value, metadata.statistics.max_value,
           metadata.num_values - metadata.statistics.null_count,
           metadata.statistics.null_count, metadata.statistics.distinct_count,
           metadata.statistics.__isset.max_value || metadata.statistics.__isset.min_value,
           metadata.statistics.__isset.null_count,
           metadata.statistics.__isset.distinct_count);
     }
     // Default behavior
     return MakeStatistics<DType>(
         descr, metadata.statistics.min, metadata.statistics.max,
         metadata.num_values - metadata.statistics.null_count,
         metadata.statistics.null_count, metadata.statistics.distinct_count,
         metadata.statistics.__isset.max || metadata.statistics.__isset.min,
         metadata.statistics.__isset.null_count, metadata.statistics.__isset.distinct_count);
   }
   ```
   
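   For files written by Arrow this is fine. However, when
`!metadata.statistics.__isset.null_count`, the `null_count` field carries no real value, so the
`num_values` argument computed as `metadata.num_values - metadata.statistics.null_count` would be wrong.
I don't currently know how to fix it.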
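
To make the problematic case concrete, here is a minimal sketch of one possible direction, not a proposed patch. It assumes simplified stand-in types: `ColumnMeta`, `has_null_count`, and `NonNullCount` are hypothetical names that only mirror the fields used above, not the Thrift-generated `format::ColumnMetaData`. The idea is to report the non-null count as unknown when `null_count` was never written, instead of silently subtracting a default value.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the metadata fields discussed above.
// `has_null_count` plays the role of `statistics.__isset.null_count`.
struct ColumnMeta {
  int64_t num_values = 0;       // total values in the column chunk, nulls included
  int64_t null_count = 0;       // only meaningful when has_null_count is true
  bool has_null_count = false;  // whether the writer recorded null_count
};

// Returns the number of non-null values, or std::nullopt when the writer
// did not record null_count and the subtraction would be meaningless.
std::optional<int64_t> NonNullCount(const ColumnMeta& metadata) {
  if (!metadata.has_null_count) {
    return std::nullopt;
  }
  return metadata.num_values - metadata.null_count;
}
```

With `num_values = 10`, `null_count = 3`, and `has_null_count = true`, `NonNullCount` yields 7. With `has_null_count = false` it yields an empty optional, so the caller has to decide explicitly how to treat the unknown count. Whether `MakeStatistics` could accept such an "unknown" value is an open question here.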

