GitHub user wgtmac commented on a diff in the pull request:

    https://github.com/apache/orc/pull/116#discussion_r114446720
  
    --- Diff: c++/src/Statistics.hh ---
    @@ -41,49 +41,181 @@ namespace orc {
       };
     
     /**
    + * Internal Statistics Implementation
    + */
    +
    +  template <typename T>
    --- End diff --
    
    We may need functions such as void increase(uint64_t count) to increase 
valueCount; I can add them when needed. 
    My main concern with using templates is that, to implement the writers, we 
need to compare, update, and merge ColumnStatistics and convert them to their 
protobuf form, and templates will still introduce some duplicate code. If we 
want class ColumnStatistics to handle the update itself (e.g. use 
ColumnStatistics<T>::update(T value) to update min/max for type T), we still 
need template specializations for types such as Date, Timestamp, and Decimal. 
Otherwise, the specific ColumnWriters have to be responsible for the update 
(e.g. DecimalColumnWriter compares the decimal values against min/max and then 
calls setMax/setMin on ColumnStatistics<Decimal> to update them).
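
To make the two options concrete, here is a minimal sketch. It is not the code in this PR: the Decimal struct, decimalLess, hasMinMax, and the getters are hypothetical stand-ins invented for illustration, and the comparison ignores overflow. Option 1 specializes ColumnStatistics<Decimal>::update; option 2 leaves the comparison to a DecimalColumnWriter that calls setMin/setMax.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a decimal value; NOT the actual ORC Decimal class.
struct Decimal {
  int64_t unscaled;  // e.g. 1250 with scale 2 means 12.50
  int32_t scale;
};

// Scale-aware "less than" (illustrative only; ignores overflow).
inline bool decimalLess(Decimal a, Decimal b) {
  while (a.scale < b.scale) { a.unscaled *= 10; ++a.scale; }
  while (b.scale < a.scale) { b.unscaled *= 10; ++b.scale; }
  return a.unscaled < b.unscaled;
}

template <typename T>
class ColumnStatistics {
 public:
  // Bump the value count without touching min/max.
  void increase(uint64_t count) { valueCount_ += count; }

  // Option 1: the statistics object compares values itself (needs operator<).
  void update(const T& value) {
    if (!hasMinMax_ || value < min_) min_ = value;
    if (!hasMinMax_ || max_ < value) max_ = value;
    hasMinMax_ = true;
    ++valueCount_;
  }

  // Option 2: the caller compares and pushes the result in.
  void setMin(const T& v) { min_ = v; hasMinMax_ = true; }
  void setMax(const T& v) { max_ = v; hasMinMax_ = true; }

  bool hasMinMax() const { return hasMinMax_; }
  const T& getMin() const { assert(hasMinMax_); return min_; }
  const T& getMax() const { assert(hasMinMax_); return max_; }
  uint64_t getValueCount() const { return valueCount_; }

 private:
  T min_{};
  T max_{};
  bool hasMinMax_ = false;
  uint64_t valueCount_ = 0;
};

// Option 1 for Decimal: the duplicated logic a specialization requires,
// because Decimal has no operator< and must be compared by scale.
template <>
inline void ColumnStatistics<Decimal>::update(const Decimal& value) {
  if (!hasMinMax_ || decimalLess(value, min_)) min_ = value;
  if (!hasMinMax_ || decimalLess(max_, value)) max_ = value;
  hasMinMax_ = true;
  ++valueCount_;
}

// Option 2 for Decimal: the writer owns the comparison.
class DecimalColumnWriter {
 public:
  explicit DecimalColumnWriter(ColumnStatistics<Decimal>& stats)
      : stats_(stats) {}

  void add(const Decimal& value) {
    if (!stats_.hasMinMax()) {
      stats_.setMin(value);
      stats_.setMax(value);
    } else {
      if (decimalLess(value, stats_.getMin())) stats_.setMin(value);
      if (decimalLess(stats_.getMax(), value)) stats_.setMax(value);
    }
    stats_.increase(1);
  }

 private:
  ColumnStatistics<Decimal>& stats_;
};
```

Either way the scale-aware comparison has to live somewhere: in a per-type specialization (option 1) or in each type-specific writer (option 2).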

