[ 
https://issues.apache.org/jira/browse/ORC-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15994251#comment-15994251
 ] 

ASF GitHub Bot commented on ORC-185:
------------------------------------

Github user wgtmac commented on a diff in the pull request:

    https://github.com/apache/orc/pull/116#discussion_r114469297
  
    --- Diff: c++/src/Statistics.hh ---
    @@ -41,49 +41,181 @@ namespace orc {
       };
     
     /**
    + * Internal Statistics Implementation
    + */
    +
    +  template <typename T>
    +  class InternalStatisticsImpl {
    +  private:
    +    bool hasNull_;
    +    bool hasMinimum_;
    +    bool hasMaximum_;
    +    bool hasSum_;
    +    bool hasTotalLength_;
    +    uint64_t totalLength_;
    +    uint64_t valueCount_;
    +    T minimum_;
    +    T maximum_;
    +    T sum_;
    +  public:
    +    InternalStatisticsImpl() {
    +      hasNull_ = false;
    +      hasMinimum_ = false;
    +      hasMaximum_ = false;
    +      hasSum_ = false;
    +      hasTotalLength_ = false;
    +      totalLength_ = -1;
    --- End diff --
    
    If I add a function such as void update(std::string str) to 
StringColumnStatistics to update the string stats, this default becomes a 
problem. For the first string, the function has to set totalLength_ to the 
string's length; for every later string it uses addition. This works, but the 
code is not elegant. 
    
    Similarly, if I add a function such as void increase(uint64_t count), the 
same thing happens. I think a default value of 0 is cleaner in these 
cases.
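
    To illustrate the point about the default, here is a minimal sketch. The 
two classes are hypothetical, simplified stand-ins and not the actual ORC 
classes; only the member names hasTotalLength_ and totalLength_ come from the 
diff. Variant A keeps the -1 default, which wraps to the largest uint64_t 
value, so update() has to special-case the first string. Variant B starts at 
0 and only ever adds.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-ins for illustration only; not the ORC implementation.

// Variant A: totalLength_ starts at -1, which wraps to UINT64_MAX for uint64_t.
class SentinelLengthStats {
 public:
  void update(const std::string& str) {
    if (!hasTotalLength_) {
      // First string: overwrite the sentinel instead of adding to it.
      totalLength_ = str.size();
      hasTotalLength_ = true;
    } else {
      // Later strings: plain addition.
      totalLength_ += str.size();
    }
  }
  bool hasTotalLength() const { return hasTotalLength_; }
  uint64_t totalLength() const { return totalLength_; }

 private:
  bool hasTotalLength_ = false;
  uint64_t totalLength_ = static_cast<uint64_t>(-1);
};

// Variant B: totalLength_ starts at 0, so every string takes the same path.
class ZeroLengthStats {
 public:
  void update(const std::string& str) {
    totalLength_ += str.size();
    hasTotalLength_ = true;
  }
  bool hasTotalLength() const { return hasTotalLength_; }
  uint64_t totalLength() const { return totalLength_; }

 private:
  bool hasTotalLength_ = false;
  uint64_t totalLength_ = 0;
};
```

    Both variants report the same total after the same sequence of updates; 
the difference is only the extra branch that the sentinel default forces into 
update(). A hypothetical increase(uint64_t count) for a counter would need the 
same branch if that counter also started at a sentinel.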


> [C++] Simplify Statistics Implementation
> ----------------------------------------
>
>                 Key: ORC-185
>                 URL: https://issues.apache.org/jira/browse/ORC-185
>             Project: ORC
>          Issue Type: Bug
>            Reporter: Deepak Majeti
>            Assignee: Deepak Majeti
>
> There is a lot of code duplication in the current ColumnStatistics 
> implementation. The scope of this JIRA is to use templates to reuse code as 
> much as possible.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
