[GitHub] incubator-trafodion pull request: [TRAFODION-1618] Fix row estimat...

selvaganesang Fri, 18 Dec 2015 08:16:21 -0800

Github user selvaganesang commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/229#discussion_r48040857
  
    --- Diff: core/sql/src/main/java/org/trafodion/sql/HBaseClient.java ---
    @@ -1088,36 +1139,65 @@ public boolean estimateRowCount(String tblName, int partialRowSize,
               //printQualifiers(reader, 100);
               if (ROWS_TO_SAMPLE > 0 &&
                   totalEntries == reader.getEntries()) {  // first file only
    -            // Trafodion column qualifiers are ordinal numbers, which
    -            // makes it easy to count missing (null) values. We also count
    -            // the non-Put KVs (typically delete-row markers) to estimate
    -            // their frequency in the full file set.
    +
    +            // Trafodion column qualifiers are ordinal numbers, but are represented
    +            // as varying length unsigned little-endian integers in lexicographical
    +            // order. So, for example, in a table with 260 columns, the column
    +            // qualifiers (if present) will be read in this order:
    +            // 1 (x'01'), 257 (x'0101'), 2 (x'02'), 258 (x'0201'), 3 (x'03'),
    +            // 259 (x'0301'), 4 (x'04'), 260 (x'0401'), 5 (x'05'), 6 (x'06'),
    +            // 7 (x'07'), ...
    +            // We have crossed the boundary to the next row if and only if the
    +            // next qualifier read is less than or equal to the previous,
    +            // compared unsigned, lexicographically.
    +
    --- End diff --
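
The row-boundary rule described in the new comment can be illustrated with a small standalone sketch. It is not the code from HBaseClient.java: the class and method names below are hypothetical, and it assumes the qualifiers of consecutive KeyValues are already available as byte arrays in the order they were read.

```java
import java.util.List;

// Hypothetical illustration only; not taken from HBaseClient.java.
class QualifierBoundarySketch {

    // Compare two byte arrays lexicographically, treating each byte as unsigned.
    // A shorter array that is a prefix of a longer one sorts first.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Count rows in a stream of column qualifiers: a new row starts at the
    // first qualifier and whenever a qualifier is less than or equal to the
    // one read just before it.
    static int countRows(List<byte[]> qualifiers) {
        int rows = 0;
        byte[] previous = null;
        for (byte[] q : qualifiers) {
            if (previous == null || compareUnsigned(q, previous) <= 0) {
                rows++;
            }
            previous = q;
        }
        return rows;
    }
}
```

For example, the qualifier sequence x'01', x'0101', x'02', x'01', x'02', x'0201' contains one boundary (the second x'01' is less than the x'02' before it), so the sketch counts two rows.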
    
    I wonder if it is possible to estimate the row count from getEntries(), the
    number of columns in the table, and the number of default-value/null columns.
    We also have avgKeyLen, avgValueLen and fileSize, which might aid in row
    estimation.
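
As a rough sketch of the arithmetic behind this suggestion: if each row stores one KeyValue per column that is present, the row count is approximately the entry count divided by the average number of stored columns per row. The class, method, and parameter names below are hypothetical, and the sketch assumes the average number of absent (null or default-valued) columns per row is known from somewhere; it also ignores non-Put KeyValues such as delete markers.

```java
// Hypothetical illustration only; not part of the pull request.
class RowEstimateSketch {

    // entries:           total KeyValue count, e.g. the value reported by getEntries()
    // numColumns:        number of columns defined in the table
    // avgAbsentColumns:  assumed average number of columns per row with no
    //                    stored cell (null or default value)
    static long estimateRows(long entries, int numColumns, double avgAbsentColumns) {
        double cellsPerRow = numColumns - avgAbsentColumns;
        if (cellsPerRow <= 0) {
            throw new IllegalArgumentException("average stored cells per row must be positive");
        }
        return Math.round(entries / cellsPerRow);
    }
}
```

With 1000 entries, 10 columns, and an average of 2 absent columns per row, this gives 1000 / 8 = 125 rows. Whether avgKeyLen, avgValueLen and fileSize can refine the figure is left open here, as in the comment above.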



