GitHub user wgtmac commented on a diff in the pull request:
https://github.com/apache/orc/pull/292#discussion_r205340059
--- Diff: java/core/src/java/org/apache/orc/impl/ColumnStatisticsImpl.java
---
@@ -584,16 +642,40 @@ public void merge(ColumnStatisticsImpl other) {
if (str.minimum != null) {
maximum = new Text(str.getMaximum());
minimum = new Text(str.getMinimum());
- } else {
+ }
+ /* str.minimum == null when lower bound set */
+ else if (str.getLowerBound() != null) {
+ minimum = new Text(str.getLowerBound());
+ isLowerBoundSet = true;
+
+ /* check for upper bound before setting max */
+ if (str.getUpperBound() != null) {
+ maximum = new Text(str.getUpperBound());
+ isUpperBoundSet = true;
+ } else {
+ maximum = new Text(str.getMaximum());
+ }
+ }
+ else {
/* both are empty */
maximum = minimum = null;
}
} else if (str.minimum != null) {
if (minimum.compareTo(str.minimum) > 0) {
- minimum = new Text(str.getMinimum());
+ if(str.getLowerBound() != null) {
+ minimum = new Text(str.getLowerBound());
+ isLowerBoundSet = true;
+ } else {
+ minimum = new Text(str.getMinimum());
--- End diff --
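If we have found a new minimum without truncation, should we reset
isLowerBoundSet to false? Otherwise a flag left over from an earlier
merge would still mark the exact minimum as a truncated lower bound.
The same applies to isUpperBoundSet below.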
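
To illustrate the question, here is a minimal sketch of the behavior I have in mind. It is not the actual ORC code: `BoundMergeSketch` and `mergeMinimum` are hypothetical names, and plain `String` stands in for `Text` so the example is self-contained. The idea is that the flag always follows whichever value wins the comparison.

```java
// Hypothetical stand-in for the string statistics state; not the ORC class.
final class BoundMergeSketch {
    String minimum;           // current minimum (exact value or truncated lower bound)
    boolean isLowerBoundSet;  // true only if `minimum` is a truncated lower bound

    BoundMergeSketch(String minimum, boolean isLowerBoundSet) {
        this.minimum = minimum;
        this.isLowerBoundSet = isLowerBoundSet;
    }

    // Merge the other side's minimum into this one.
    // Assumption: the other side supplies either an exact minimum or a
    // truncated lower bound; if both are given, the lower bound is used.
    void mergeMinimum(String otherMinimum, String otherLowerBound) {
        String candidate = otherLowerBound != null ? otherLowerBound : otherMinimum;
        if (candidate == null) {
            return; // nothing to merge
        }
        if (minimum == null || minimum.compareTo(candidate) > 0) {
            minimum = candidate;
            // Set the flag for a truncated bound, reset it for an exact value.
            isLowerBoundSet = (otherLowerBound != null);
        }
    }
}
```

With this shape, merging an exact "aaa" into a state holding a truncated "bbb" leaves the minimum at "aaa" with isLowerBoundSet cleared. The upper bound would be handled symmetrically.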
---