rdsr commented on a change in pull request #203: Support multiple partitions derived from the same field
URL: https://github.com/apache/incubator-iceberg/pull/203#discussion_r291312429
 
 

 ##########
 File path: api/src/main/java/org/apache/iceberg/expressions/Projections.java
 ##########
 @@ -205,13 +206,25 @@ private InclusiveProjection(PartitionSpec spec, boolean caseSensitive) {
     @Override
     @SuppressWarnings("unchecked")
     public <T> Expression predicate(BoundPredicate<T> pred) {
-      PartitionField part = spec().getFieldBySourceId(pred.ref().fieldId());
-      if (part == null) {
+      Collection<PartitionField> parts = spec().getFieldsBySourceId(pred.ref().fieldId());
+      if (parts == null) {
         // the predicate has no partition column
         return alwaysTrue();
       }
 
-      UnboundPredicate<?> result = ((Transform<T, ?>) part.transform()).project(part.name(), pred);
+      Expression result = Expressions.alwaysTrue();
+      for (PartitionField part : parts) {
+        // consider (d = 2019-01-01) with bucket(7, d) and bucket(5, d)
+        // projections: b1 = bucket(7, '2019-01-01') = 5, b2 = bucket(5, '2019-01-01') = 0
+        // any value where b1 != 5 or any value where b2 != 0 cannot be the '2019-01-01'
+        //
+        // similarly, if partitioning by date(ts) and date_hour(ts), the more restrictive
+        // projection should be used. ts = 2019-01-01T01:00:00 produces date=2019-01-01 and
+        // hour=2019-01-01-01. the value will be in 2019-01-01-01 and not in 2019-01-01-02.
+        result = Expressions.and(
+            result,
+            ((Transform<T, ?>) part.transform()).project(part.name(), pred));
+      }
 
       if (result != null) {
 
 Review comment:
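   Is this null check still valid? `result` is now initialized to `Expressions.alwaysTrue()` and only reassigned through `Expressions.and(...)`, so it does not look like it can be null at this point.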
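
   To illustrate the question, here is a minimal, self-contained sketch of the accumulation pattern used in the loop above. It is not Iceberg code: `ProjectionSketch`, its `project` method, and the modulo "bucket" functions are hypothetical stand-ins, and `java.util.function.Predicate` is used in place of `Expression`. The point is only that an accumulator seeded with an always-true value and folded with AND stays non-null, even when no partition field matches.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;
import java.util.function.Predicate;

public class ProjectionSketch {
  // Hypothetical stand-in for the loop in the diff: each entry maps a partition
  // field name to a transform of the same source column. Not Iceberg's API.
  static Predicate<Map<String, Integer>> project(
      Map<String, IntUnaryOperator> fields, int sourceValue) {
    // Seeded with "always true", so the result is non-null even with no fields.
    Predicate<Map<String, Integer>> result = partition -> true;
    for (Map.Entry<String, IntUnaryOperator> field : fields.entrySet()) {
      String name = field.getKey();
      int expected = field.getValue().applyAsInt(sourceValue);
      // AND in one equality predicate per partition field.
      result = result.and(partition -> partition.get(name) == expected);
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, IntUnaryOperator> fields = new LinkedHashMap<>();
    fields.put("b1", v -> v % 7); // toy bucket function, not Iceberg's hash
    fields.put("b2", v -> v % 5);

    // 40 % 7 == 5 and 40 % 5 == 0, mirroring b1 = 5, b2 = 0 from the comment.
    Predicate<Map<String, Integer>> filter = project(fields, 40);

    Map<String, Integer> partition = new LinkedHashMap<>();
    partition.put("b1", 5);
    partition.put("b2", 0);
    System.out.println(filter.test(partition)); // prints true
  }
}
```

   In this sketch a partition is kept only if every derived field agrees with the projected value, and an empty field map yields the always-true filter rather than null. Whether the same holds in the PR depends on whether `project(...)` itself can return null, which the sketch does not model.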

