rdblue commented on a change in pull request #203: Support multiple partitions 
derived from the same field
URL: https://github.com/apache/incubator-iceberg/pull/203#discussion_r292596202
 
 

 ##########
 File path: api/src/main/java/org/apache/iceberg/expressions/Projections.java
 ##########
 @@ -205,13 +206,25 @@ private InclusiveProjection(PartitionSpec spec, boolean caseSensitive) {
     @Override
     @SuppressWarnings("unchecked")
     public <T> Expression predicate(BoundPredicate<T> pred) {
-      PartitionField part = spec().getFieldBySourceId(pred.ref().fieldId());
-      if (part == null) {
 +      Collection<PartitionField> parts = spec().getFieldsBySourceId(pred.ref().fieldId());
+      if (parts == null) {
         // the predicate has no partition column
         return alwaysTrue();
       }
 
 -      UnboundPredicate<?> result = ((Transform<T, ?>) part.transform()).project(part.name(), pred);
+      Expression result = Expressions.alwaysTrue();
+      for (PartitionField part : parts) {
+        // consider (d = 2019-01-01) with bucket(7, d) and bucket(5, d)
 +        // projections: b1 = bucket(7, '2019-01-01') = 5, b2 = bucket(5, '2019-01-01') = 0
 +        // any value where b1 != 5 or any value where b2 != 0 cannot be the '2019-01-01'
+        //
 +        // similarly, if partitioning by date(ts) and date_hour(ts), the more restrictive
 +        // projection should be used. ts = 2019-01-01T01:00:00 produces date=2019-01-01 and
 +        // hour=2019-01-01-01. the value can will be in 2019-01-01-01 and not in 2019-01-01-02.
 
 Review comment:
   Fixed the typo.
   
   This case is actually not allowed because it uses two time partition 
functions. This is really to document the rationale behind using `and` instead 
of `or`. Having a unit test for it doesn't make much sense because it would 
only test that this is using what it says it will use.
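
   To make the `and` versus `or` rationale concrete, here is a minimal, self-contained sketch. It does not use the Iceberg API: `bucket` is a hypothetical stand-in (a plain modulo, not Iceberg's hash-based bucket transform), and the class and method names are made up for illustration. It enumerates every (b7, b5) partition for a value bucketed two ways and counts how many partitions each combination keeps.

```java
public class InclusiveProjectionSketch {
  // Hypothetical stand-in for a bucket transform; Iceberg's real transform hashes the value.
  static int bucket(int numBuckets, int value) {
    return Math.floorMod(value, numBuckets);
  }

  // AND: a partition can contain the target only if every projected predicate holds.
  static boolean matchesAnd(int b7, int b5, int target) {
    return b7 == bucket(7, target) && b5 == bucket(5, target);
  }

  // OR: still never drops the right partition, but keeps many that cannot match.
  static boolean matchesOr(int b7, int b5, int target) {
    return b7 == bucket(7, target) || b5 == bucket(5, target);
  }

  // Count how many of the 7 * 5 possible partitions survive pruning.
  static int countKept(boolean useAnd, int target) {
    int kept = 0;
    for (int b7 = 0; b7 < 7; b7++) {
      for (int b5 = 0; b5 < 5; b5++) {
        boolean keep = useAnd ? matchesAnd(b7, b5, target) : matchesOr(b7, b5, target);
        if (keep) {
          kept++;
        }
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    int target = 12;
    System.out.println("partitions kept with and: " + countKept(true, target));
    System.out.println("partitions kept with or:  " + countKept(false, target));
  }
}
```

   Under these assumptions, both combinations keep the partition that actually holds the value, so both are valid inclusive projections. `and` is simply the tighter of the two, which is the point the code comment is documenting.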

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
