[GitHub] [iceberg] szehon-ho commented on pull request #4662: Core: Metadata table queries fail if a partition column was reused in V2

GitBox Fri, 29 Apr 2022 19:07:18 -0700


szehon-ho commented on PR #4662:
URL: https://github.com/apache/iceberg/pull/4662#issuecomment-1113891669


   I still need to ramp up a bit on this, but I thought that in that discussion 
we agreed to collapse all instances of the same partition field (if some are 
reused), with each one uniquely identified by spec ID?
   
   
   example for reference:
   ```
   PartitionSpec initialSpec = PartitionSpec.builderFor(SCHEMA)
       .identity("data")
       .build();
   TestTables.TestTable table = TestTables.create(
       tableDir, "test", SCHEMA, initialSpec, V2_FORMAT_VERSION);
   
   table.updateSpec()
       .removeField("data")
       .commit();
   
   table.updateSpec()
       .addField("data")
       .addField("id")
       .commit();
   
   struct<1000: data: optional string, 1001: data: optional string, 1002: id: optional int>
   ```
   
   I think the code does try to reuse old partition values in both metadata 
filtering (ManifestEvaluator) and data filtering, right? (Let me know if that 
assumption is incorrect.) In this example, we would hopefully use the "data" 
partition values written under the table's initial spec for queries made after 
re-adding "data" to the spec. So it may make sense to collapse the fields 
together in the metadata table? I welcome any thoughts from @rdblue and 
@aokolnychyi, who have more background on the initial discussion.
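
To make the "collapse" idea concrete, here is a rough sketch in plain Java. It does not use Iceberg's API: the `Field` class, the `collapse` helper, and the source IDs are all hypothetical, and keeping the first field ID seen for each (source, transform, name) combination is only one possible choice, not necessarily what the PR should do.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CollapsePartitionFields {
  // Hypothetical stand-in for a partition field; not Iceberg's PartitionField.
  static final class Field {
    final int fieldId;
    final int sourceId;
    final String transform;
    final String name;

    Field(int fieldId, int sourceId, String transform, String name) {
      this.fieldId = fieldId;
      this.sourceId = sourceId;
      this.transform = transform;
      this.name = name;
    }
  }

  // Keeps the first field seen for each (sourceId, transform, name) key,
  // preserving the order in which the fields were encountered.
  static List<Field> collapse(List<Field> fields) {
    Map<String, Field> byKey = new LinkedHashMap<>();
    for (Field f : fields) {
      String key = f.sourceId + "|" + f.transform + "|" + f.name;
      byKey.putIfAbsent(key, f);
    }
    return new ArrayList<>(byKey.values());
  }

  public static void main(String[] args) {
    // Mirrors the struct above; the source IDs (2 and 1) are made up.
    List<Field> all = List.of(
        new Field(1000, 2, "identity", "data"),
        new Field(1001, 2, "identity", "data"),
        new Field(1002, 1, "identity", "id"));
    for (Field f : collapse(all)) {
      System.out.println(f.fieldId + ": " + f.name);
    }
  }
}
```

With the three fields from the example, this would leave one "data" field and one "id" field, instead of two fields that share the name "data".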


