[GitHub] [incubator-iceberg] rdblue commented on issue #280: Add persistent IDs to partition fields

2019-08-30 Thread GitBox
rdblue commented on issue #280: Add persistent IDs to partition fields
URL: 
https://github.com/apache/incubator-iceberg/issues/280#issuecomment-526704926
 
 
   @manishmalhotrawork, one strange thing about your test is that `data_bucket` 
has a different ID. It should continue to use ID 1000 because the field hasn't 
changed. Either the assignment logic or the evolution logic (similar to how 
`SchemaUpdate` works) should detect that the column has not changed and keep 
its existing ID.
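
A minimal sketch of the expected behavior, not the actual Iceberg implementation. The class `PartitionFieldIdAssigner` and its method `idFor` are hypothetical, and it assumes that a partition field counts as "unchanged" when its source column ID and transform are the same. The source column IDs 1 and 2 are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Iceberg API.
final class PartitionFieldIdAssigner {
  // Assumption: a field is identified by "<sourceId>:<transform>".
  private final Map<String, Integer> assigned = new HashMap<>();
  // Partition field IDs start at 1000, so the last assigned ID starts at 999.
  private int lastAssignedId = 999;

  // Returns the existing ID for a known field, or assigns the next free ID.
  int idFor(int sourceId, String transform) {
    String key = sourceId + ":" + transform;
    Integer existing = assigned.get(key);
    if (existing != null) {
      return existing;
    }
    lastAssignedId += 1;
    assigned.put(key, lastAssignedId);
    return lastAssignedId;
  }

  public static void main(String[] args) {
    PartitionFieldIdAssigner assigner = new PartitionFieldIdAssigner();
    // Original spec: bucket on the data column (hypothetical source ID 2).
    System.out.println(assigner.idFor(2, "bucket[16]")); // 1000
    // Evolved spec: an identity field is added, data_bucket is unchanged.
    System.out.println(assigner.idFor(1, "identity"));   // 1001
    System.out.println(assigner.idFor(2, "bucket[16]")); // 1000
  }
}
```

Because the lookup is keyed on the field's definition rather than its position, `data_bucket` keeps ID 1000 even if the evolved spec lists it after a new field.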
   
   Without seeing more of what's happening, I'm not really able to tell why 
you're getting that error. Can you open a WIP pull request?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] rdblue commented on issue #280: Add persistent IDs to partition fields

2019-08-27 Thread GitBox
rdblue commented on issue #280: Add persistent IDs to partition fields
URL: 
https://github.com/apache/incubator-iceberg/issues/280#issuecomment-525451795
 
 
   > And also as TableMetadata knows how many fields are in partition, so can 
maintain the nextIDValue as well.
   
   The next partition field ID is the highest field ID across all of the 
table's partition specs, plus one. Under that rule, once a partition spec is 
removed, we can reuse its IDs. Alternatively, we can keep track of the last 
assigned ID, like we do for the table schema.
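
The two options can be compared with a small sketch. The class `NextPartitionFieldId` and its methods are hypothetical and are not part of the Iceberg API. Each partition spec is reduced to a plain list of its field IDs, and the starting value of 1000 is taken from the rest of this thread.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the Iceberg API.
final class NextPartitionFieldId {
  // Assumption taken from the thread: partition field IDs start at 1000.
  static final int FIRST_ID = 1000;

  private NextPartitionFieldId() {}

  // Option 1: highest field ID across all current specs, plus one.
  // Each inner list holds the field IDs used by one partition spec.
  static int fromSpecs(List<List<Integer>> specFieldIds) {
    int highest = FIRST_ID - 1;
    for (List<Integer> spec : specFieldIds) {
      for (int id : spec) {
        highest = Math.max(highest, id);
      }
    }
    return highest + 1;
  }

  // Option 2: a stored counter that only ever increases.
  static int fromLastAssigned(int lastAssignedId) {
    return lastAssignedId + 1;
  }

  public static void main(String[] args) {
    List<List<Integer>> specs = Arrays.asList(
        Arrays.asList(1000, 1001),
        Arrays.asList(1000, 1002));
    System.out.println(fromSpecs(specs));               // 1003
    // With the second spec removed, option 1 hands out 1002 again.
    System.out.println(fromSpecs(specs.subList(0, 1))); // 1002
    // Option 2 still remembers that 1002 was used.
    System.out.println(fromLastAssigned(1002));         // 1003
  }
}
```

The trade-off is that option 1 needs no extra state but can reuse an ID after a spec is dropped, while option 2 needs one more stored value and never reuses an ID.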
   
   > Also the TableMetadata#updatePartitionSpec should also use nextIDValue to 
pass to PartitionSpec.
   
   I think the spec's IDs will be assigned by the time that method is called 
because the partition spec passed in is already created.





[GitHub] [incubator-iceberg] rdblue commented on issue #280: Add persistent IDs to partition fields

2019-08-26 Thread GitBox
rdblue commented on issue #280: Add persistent IDs to partition fields
URL: 
https://github.com/apache/incubator-iceberg/issues/280#issuecomment-524938283
 
 
   @manishmalhotrawork, those IDs have different contexts. The source ID in a 
partition field is the ID of the source data column in the table schema. The ID 
added by partitionType is the ID in the manifest file schema.
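
To illustrate the two contexts, here is a small sketch. The class `PartitionFieldSketch`, its field names, and the example table schema are all hypothetical. It only shows that the source ID is looked up in the table schema, while the other ID belongs to the partition struct written in manifest files.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model for illustration; not the Iceberg PartitionField class.
final class PartitionFieldSketch {
  final int sourceId; // ID of the source data column in the table schema
  final int fieldId;  // ID of this field in the manifest file schema
  final String name;

  PartitionFieldSketch(int sourceId, int fieldId, String name) {
    this.sourceId = sourceId;
    this.fieldId = fieldId;
    this.name = name;
  }

  // Resolves the source column name using the table schema (ID -> name).
  String sourceColumn(Map<Integer, String> tableSchema) {
    return tableSchema.get(sourceId);
  }

  public static void main(String[] args) {
    // Made-up table schema: column ID -> column name.
    Map<Integer, String> tableSchema = new HashMap<>();
    tableSchema.put(1, "id");
    tableSchema.put(2, "data");

    PartitionFieldSketch field = new PartitionFieldSketch(2, 1000, "data_bucket");
    System.out.println(field.sourceColumn(tableSchema)); // data
    System.out.println(field.fieldId);                   // 1000
  }
}
```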





[GitHub] [incubator-iceberg] rdblue commented on issue #280: Add persistent IDs to partition fields

2019-08-24 Thread GitBox
rdblue commented on issue #280: Add persistent IDs to partition fields
URL: 
https://github.com/apache/incubator-iceberg/issues/280#issuecomment-524579846
 
 
   @manishmalhotrawork, we need to keep track of IDs that have been assigned to 
partition fields in a table and reuse them when partition specs change. They 
should probably continue to start at 1000.
   
   @timmylicheng, Schema field IDs are the integers passed in when creating 
struct fields, maps, and lists. See http://iceberg.apache.org/api/#nested-types





[GitHub] [incubator-iceberg] rdblue commented on issue #280: Add persistent IDs to partition fields

2019-07-25 Thread GitBox
rdblue commented on issue #280: Add persistent IDs to partition fields
URL: 
https://github.com/apache/incubator-iceberg/issues/280#issuecomment-515096818
 
 
   @timmylicheng, sorry for the confusion. The partition spec ID and the IDs 
I'm talking about here aren't the same thing. Partition specs are assigned IDs 
so that we can write manifest files that reference those specs. I think you're 
talking about those IDs.
   
   The IDs I'm talking about here are schema field IDs that are used to write 
the record of partition data in the manifest file. Right now, those IDs are 
assigned each time a manifest file is created, starting at 1000. Instead, we 
should use persistent IDs for each unique partition field and keep track of the 
last assigned ID for partition evolution.
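
A sketch of why reassigning IDs for every manifest is a problem. It assumes the IDs are handed out by position within the spec, starting at 1000. The comment above only says that assignment restarts at 1000 each time, so the positional rule is an assumption, and the class `PositionalPartitionIds` is hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Iceberg API.
final class PositionalPartitionIds {
  private PositionalPartitionIds() {}

  // Assumed rule: the i-th partition field of a spec gets ID 1000 + i.
  static Map<String, Integer> assign(List<String> partitionFieldNames) {
    Map<String, Integer> ids = new LinkedHashMap<>();
    for (int i = 0; i < partitionFieldNames.size(); i++) {
      ids.put(partitionFieldNames.get(i), 1000 + i);
    }
    return ids;
  }

  public static void main(String[] args) {
    // Manifest written with the original spec.
    System.out.println(assign(Arrays.asList("data_bucket")));
    // prints {data_bucket=1000}

    // Manifest written after a field was added in front of data_bucket.
    System.out.println(assign(Arrays.asList("id", "data_bucket")));
    // prints {id=1000, data_bucket=1001}
  }
}
```

Under this rule the same partition field ends up with a different ID in different manifests, and ID 1000 refers to two different fields. Persistent per-field IDs avoid both issues.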





[GitHub] [incubator-iceberg] rdblue commented on issue #280: Add persistent IDs to partition fields

2019-07-24 Thread GitBox
rdblue commented on issue #280: Add persistent IDs to partition fields
URL: 
https://github.com/apache/incubator-iceberg/issues/280#issuecomment-514712278
 
 
   Yes. If a table has multiple partition specs, it will probably have multiple 
manifests, each written with one of those specs.

