rdblue opened a new pull request #627: Update indexing to handle nested lists
URL: https://github.com/apache/incubator-iceberg/pull/627
 
 
   This updates how `Schema` fields are indexed.
   
   Previously, the visit method maintained a list of parent field names in each 
visitor, and the `IndexByName` visitor used these names to create qualified 
field names. But the visitor did not add "element", "key", and "value" to the 
parent names, which produced duplicate names when indexing, for example, a list 
of lists: the outer and inner elements were both indexed under the same 
"element" name.
   
   The purpose of omitting "element", "key", and "value" from parent field 
names was to avoid forcing users to handle unnamed structures. For example, 
leaving out "element" from names for `points: list<struct<x double, y double>>` 
results in fields "points.x" and "points.y" (and the nested struct as 
"points.element") instead of "points.element.x" and "points.element.y".
   
   This changes how element and value names are skipped: the names are now 
skipped only if the map value or list element is a nested struct. That way, a 
list of lists will correctly add an "element" level when processing the outer 
list's element. Indexing now also always includes "key" in qualified names, so 
that key and value fields cannot conflict.
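
   As a rough illustration of the naming rule described above, here is a 
minimal standalone sketch. It is not the code in this PR: the 
`NameIndexSketch` class and its tiny type model are hypothetical stand-ins 
for the real schema and visitor classes, and only show when "element", 
"key", and "value" become part of a qualified name.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical, simplified model for illustration; not Iceberg's classes.
class NameIndexSketch {
  interface Type {}

  static final class Primitive implements Type {}

  static final class Field {
    final String name;
    final Type type;
    Field(String name, Type type) { this.name = name; this.type = type; }
  }

  static final class Struct implements Type {
    final List<Field> fields;
    Struct(Field... fields) { this.fields = Arrays.asList(fields); }
  }

  static final class ListType implements Type {
    final Type element;
    ListType(Type element) { this.element = element; }
  }

  static final class MapType implements Type {
    final Type key;
    final Type value;
    MapType(Type key, Type value) { this.key = key; this.value = value; }
  }

  // Parent names live in the indexer itself, not in a shared visitor base.
  private final Deque<String> parents = new ArrayDeque<>();
  private final List<String> names = new ArrayList<>();

  // Returns every qualified name in visit order; throws on a duplicate.
  static List<String> index(Struct schema) {
    NameIndexSketch indexer = new NameIndexSketch();
    indexer.visit(schema);
    return indexer.names;
  }

  private void visit(Type type) {
    if (type instanceof Struct) {
      for (Field field : ((Struct) type).fields) {
        add(field.name);
        parents.addLast(field.name);
        visit(field.type);
        parents.removeLast();
      }
    } else if (type instanceof ListType) {
      // "element" is skipped as a parent only when the element is a struct.
      visitChild("element", ((ListType) type).element, true);
    } else if (type instanceof MapType) {
      // "key" is always kept as a parent; "value" follows the element rule.
      visitChild("key", ((MapType) type).key, false);
      visitChild("value", ((MapType) type).value, true);
    }
  }

  private void visitChild(String name, Type child, boolean skipIfStruct) {
    add(name);
    boolean push = !(skipIfStruct && child instanceof Struct);
    if (push) {
      parents.addLast(name);
    }
    visit(child);
    if (push) {
      parents.removeLast();
    }
  }

  private void add(String name) {
    String qualified =
        parents.isEmpty() ? name : String.join(".", parents) + "." + name;
    if (names.contains(qualified)) {
      throw new IllegalStateException("Duplicate name: " + qualified);
    }
    names.add(qualified);
  }
}
```

   With this sketch, a schema containing `points: list<struct<x, y>>` and 
`matrix: list<list<double>>` indexes as "points", "points.element", 
"points.x", "points.y", "matrix", "matrix.element", and 
"matrix.element.element", with no duplicates.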
   
   This also moves the parent field names into `IndexByName` because they are 
only used in that class.

