[
https://issues.apache.org/jira/browse/DRILL-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297684#comment-16297684
]
Paul Rogers commented on DRILL-6046:
------------------------------------
Suggested improvements. First, to ensure that the metadata tree remains
consistent:
* The materialized field passed to the constructor is the one used for the
vector.
* The materialized field held by the vector is final: its contents can change,
but the field object itself cannot be replaced.
To ensure consistent vector creation:
* Every vector constructor should build itself as defined by the passed-in
materialized field.
To avoid clutter:
* Every vector includes its internal fields and public child fields in the list
of children.
* Add a field to mark a materialized field as private. Private fields are not
compared when checking if two fields are equal. Private fields are ignored when
building a new vector.
* Provide a method on materialized field to get the public schema (without
internal vectors).
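The rules above might be sketched roughly as follows. This is only an
illustration, not Drill code: {{SketchField}} and {{SketchVector}} are
hypothetical stand-ins for {{MaterializedField}} and a container vector, and
names such as {{isPrivate}}, {{publicSchema}} and {{isEquivalent}} are
assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for MaterializedField (not Drill's actual class).
final class SketchField {
  private final String name;
  private final boolean isPrivate;  // proposed marker for internal fields
  private final List<SketchField> children = new ArrayList<>();

  SketchField(String name, boolean isPrivate) {
    this.name = name;
    this.isPrivate = isPrivate;
  }

  String name() { return name; }
  boolean isPrivate() { return isPrivate; }

  // The field is mutated in place; holders of this instance see the change.
  void addChild(SketchField child) { children.add(child); }

  List<SketchField> children() { return Collections.unmodifiableList(children); }

  // Copy of this field with private (internal) children removed, recursively.
  SketchField publicSchema() {
    SketchField copy = new SketchField(name, isPrivate);
    for (SketchField child : children) {
      if (!child.isPrivate) {
        copy.addChild(child.publicSchema());
      }
    }
    return copy;
  }

  // Comparison by name and public children only; private children are skipped.
  boolean isEquivalent(SketchField other) {
    List<SketchField> mine = publicSchema().children;
    List<SketchField> theirs = other.publicSchema().children;
    if (!name.equals(other.name) || mine.size() != theirs.size()) {
      return false;
    }
    for (int i = 0; i < mine.size(); i++) {
      if (!mine.get(i).isEquivalent(theirs.get(i))) {
        return false;
      }
    }
    return true;
  }
}

// Hypothetical stand-in for a container vector such as a map.
final class SketchVector {
  private final SketchField field;  // final: never cloned, never replaced
  private final List<SketchVector> childVectors = new ArrayList<>();

  SketchVector(SketchField field) {
    this.field = field;  // use the passed-in instance itself
    // Build child vectors as the field defines them, ignoring private fields.
    for (SketchField child : field.children()) {
      if (!child.isPrivate()) {
        childVectors.add(new SketchVector(child));
      }
    }
  }

  SketchField field() { return field; }
  int childCount() { return childVectors.size(); }

  // Adds a public child by updating the shared field in place.
  SketchVector addChild(SketchField child) {
    field.addChild(child);
    SketchVector vector = new SketchVector(child);
    childVectors.add(vector);
    return vector;
  }
}

final class SketchDemo {
  public static void main(String[] args) {
    SketchField mapField = new SketchField("m", false);
    mapField.addChild(new SketchField("a", false));
    mapField.addChild(new SketchField("$offsets$", true));

    SketchVector map = new SketchVector(mapField);
    map.addChild(new SketchField("b", false));

    System.out.println(map.field() == mapField);
    System.out.println(mapField.publicSchema().children().size());
  }
}
```

Because the vector keeps the instance it was given, a parent that holds the
same field sees children added later, which is the consistency property the
first two rules are after.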
> Define semantics of vector metadata
> -----------------------------------
>
> Key: DRILL-6046
> URL: https://issues.apache.org/jira/browse/DRILL-6046
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.10.0
> Reporter: Paul Rogers
> Priority: Minor
>
> Vectors provide metadata in the form of the {{MaterializedField}}. This class
> has evolved in an ad-hoc fashion over time, resulting in inconsistent
> behavior across vectors. The inconsistent behavior causes bugs and slow
> development because each vector follows different rules. Consistent behavior
> would, by contrast, lead to faster development and fewer bugs by reducing the
> number of variations that code must handle.
> Issues include:
> * Map vectors, but not lists, can create contents given a list of children in
> the {{MaterializedField}} passed to the constructor.
> * {{MaterializedField}} appears to want to be immutable, but it does allow
> changing of children. Unions also want to change the list of subtypes, but
> that list is in the immutable {{MajorType}}, causing unions to rebuild and
> replace their {{MaterializedField}} on addition of a new type. By contrast,
> maps do not replace the field; they just add children.
> * Container vectors (maps, unions, lists) hold references to child
> {{MaterializedField}} objects. But, because unions replace their fields,
> parents become out of sync: they still point to the old version from before
> the update. The metadata becomes inconsistent, so code cannot trust it.
> * Lists and maps, but not unions, list their children in the field.
> * Nullable types, but not repeated types, include internal vectors in their
> list of children.
> * When creating a map, as discussed above, the map creates children based on
> the field. But, the constructor clones the field so that the actual field in
> the map is not the one passed in. As a result, a parent vector, which holds a
> child map, points to the original map field, not the cloned one, leading to
> inconsistency if the child map later adds more fields.