[ 
https://issues.apache.org/jira/browse/DRILL-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297684#comment-16297684
 ] 

Paul Rogers commented on DRILL-6046:
------------------------------------

Suggested improvements. First, to ensure that the metadata tree remains 
consistent:

* The materialized field passed to the constructor is the one used for the 
vector.
* The materialized field held by the vector is final: its contents can change, 
but the field object itself cannot be replaced.
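
As a rough illustration of these two rules, here is a minimal sketch. It 
assumes simplified stand-in classes ({{FieldSketch}}, {{MapVectorSketch}}) 
that are hypothetical and are not Drill's actual {{MaterializedField}} or map 
vector. The vector keeps the exact field instance it was given in a final 
reference and only mutates it in place, so a parent that holds the same 
instance never goes out of sync.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for MaterializedField: the name is fixed, the child list can grow.
final class FieldSketch {
  private final String name;
  private final List<FieldSketch> children = new ArrayList<>();

  FieldSketch(String name) { this.name = name; }

  String name() { return name; }

  void addChild(FieldSketch child) { children.add(child); }

  List<FieldSketch> children() { return Collections.unmodifiableList(children); }
}

// Hypothetical stand-in for a map vector.
final class MapVectorSketch {
  // final: the reference never changes, and it is the caller's instance, not a clone.
  private final FieldSketch field;

  MapVectorSketch(FieldSketch field) { this.field = field; }

  FieldSketch field() { return field; }

  // Adding a column mutates the existing field in place instead of replacing it.
  void addColumn(String name) { field.addChild(new FieldSketch(name)); }
}

class FieldIdentityDemo {
  public static void main(String[] args) {
    FieldSketch parent = new FieldSketch("row");
    FieldSketch mapField = new FieldSketch("m");
    parent.addChild(mapField);

    MapVectorSketch map = new MapVectorSketch(mapField);
    map.addColumn("a");

    // The parent and the vector hold the same object, so the parent sees the new column.
    System.out.println(parent.children().get(0) == map.field());
    System.out.println(parent.children().get(0).children().size());
  }
}
```

Because the reference is shared rather than cloned, no extra bookkeeping is 
needed to propagate schema changes up the tree.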

To ensure consistent vector creation:

* Every vector constructor should build itself as defined by the passed-in 
materialized field.
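
A sketch of what "build itself from the field" could look like, assuming 
hypothetical stand-ins ({{SchemaNode}}, {{VectorNode}}, {{KindSketch}}) 
rather than Drill's real classes. Every container type follows the same rule: 
walk the children listed in the passed-in field and create one child vector 
for each.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical type tags; Drill's real type system is richer than this.
enum KindSketch { INT, MAP, LIST }

// Hypothetical stand-in for a materialized field with a list of children.
final class SchemaNode {
  final String name;
  final KindSketch kind;
  final List<SchemaNode> children = new ArrayList<>();

  SchemaNode(String name, KindSketch kind) {
    this.name = name;
    this.kind = kind;
  }

  SchemaNode add(SchemaNode child) {
    children.add(child);
    return this;
  }
}

// Hypothetical stand-in for a vector. One rule for all types:
// the constructor creates a child vector for every child in the field.
final class VectorNode {
  final SchemaNode field;
  final List<VectorNode> childVectors = new ArrayList<>();

  VectorNode(SchemaNode field) {
    this.field = field;
    for (SchemaNode child : field.children) {
      childVectors.add(new VectorNode(child));
    }
  }

  // Counts this vector plus all nested vectors.
  int vectorCount() {
    int count = 1;
    for (VectorNode child : childVectors) {
      count += child.vectorCount();
    }
    return count;
  }
}

class BuildFromFieldDemo {
  public static void main(String[] args) {
    SchemaNode row = new SchemaNode("row", KindSketch.MAP)
        .add(new SchemaNode("a", KindSketch.INT))
        .add(new SchemaNode("l", KindSketch.LIST)
            .add(new SchemaNode("element", KindSketch.INT)));

    VectorNode vector = new VectorNode(row);
    // row + a + l + element
    System.out.println(vector.vectorCount());
  }
}
```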

To avoid clutter:

* Every vector includes its internal fields and public child fields in the list 
of children.
* Add a flag to mark a materialized field as private. Private fields are not 
compared when checking whether two fields are equal, and are ignored when 
building a new vector.
* Provide a method on materialized field to get the public schema (without 
internal vectors).
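
One possible shape for the private flag and the public-schema method, 
sketched with a hypothetical {{MetaField}} class. The class, its method names 
and the child names used in the example are assumptions for illustration, not 
Drill's existing API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical stand-in for MaterializedField with a "private" marker.
final class MetaField {
  private final String name;
  private final boolean isPrivate;
  private final List<MetaField> children = new ArrayList<>();

  MetaField(String name, boolean isPrivate) {
    this.name = name;
    this.isPrivate = isPrivate;
  }

  MetaField add(MetaField child) {
    children.add(child);
    return this;
  }

  // Number of children, internal ones included.
  int childCount() { return children.size(); }

  // Children visible to clients: everything not marked private.
  List<MetaField> publicChildren() {
    return children.stream().filter(c -> !c.isPrivate).collect(Collectors.toList());
  }

  // Copy of this field with all private descendants removed.
  MetaField publicSchema() {
    MetaField copy = new MetaField(name, false);
    for (MetaField child : publicChildren()) {
      copy.add(child.publicSchema());
    }
    return copy;
  }

  // Equality compares the name and the public children only.
  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof MetaField)) return false;
    MetaField other = (MetaField) o;
    return name.equals(other.name) && publicChildren().equals(other.publicChildren());
  }

  @Override
  public int hashCode() { return Objects.hash(name, publicChildren()); }
}

class PrivateFieldDemo {
  public static void main(String[] args) {
    // A nullable column that lists two internal vectors as private children.
    MetaField withInternals = new MetaField("a", false)
        .add(new MetaField("$bits$", true))
        .add(new MetaField("$values$", true));
    MetaField plain = new MetaField("a", false);

    System.out.println(withInternals.equals(plain));
    System.out.println(withInternals.publicSchema().childCount());
  }
}
```

With this arrangement the full child list stays available to code that needs 
the internal vectors, while schema comparison and client-facing code see only 
the public fields.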

> Define semantics of vector metadata
> -----------------------------------
>
>                 Key: DRILL-6046
>                 URL: https://issues.apache.org/jira/browse/DRILL-6046
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> Vectors provide metadata in the form of the {{MaterializedField}}. This class 
> has evolved in an ad-hoc fashion over time, resulting in inconsistent 
> behavior across vectors. The inconsistent behavior causes bugs and slow 
> development because each vector follows different rules. Consistent behavior 
> would, by contrast, lead to faster development and fewer bugs by reducing the 
> number of variations that code must handle.
> Issues include:
> * Map vectors, but not lists, can create contents given a list of children in 
> the {{MaterializedField}} passed to the constructor.
> * {{MaterializedField}} appears to want to be immutable, but it does allow 
> its children to change. Unions also want to change the list of subtypes, but 
> that list is in the immutable {{MajorType}}, so unions rebuild and replace 
> their {{MaterializedField}} when a new type is added. By contrast, maps do 
> not replace the field, they just add children.
> * Container vectors (maps, unions, lists) hold references to child 
> {{MaterializedField}} objects. But, because unions replace their fields, 
> parents become out of sync: they still point to the old version from before 
> the update. The metadata is then inconsistent, so code cannot trust it.
> * Lists and maps, but not unions, list their children in the field.
> * Nullable types, but not repeated types, include internal vectors in their 
> list of children. 
> * When creating a map, as discussed above, the map creates children based on 
> the field. But, the constructor clones the field so that the actual field in 
> the map is not the one passed in. As a result, a parent vector, which holds a 
> child map, points to the original map field, not the cloned one, leading to 
> inconsistency if the child map later adds more fields.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
