paul-rogers commented on a change in pull request #1383: DRILL-6613: Refactor MaterializedField
URL: https://github.com/apache/drill/pull/1383#discussion_r204166252
 
 

 ##########
 File path: exec/vector/src/main/java/org/apache/drill/exec/record/MaterializedField.java
 ##########
 @@ -49,39 +54,79 @@ private MaterializedField(String name, MajorType type, 
LinkedHashSet<Materialize
     this.children = children;
   }
 
+  private MaterializedField(String name, MajorType type, int size) {
+    this(name, type, new LinkedHashSet<>(size));
+  }
+
+  private <T> void copyFrom(Collection<T> source, Function<T, MaterializedField> transformation) {
+    Preconditions.checkState(children.isEmpty());
+    source.forEach(child -> children.add(transformation.apply(child)));
+  }
+
+  public static MaterializedField create(String name, MajorType type) {
+    return new MaterializedField(name, type, 0);
+  }
+
   public static MaterializedField create(SerializedField serField) {
-    LinkedHashSet<MaterializedField> children = new LinkedHashSet<>();
-    for (SerializedField sf : serField.getChildList()) {
-      children.add(MaterializedField.create(sf));
+    MaterializedField field = new MaterializedField(serField.getNamePart().getName(), serField.getMajorType(), serField.getChildCount());
+    if (OFFSETS_FIELD.equals(field)) {
+      return OFFSETS_FIELD;
     }
-    return new MaterializedField(serField.getNamePart().getName(), serField.getMajorType(), children);
+    field.copyFrom(serField.getChildList(), MaterializedField::create);
+    return field;
   }
 
-  /**
-   * Create and return a serialized field based on the current state.
-   */
-  public SerializedField getSerializedField() {
-    SerializedField.Builder serializedFieldBuilder = getAsBuilder();
-    for(MaterializedField childMaterializedField : getChildren()) {
-      serializedFieldBuilder.addChild(childMaterializedField.getSerializedField());
+  public MaterializedField copy() {
+    return copy(getName(), getType());
+  }
+
+  public MaterializedField copy(MajorType type) {
+    return copy(name, type);
+  }
+
+  public MaterializedField copy(String name) {
+    return copy(name, getType());
+  }
+
+  public MaterializedField copy(String name, final MajorType type) {
+    if (this == OFFSETS_FIELD) {
+      return this;
     }
-    return serializedFieldBuilder.build();
+    MaterializedField field = new MaterializedField(name, type, getChildren().size());
+    field.copyFrom(getChildren(), MaterializedField::copy);
 
 Review comment:
   My point is not how things are implemented. The point is: we almost never 
want to copy children when copying a `MaterializedField`. Why? Because the 
consumer of that copy may be code that creates a new vector. When it does, it 
will add child vectors (copies of the source vectors), and that action will add 
new child fields.
   
   This is why I said that "copy" is misleading: we need to understand the 
context in which we make the copy, and possibly name the method accordingly: 
"copyForNewVector", "fullCopy", "copyIfNeeded", and so on. All of these should 
have comments that explain the use case that they serve.
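
To make the naming suggestion concrete, here is a minimal sketch of two purpose-named copy methods. It is not the Drill code: `Field` is a hypothetical stand-in for `MaterializedField` (a name, a type label, and a list of children), and `copyForNewVector` and `fullCopy` are simply two of the names floated above, with the use case of each spelled out in its comment.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class FieldCopySketch {

  /** Hypothetical stand-in for MaterializedField: name, type label, child fields. */
  static final class Field {
    private final String name;
    private final String type;
    private final List<Field> children = new ArrayList<>();

    Field(String name, String type) {
      this.name = name;
      this.type = type;
    }

    String name() { return name; }
    String type() { return type; }
    List<Field> children() { return Collections.unmodifiableList(children); }
    void addChild(Field child) { children.add(child); }

    /**
     * Use case: the caller is about to build a new vector from this field.
     * Children are deliberately left out, because creating the child
     * vectors will add the matching child fields.
     */
    Field copyForNewVector() {
      return new Field(name, type);
    }

    /**
     * Use case: the caller needs an independent snapshot of the whole
     * schema subtree, so every descendant is copied as well.
     */
    Field fullCopy() {
      Field copy = new Field(name, type);
      for (Field child : children) {
        copy.addChild(child.fullCopy());
      }
      return copy;
    }
  }

  public static void main(String[] args) {
    Field source = new Field("m", "MAP");
    source.addChild(new Field("a", "INT"));
    source.addChild(new Field("b", "VARCHAR"));

    // Simulate vector creation: the consumer adds one child per source child.
    Field forVector = source.copyForNewVector();
    for (Field child : source.children()) {
      forVector.addChild(child.copyForNewVector());
    }

    System.out.println("copyForNewVector + added children: " + forVector.children().size());
    System.out.println("fullCopy children: " + source.fullCopy().children().size());
  }
}
```

In this sketch both paths end with two children. Had `copyForNewVector` also copied the children, the simulated vector-creation loop would have added them a second time, which is the hazard a generic `copy` name hides.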
   
   Sorry that this is so complex; it is just the way the code has evolved. As I 
recently found, it is quite hard to change entrenched code behavior, even when 
it is not quite right.
