[GitHub] drill issue #1106: DRILL-6129: Fixed query failure due to nested column data...

sachouche Thu, 01 Feb 2018 09:24:16 -0800

Github user sachouche commented on the issue:

    https://github.com/apache/drill/pull/1106
  
    Thanks @paul-rogers for the information; I went through the PR and noticed 
the following:
    
    - BatchSchema invokes MaterializedField.isEquivalent()
    - With my fix, both RecordBatchLoader isSame(...) and 
MaterializedField.isEquivalent() consider nested columns, but they still differ 
in several ways:
    
    1) RecordBatchLoader requires sameness because this information is used to 
reuse the value vectors; if the old and new batches are deemed the same, the 
value vectors are reloaded using the load(...) API. The metadata must therefore 
match, or a runtime exception will occur
    
    2) RecordBatchLoader isSame(...) API compares two different kinds of Java 
objects: a SerializedField (obtained from protobufs) and the MaterializedField 
of an already materialized value vector
    
    3) RecordBatchLoader isSame(...) API tolerates unordered fields (within the 
same level), whereas the MaterializedField.isEquivalent() method does not
    
    4) MaterializedField.isEquivalent() ignores hidden columns such as "$bits" 
and "$offsets", whereas RecordBatchLoader isSame(...) does not
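
    To make points 3 and 4 concrete, here is a minimal sketch of the two 
comparison styles. It is not Drill's code: the Field class, both method names, 
and the rule that a name starting with "$" marks a hidden column are 
assumptions made only for illustration, and the sketch assumes field names are 
unique within a level.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SchemaCompareSketch {

    // Hypothetical stand-in for a schema field; NOT Drill's
    // MaterializedField or SerializedField.
    static final class Field {
        final String name;
        final String type;
        final List<Field> children;

        Field(String name, String type, Field... children) {
            this.name = name;
            this.type = type;
            this.children = Arrays.asList(children);
        }
    }

    // isSame-like (assumed behavior): children may appear in any order
    // within a level, but hidden children count toward the comparison.
    static boolean sameUnordered(Field a, Field b) {
        if (!a.name.equals(b.name) || !a.type.equals(b.type)) return false;
        if (a.children.size() != b.children.size()) return false;
        Map<String, Field> byName = new HashMap<>();
        for (Field c : b.children) byName.put(c.name, c);
        for (Field c : a.children) {
            // remove() so that each child of b is matched at most once
            Field other = byName.remove(c.name);
            if (other == null || !sameUnordered(c, other)) return false;
        }
        return true;
    }

    // isEquivalent-like (assumed behavior): children must appear in the
    // same order, but hidden children are skipped on both sides.
    static boolean equivalentOrdered(Field a, Field b) {
        if (!a.name.equals(b.name) || !a.type.equals(b.type)) return false;
        List<Field> ac = visible(a.children);
        List<Field> bc = visible(b.children);
        if (ac.size() != bc.size()) return false;
        for (int i = 0; i < ac.size(); i++) {
            if (!equivalentOrdered(ac.get(i), bc.get(i))) return false;
        }
        return true;
    }

    // Assumption for this sketch: a leading "$" marks a hidden column.
    private static List<Field> visible(List<Field> fields) {
        List<Field> out = new ArrayList<>();
        for (Field f : fields) {
            if (!f.name.startsWith("$")) out.add(f);
        }
        return out;
    }
}
```

    With these definitions, a map whose children are (x, y) and one whose 
children are (y, x) pass sameUnordered but fail equivalentOrdered, while a 
field with a "$offsets" child and the same field without it pass 
equivalentOrdered but fail sameUnordered. The two checks can therefore 
disagree in both directions on the same pair of schemas.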
    
    I think that, moving forward, the best way to prevent bugs related to schema 
changes is to maintain a document that establishes all the rules. This will 
allow QA to refine their tests and catch current limitations.
