Github user sachouche commented on the issue:
https://github.com/apache/drill/pull/1106
Thanks @paul-rogers for the information; I went through the PR and noticed
the following:
- BatchSchema invokes MaterializedField.isEquivalent()
- With my fix, both RecordBatchLoader isSame(...) and
MaterializedField.isEquivalent() consider nested columns, but they differ in
several ways:
1) RecordBatchLoader requires sameness because this information is used to
reuse the value vectors; if the old and new batches are deemed the same, the
value vectors are reloaded using the load(...) API. The metadata must match, or
a runtime exception will occur
2) The RecordBatchLoader isSame(...) API compares two different Java object
types: SerializedField (obtained from protobufs) and the MaterializedField of
already materialized value vectors
3) The RecordBatchLoader isSame(...) API tolerates unordered fields (within the
same level), but the MaterializedField.isEquivalent() method does not
4) MaterializedField.isEquivalent() ignores hidden columns such as "$bits" and
"$offsets", but RecordBatchLoader isSame(...) does not
I think that, moving forward, the best way to prevent bugs related to schema
changes is to maintain a document that establishes all the rules. This will
allow QA to refine their tests and catch current limitations.
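As a starting point for such a document, the ordering and hidden-column rules
from points 3 and 4 can be sketched as two small comparison functions. This is
only an illustration under stated assumptions: Field, sameUnordered and
equivalentOrdered are hypothetical names, not Drill classes or methods, and the
hidden-column names are taken from the text above rather than from the Drill
source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SchemaCompareSketch {
    // Hypothetical stand-in for column metadata; not Drill's MaterializedField.
    static final class Field {
        final String name;
        final String type;
        final List<Field> children;

        Field(String name, String type, Field... children) {
            this.name = name;
            this.type = type;
            this.children = Arrays.asList(children);
        }
    }

    // Assumption: hidden column names as quoted in the comment above.
    static final Set<String> HIDDEN = new HashSet<>(Arrays.asList("$bits", "$offsets"));

    // isSame-like rule: child order is ignored, hidden children are compared.
    // Assumes child names are unique within one level.
    static boolean sameUnordered(Field a, Field b) {
        if (!a.name.equals(b.name) || !a.type.equals(b.type)) return false;
        if (a.children.size() != b.children.size()) return false;
        Map<String, Field> byName = new HashMap<>();
        for (Field c : b.children) byName.put(c.name, c);
        for (Field c : a.children) {
            Field other = byName.remove(c.name);
            if (other == null || !sameUnordered(c, other)) return false;
        }
        return true;
    }

    // isEquivalent-like rule: child order matters, hidden children are skipped.
    static boolean equivalentOrdered(Field a, Field b) {
        if (!a.name.equals(b.name) || !a.type.equals(b.type)) return false;
        List<Field> ac = visible(a.children);
        List<Field> bc = visible(b.children);
        if (ac.size() != bc.size()) return false;
        for (int i = 0; i < ac.size(); i++) {
            if (!equivalentOrdered(ac.get(i), bc.get(i))) return false;
        }
        return true;
    }

    // Drops children whose names are in the HIDDEN set.
    static List<Field> visible(List<Field> fields) {
        List<Field> out = new ArrayList<>();
        for (Field f : fields) {
            if (!HIDDEN.contains(f.name)) out.add(f);
        }
        return out;
    }

    public static void main(String[] args) {
        Field ab = new Field("m", "MAP", new Field("a", "INT"), new Field("b", "VARCHAR"));
        Field ba = new Field("m", "MAP", new Field("b", "VARCHAR"), new Field("a", "INT"));
        Field plain = new Field("v", "VARCHAR");
        Field withOffsets = new Field("v", "VARCHAR", new Field("$offsets", "UINT4"));

        System.out.println(sameUnordered(ab, ba));                 // reordered children
        System.out.println(equivalentOrdered(ab, ba));
        System.out.println(sameUnordered(plain, withOffsets));     // hidden child on one side
        System.out.println(equivalentOrdered(plain, withOffsets));
    }
}
```

The two functions disagree on both inputs in main, which is the kind of case a
rules document would need to settle explicitly.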
---