Github user paul-rogers commented on a diff in the pull request:
https://github.com/apache/drill/pull/1125#discussion_r169857960
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchSizer.java
---
@@ -418,11 +438,13 @@ private void measureColumn(ValueVector v, String prefix) {
netRowWidthCap50 += ! colSize.isVariableWidth ? colSize.estSize :
8 /* offset vector */ +
roundUpToPowerOf2(Math.min(colSize.estSize,50));
// above change 8 to 4 after DRILL-5446 is fixed
+
+ return colSize;
}
- private void expandMap(AbstractMapVector mapVector, String prefix) {
+ private void expandMap(ColumnSize colSize, AbstractMapVector mapVector, String prefix) {
for (ValueVector vector : mapVector) {
- measureColumn(vector, prefix);
+ colSize.childColumnSizes.put(prefix + vector.getField().getName(), measureColumn(vector, prefix));
--- End diff --
This is subject to aliasing. Suppose I have two maps:
```
aa(b)
a(ab)
```
When I add the child vectors, both will produce a combined name of `aab`.
We can't use dots in names for the same reason:
```
a.b(c)
a(b.c)
```
Both will produce `a.b.c`.
In the new "result set loader" code, all places that handle trees of
columns use actual trees of maps.
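For illustration only, here is a minimal sketch of the tree-of-maps idea. `SizeNode` and `TreeDemo` are hypothetical names, not classes from Drill or from the result set loader; the point is just that each level keys its children by the child's own name, so no combined name is ever built.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a per-column record; not Drill's ColumnSize.
class SizeNode {
    final String name;
    // Children are keyed by their own (unqualified) name.
    final Map<String, SizeNode> children = new LinkedHashMap<>();

    SizeNode(String name) {
        this.name = name;
    }

    // Returns the existing child with this name, or creates it.
    SizeNode child(String childName) {
        return children.computeIfAbsent(childName, SizeNode::new);
    }
}

class TreeDemo {
    public static void main(String[] args) {
        SizeNode root = new SizeNode("");
        SizeNode b = root.child("aa").child("b");   // aa(b)
        SizeNode ab = root.child("a").child("ab");  // a(ab)
        // The two leaves live under different parents, so they cannot alias.
        System.out.println(b != ab);
        System.out.println(root.children.keySet());
    }
}
```

Because lookup walks one level at a time, `aa` then `b` and `a` then `ab` never meet in the same map.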
A crude-but-effective solution is to use a character that is not legal in
names as the separator. The only valid candidate is the back-tick, since we use
it in SQL to quote names. If we do that, we now have
```
aa`b
a`ab
a.b`c
a`b.c
```
And the names are now un-aliased.
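A small sketch of the difference, assuming (as argued above) that a back-tick never appears inside a column name. `QualifiedNames`, `concat`, `joinWith`, and `SEPARATOR` are made-up names for this example, not part of `RecordBatchSizer`.

```java
class QualifiedNames {
    // Assumption: the back-tick cannot occur inside a column name.
    static final String SEPARATOR = "`";

    // Plain concatenation, as in the diff: prefix + name.
    static String concat(String prefix, String name) {
        return prefix + name;
    }

    // Concatenation with an explicit separator between parent and child.
    static String joinWith(String sep, String parent, String child) {
        return parent + sep + child;
    }

    public static void main(String[] args) {
        // Plain concatenation aliases aa(b) and a(ab).
        System.out.println(concat("aa", "b").equals(concat("a", "ab")));
        // A dot aliases a.b(c) and a(b.c), because dots can appear in names.
        System.out.println(joinWith(".", "a.b", "c").equals(joinWith(".", "a", "b.c")));
        // The back-tick keeps all four keys distinct.
        System.out.println(joinWith(SEPARATOR, "aa", "b"));
        System.out.println(joinWith(SEPARATOR, "a", "ab"));
        System.out.println(joinWith(SEPARATOR, "a.b", "c"));
        System.out.println(joinWith(SEPARATOR, "a", "b.c"));
    }
}
```

The separator only works because it is outside the set of characters a name can contain; any character that can appear in a name reintroduces the same ambiguity.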
---