imply-cheddar commented on code in PR #13653:
URL: https://github.com/apache/druid/pull/13653#discussion_r1066465568
##########
processing/src/main/java/org/apache/druid/segment/NestedDataColumnIndexer.java:
##########
@@ -151,7 +177,79 @@ public ColumnValueSelector<?> makeColumnValueSelector(
IncrementalIndex.DimensionDesc desc
)
{
- final int dimIndex = desc.getIndex();
+
+ if (fieldIndexers.size() == 1 &&
fieldIndexers.containsKey(NestedPathFinder.JSON_PATH_ROOT)) {
+ final LiteralFieldIndexer rootField =
fieldIndexers.get(NestedPathFinder.JSON_PATH_ROOT);
+ if (rootField.getTypes().getSingleType() != null) {
+ return new ColumnValueSelector<Object>()
+ {
+ @Override
+ public boolean isNull()
+ {
+ final Object o = getObject();
+ return !(o instanceof Number);
Review Comment:
Numbers cannot be null? Or... what's the logic of this `instanceof` check?
I had expected just a normal `o == null`?
##########
processing/src/main/java/org/apache/druid/segment/NestedDataColumnIndexer.java:
##########
@@ -177,6 +281,21 @@ public Class<StructuredData> classOfObject()
@Override
public ColumnCapabilities getColumnCapabilities()
+ {
+ if (fieldIndexers.size() == 1 &&
fieldIndexers.containsKey(NestedPathFinder.JSON_PATH_ROOT)) {
+ LiteralFieldIndexer rootField =
fieldIndexers.get(NestedPathFinder.JSON_PATH_ROOT);
+ if (rootField.getTypes().getSingleType() != null) {
+ return
ColumnCapabilitiesImpl.createSimpleNumericColumnCapabilities(rootField.getTypes().getSingleType())
+ .setHasNulls(hasNulls);
Review Comment:
This seems weird, the code seems like it's returning a numeric type, but
it's actually using whatever type was found, which could be a String. Looking
at the actual semantics of what is returned by that static method, it seems
like it's really just building a capabilities that is "single type, just the
values, no dictionaries, no frills". Can we rename that static method or add a
new one that is more appropriately named?
##########
processing/src/main/java/org/apache/druid/segment/NestedDataColumnIndexer.java:
##########
@@ -142,6 +142,32 @@ public DimensionSelector makeDimensionSelector(
IncrementalIndex.DimensionDesc desc
)
{
+ if (
+ fieldIndexers.size() == 1 &&
+ fieldIndexers.containsKey(NestedPathFinder.JSON_PATH_ROOT) &&
+
fieldIndexers.get(NestedPathFinder.JSON_PATH_ROOT).getTypes().getSingleType()
!= null
+ ) {
+ final ColumnValueSelector delegate = makeColumnValueSelector(currEntry,
desc);
Review Comment:
The call to `makeColumnValueSelector` does the exact same validations as the
if statement above here. Instead of reusing that function, how about creating
a private function that can be invoked assuming that these things are true and
use that instead?
##########
processing/src/main/java/org/apache/druid/segment/NestedDataColumnIndexer.java:
##########
@@ -216,6 +335,54 @@ public Object
convertUnsortedEncodedKeyComponentToActualList(StructuredData key)
@Override
public ColumnValueSelector convertUnsortedValuesToSorted(ColumnValueSelector
selectorWithUnsortedValues)
{
Review Comment:
Looking at this code, I'm unclear as to why this is the correct behavior?
Can you add comments that explain why we are returning the unsorted values even
though the method is clearly asking for the sorted ones?
I *think* it's because we are actually side-stepping what the overall
indexing stuff wants to do and just doing the sort on our own at persist time,
but it would be great to have some comments explaining that.
##########
processing/src/main/java/org/apache/druid/segment/NestedDataColumnIndexer.java:
##########
@@ -216,6 +335,54 @@ public Object
convertUnsortedEncodedKeyComponentToActualList(StructuredData key)
@Override
public ColumnValueSelector convertUnsortedValuesToSorted(ColumnValueSelector
selectorWithUnsortedValues)
{
+ if (fieldIndexers.size() == 1 &&
fieldIndexers.containsKey(NestedPathFinder.JSON_PATH_ROOT)) {
+ // do the opposite here...
Review Comment:
This is not enough words to explain what you mean. Please add more words or
remove the comment.
##########
processing/src/main/java/org/apache/druid/segment/column/ColumnHolder.java:
##########
@@ -39,6 +39,11 @@ static boolean storeDoubleAsFloat()
ColumnCapabilities getCapabilities();
+ default ColumnCapabilities getHandlerCapabilities()
+ {
+ return getCapabilities();
+ }
+
Review Comment:
Why default?
##########
processing/src/main/java/org/apache/druid/segment/incremental/IncrementalIndex.java:
##########
@@ -480,6 +485,20 @@ public ColumnCapabilities getColumnCapabilities(String
columnName)
return null;
}
+ @Nullable
+ public ColumnCapabilities getColumnHandlerCapabilities(String columnName)
+ {
+ if (timeAndMetricsColumnCapabilities.containsKey(columnName)) {
+ return timeAndMetricsColumnCapabilities.get(columnName);
+ }
+ synchronized (dimensionDescs) {
+ if (dimensionDescs.containsKey(columnName)) {
+ return
dimensionDescs.get(columnName).getIndexer().getHandlerCapabilities();
+ }
+ }
Review Comment:
Why the `containsKey` then `get`? Why not just `get()` and check for null?
`containsKey` is doing the exact same work that `get` does, so this is just
duplicating work.
##########
processing/src/main/java/org/apache/druid/segment/nested/NestedDataComplexTypeSerde.java:
##########
@@ -88,8 +88,23 @@ public void deserializeColumn(
capabilitiesBuilder.setDictionaryEncoded(true);
capabilitiesBuilder.setDictionaryValuesSorted(true);
capabilitiesBuilder.setDictionaryValuesUnique(true);
- builder.setComplexTypeName(TYPE_NAME);
+ ColumnType simpleType = supplier.getSimpleType();
+ if (simpleType != null) {
+ builder.setType(simpleType);
+ } else {
+ builder.setComplexTypeName(TYPE_NAME);
+
+ }
builder.setComplexColumnSupplier(supplier);
+
+ // always use the nested column dimension hanler, regardless what we claim
our query time type is
Review Comment:
typo: `handler` instead of `hanler`
##########
processing/src/main/java/org/apache/druid/segment/incremental/AppendableIndexSpec.java:
##########
@@ -32,4 +32,7 @@
// Returns the default max bytes in memory for this index.
long getDefaultMaxBytesInMemory();
+
+ @SuppressWarnings("unused")
Review Comment:
Is this still unused? I thought I saw you pushing it into something?
##########
processing/src/main/java/org/apache/druid/segment/NestedDataColumnIndexer.java:
##########
@@ -151,7 +177,79 @@ public ColumnValueSelector<?> makeColumnValueSelector(
IncrementalIndex.DimensionDesc desc
)
{
- final int dimIndex = desc.getIndex();
+
+ if (fieldIndexers.size() == 1 &&
fieldIndexers.containsKey(NestedPathFinder.JSON_PATH_ROOT)) {
+ final LiteralFieldIndexer rootField =
fieldIndexers.get(NestedPathFinder.JSON_PATH_ROOT);
+ if (rootField.getTypes().getSingleType() != null) {
+ return new ColumnValueSelector<Object>()
+ {
+ @Override
+ public boolean isNull()
+ {
+ final Object o = getObject();
+ return !(o instanceof Number);
+ }
+
+ @Override
+ public float getFloat()
+ {
+ Object value = getObject();
+ if (value == null) {
+ return 0;
+ }
+ return ((Number) value).floatValue();
+ }
+
+ @Override
+ public double getDouble()
+ {
+ Object value = getObject();
+ if (value == null) {
+ return 0;
+ }
+ return ((Number) value).doubleValue();
+ }
+
+ @Override
+ public long getLong()
+ {
+ Object value = getObject();
+ if (value == null) {
+ return 0;
+ }
+ return ((Number) value).longValue();
+ }
+
+ @Override
+ public void inspectRuntimeShape(RuntimeShapeInspector inspector)
+ {
+
+ }
+
+ @Nullable
+ @Override
+ public Object getObject()
+ {
+ final int dimIndex = desc.getIndex();
Review Comment:
Does this `dimIndex` change across different rows? It looks like we are
calling it once-per-row, but it seems like it should be static?
##########
processing/src/main/java/org/apache/druid/segment/RowBasedColumnSelectorFactory.java:
##########
@@ -500,9 +501,13 @@ private Object getCurrentValue()
@Nullable
private Number getCurrentValueAsNumber()
{
+ final Object currentValue = getCurrentValue();
+ if (currentValue instanceof StructuredData) {
+ return Rows.objectToNumber(columnName, ((StructuredData)
currentValue).getValue(), throwParseExceptions);
+ }
return Rows.objectToNumber(
columnName,
- getCurrentValue(),
+ currentValue,
throwParseExceptions
);
Review Comment:
I cannot seem to leave a comment on old code so leaving it here as, but
reading this got me wondering.
The `isNull` on this class seems to be also forcing the object to be a
number or fail parsing. Why doesn't it do the null check on `getCurrentValue`
instead?
##########
processing/src/main/java/org/apache/druid/segment/NestedDataColumnIndexer.java:
##########
@@ -142,6 +142,32 @@ public DimensionSelector makeDimensionSelector(
IncrementalIndex.DimensionDesc desc
)
{
+ if (
+ fieldIndexers.size() == 1 &&
+ fieldIndexers.containsKey(NestedPathFinder.JSON_PATH_ROOT) &&
+
fieldIndexers.get(NestedPathFinder.JSON_PATH_ROOT).getTypes().getSingleType()
!= null
+ ) {
+ final ColumnValueSelector delegate = makeColumnValueSelector(currEntry,
desc);
+ return new BaseSingleValueDimensionSelector()
+ {
+ @Nullable
+ @Override
+ protected String getValue()
+ {
+ final Object o = delegate.getObject();
+ if (o == null) {
+ return null;
+ }
+ return String.valueOf(o);
Review Comment:
Nit: Given that you've already done the null check, this is the same as
`o.toString()`.
##########
processing/src/main/java/org/apache/druid/segment/nested/NestedDataColumnSupplier.java:
##########
@@ -91,6 +95,15 @@ public NestedDataColumnSupplier(
fields = GenericIndexed.read(bb, GenericIndexed.STRING_STRATEGY,
mapper);
Review Comment:
This is a comment on old code and I think we'd have to push the version
forward yet again to adjust this, which might not be worth it, but it's a bit
sad to me that we aren't using the front-coded stuff for the fields. They are
pretty much guaranteed to benefit from it.
##########
processing/src/main/java/org/apache/druid/segment/nested/CompressedNestedDataComplexColumn.java:
##########
@@ -121,6 +125,7 @@ public CompressedNestedDataComplexColumn(
this.fileMapper = fileMapper;
this.closer = Closer.create();
this.compressedRawColumnSupplier = compressedRawColumnSupplier;
+ this.rootFieldPath = getField(Collections.emptyList());
Review Comment:
This is doing work in the constructor. On top of that, it's calling an
abstract method, meaning that it's going to call a method on a
not-yet-constructed child class (the child class is in the super call, it
shouldn't've had a chance to set any of its variables or anything yet).
If this is important for this parent class to have, it should be on the
constructor and the child should compute it and pass it in. That or it can be
lazily instantiated. I haven't looked closely enough to identify which it
should be. I'll try to come back with an opinion on that.
##########
processing/src/main/java/org/apache/druid/segment/nested/NestedDataColumnSupplier.java:
##########
@@ -91,6 +95,15 @@ public NestedDataColumnSupplier(
fields = GenericIndexed.read(bb, GenericIndexed.STRING_STRATEGY,
mapper);
fieldInfo = NestedLiteralTypeInfo.read(bb, fields.size());
+ if (fields.size() == 1 &&
+ ((version == 0x03 &&
NestedPathFinder.JQ_PATH_ROOT.equals(fields.get(0))) ||
+ (version == 0x04 &&
NestedPathFinder.JSON_PATH_ROOT.equals(fields.get(0))))
+ ) {
+ simpleType = fieldInfo.getTypes(0).getSingleType();
+ } else {
+ simpleType = null;
+ }
Review Comment:
Just generally speaking, all of this work is being done in a constructor.
It would be better to have it in a static method and just pass the needed
things into the constructor. I realize it's a nit comment on old code...
##########
processing/src/test/java/org/apache/druid/segment/NestedDataColumnIndexerTest.java:
##########
@@ -83,4 +126,459 @@ public void testStuff()
Assert.assertEquals(6, indexer.getCardinality());
}
}
+
+ @Test
+ public void testNestedColumnIndexerSchemaDiscoveryRootString() throws
IndexSizeExceededException
+ {
+ long minTimestamp = System.currentTimeMillis();
+ IncrementalIndex index = makeIncrementalIndex(minTimestamp);
+
+ index.add(makeInputRow(minTimestamp + 1, true, STRING_COL, "a"));
+ index.add(makeInputRow(minTimestamp + 2, true, STRING_COL, "b"));
+ index.add(makeInputRow(minTimestamp + 3, true, STRING_COL, "c"));
+ index.add(makeInputRow(minTimestamp + 4, true, STRING_COL, null));
+ index.add(makeInputRow(minTimestamp + 5, false, STRING_COL, null));
+
+ IncrementalIndexStorageAdapter storageAdapter = new
IncrementalIndexStorageAdapter(index);
+ Sequence<Cursor> cursorSequence = storageAdapter.makeCursors(
+ null,
+ Intervals.ETERNITY,
+ VirtualColumns.EMPTY,
+ Granularities.NONE,
+ false,
+ null
+ );
+ final DimensionSpec dimensionSpec = new DefaultDimensionSpec(STRING_COL,
STRING_COL, ColumnType.STRING);
+ List<Cursor> cursorList = cursorSequence.toList();
+ ColumnSelectorFactory columnSelectorFactory =
cursorList.get(0).getColumnSelectorFactory();
+
+ ColumnValueSelector valueSelector =
columnSelectorFactory.makeColumnValueSelector(STRING_COL);
+ DimensionSelector dimensionSelector =
columnSelectorFactory.makeDimensionSelector(dimensionSpec);
+ Assert.assertEquals("a", valueSelector.getObject());
+ Assert.assertEquals(1, dimensionSelector.getRow().size());
+ Assert.assertEquals("a",
dimensionSelector.lookupName(dimensionSelector.getRow().get(0)));
+ Assert.assertEquals("a", dimensionSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(1).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(STRING_COL);
+ dimensionSelector =
columnSelectorFactory.makeDimensionSelector(dimensionSpec);
+ Assert.assertEquals("b", valueSelector.getObject());
+ Assert.assertEquals(1, dimensionSelector.getRow().size());
+ Assert.assertEquals("b",
dimensionSelector.lookupName(dimensionSelector.getRow().get(0)));
+ Assert.assertEquals("b", dimensionSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(2).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(STRING_COL);
+ dimensionSelector =
columnSelectorFactory.makeDimensionSelector(dimensionSpec);
+ Assert.assertEquals("c", valueSelector.getObject());
+ Assert.assertEquals(1, dimensionSelector.getRow().size());
+ Assert.assertEquals("c",
dimensionSelector.lookupName(dimensionSelector.getRow().get(0)));
+ Assert.assertEquals("c", dimensionSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(3).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(STRING_COL);
+ dimensionSelector =
columnSelectorFactory.makeDimensionSelector(dimensionSpec);
+ Assert.assertNull(valueSelector.getObject());
+ Assert.assertEquals(1, dimensionSelector.getRow().size());
+
Assert.assertNull(dimensionSelector.lookupName(dimensionSelector.getRow().get(0)));
+ Assert.assertNull(dimensionSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(4).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(STRING_COL);
+ dimensionSelector =
columnSelectorFactory.makeDimensionSelector(dimensionSpec);
+ Assert.assertNull(valueSelector.getObject());
+ Assert.assertEquals(1, dimensionSelector.getRow().size());
+
Assert.assertNull(dimensionSelector.lookupName(dimensionSelector.getRow().get(0)));
+ Assert.assertNull(dimensionSelector.getObject());
+ }
+
+ @Test
+ public void testNestedColumnIndexerSchemaDiscoveryRootLong() throws
IndexSizeExceededException
+ {
+ long minTimestamp = System.currentTimeMillis();
+ IncrementalIndex index = makeIncrementalIndex(minTimestamp);
+
+ index.add(makeInputRow(minTimestamp + 1, true, LONG_COL, 1L));
+ index.add(makeInputRow(minTimestamp + 2, true, LONG_COL, 2L));
+ index.add(makeInputRow(minTimestamp + 3, true, LONG_COL, 3L));
+ index.add(makeInputRow(minTimestamp + 4, true, LONG_COL, null));
+ index.add(makeInputRow(minTimestamp + 5, false, LONG_COL, null));
+
+ IncrementalIndexStorageAdapter storageAdapter = new
IncrementalIndexStorageAdapter(index);
+ Sequence<Cursor> cursorSequence = storageAdapter.makeCursors(
+ null,
+ Intervals.ETERNITY,
+ VirtualColumns.EMPTY,
+ Granularities.NONE,
+ false,
+ null
+ );
+ final DimensionSpec dimensionSpec = new DefaultDimensionSpec(LONG_COL,
LONG_COL, ColumnType.LONG);
+ List<Cursor> cursorList = cursorSequence.toList();
+ ColumnSelectorFactory columnSelectorFactory =
cursorList.get(0).getColumnSelectorFactory();
+
+ ColumnValueSelector valueSelector =
columnSelectorFactory.makeColumnValueSelector(LONG_COL);
+ DimensionSelector dimensionSelector =
columnSelectorFactory.makeDimensionSelector(dimensionSpec);
+ Assert.assertEquals(1L, valueSelector.getObject());
+ Assert.assertEquals(1L, valueSelector.getLong());
+ Assert.assertFalse(valueSelector.isNull());
+ Assert.assertEquals(1, dimensionSelector.getRow().size());
+ Assert.assertEquals("1",
dimensionSelector.lookupName(dimensionSelector.getRow().get(0)));
+ Assert.assertEquals("1", dimensionSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(1).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(LONG_COL);
+ dimensionSelector =
columnSelectorFactory.makeDimensionSelector(dimensionSpec);
+ Assert.assertEquals(2L, valueSelector.getObject());
+ Assert.assertEquals(2L, valueSelector.getLong());
+ Assert.assertFalse(valueSelector.isNull());
+ Assert.assertEquals(1, dimensionSelector.getRow().size());
+ Assert.assertEquals("2",
dimensionSelector.lookupName(dimensionSelector.getRow().get(0)));
+ Assert.assertEquals("2", dimensionSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(2).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(LONG_COL);
+ dimensionSelector =
columnSelectorFactory.makeDimensionSelector(dimensionSpec);
+ Assert.assertEquals(3L, valueSelector.getObject());
+ Assert.assertEquals(3L, valueSelector.getLong());
+ Assert.assertFalse(valueSelector.isNull());
+ Assert.assertEquals(1, dimensionSelector.getRow().size());
+ Assert.assertEquals("3",
dimensionSelector.lookupName(dimensionSelector.getRow().get(0)));
+ Assert.assertEquals("3", dimensionSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(3).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(LONG_COL);
+ dimensionSelector =
columnSelectorFactory.makeDimensionSelector(dimensionSpec);
+ Assert.assertNull(valueSelector.getObject());
+ Assert.assertTrue(valueSelector.isNull());
+ Assert.assertEquals(1, dimensionSelector.getRow().size());
+
Assert.assertNull(dimensionSelector.lookupName(dimensionSelector.getRow().get(0)));
+ Assert.assertNull(dimensionSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(4).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(LONG_COL);
+ dimensionSelector =
columnSelectorFactory.makeDimensionSelector(dimensionSpec);
+ Assert.assertNull(valueSelector.getObject());
+ Assert.assertTrue(valueSelector.isNull());
+ Assert.assertEquals(1, dimensionSelector.getRow().size());
+
Assert.assertNull(dimensionSelector.lookupName(dimensionSelector.getRow().get(0)));
+ Assert.assertNull(dimensionSelector.getObject());
+ }
+
+ @Test
+ public void testNestedColumnIndexerSchemaDiscoveryRootDouble() throws
IndexSizeExceededException
+ {
+ long minTimestamp = System.currentTimeMillis();
+ IncrementalIndex index = makeIncrementalIndex(minTimestamp);
+
+ index.add(makeInputRow(minTimestamp + 1, true, DOUBLE_COL, 1.1));
+ index.add(makeInputRow(minTimestamp + 2, true, DOUBLE_COL, 2.2));
+ index.add(makeInputRow(minTimestamp + 3, true, DOUBLE_COL, 3.3));
+ index.add(makeInputRow(minTimestamp + 4, true, DOUBLE_COL, null));
+ index.add(makeInputRow(minTimestamp + 5, false, DOUBLE_COL, null));
+
+ IncrementalIndexStorageAdapter storageAdapter = new
IncrementalIndexStorageAdapter(index);
+ Sequence<Cursor> cursorSequence = storageAdapter.makeCursors(
+ null,
+ Intervals.ETERNITY,
+ VirtualColumns.EMPTY,
+ Granularities.NONE,
+ false,
+ null
+ );
+ final DimensionSpec dimensionSpec = new DefaultDimensionSpec(DOUBLE_COL,
DOUBLE_COL, ColumnType.DOUBLE);
+ List<Cursor> cursorList = cursorSequence.toList();
+ ColumnSelectorFactory columnSelectorFactory =
cursorList.get(0).getColumnSelectorFactory();
+
+ ColumnValueSelector valueSelector =
columnSelectorFactory.makeColumnValueSelector(DOUBLE_COL);
+ DimensionSelector dimensionSelector =
columnSelectorFactory.makeDimensionSelector(dimensionSpec);
+ Assert.assertEquals(1.1, valueSelector.getObject());
+ Assert.assertEquals(1.1, valueSelector.getDouble(), 0.0);
+ Assert.assertFalse(valueSelector.isNull());
+ Assert.assertEquals(1, dimensionSelector.getRow().size());
+ Assert.assertEquals("1.1",
dimensionSelector.lookupName(dimensionSelector.getRow().get(0)));
+ Assert.assertEquals("1.1", dimensionSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(1).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(DOUBLE_COL);
+ dimensionSelector =
columnSelectorFactory.makeDimensionSelector(dimensionSpec);
+ Assert.assertEquals(2.2, valueSelector.getObject());
+ Assert.assertEquals(2.2, valueSelector.getDouble(), 0.0);
+ Assert.assertFalse(valueSelector.isNull());
+ Assert.assertEquals(1, dimensionSelector.getRow().size());
+ Assert.assertEquals("2.2",
dimensionSelector.lookupName(dimensionSelector.getRow().get(0)));
+ Assert.assertEquals("2.2", dimensionSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(2).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(DOUBLE_COL);
+ dimensionSelector =
columnSelectorFactory.makeDimensionSelector(dimensionSpec);
+ Assert.assertEquals(3.3, valueSelector.getObject());
+ Assert.assertEquals(3.3, valueSelector.getDouble(), 0.0);
+ Assert.assertFalse(valueSelector.isNull());
+ Assert.assertEquals(1, dimensionSelector.getRow().size());
+ Assert.assertEquals("3.3",
dimensionSelector.lookupName(dimensionSelector.getRow().get(0)));
+ Assert.assertEquals("3.3", dimensionSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(3).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(DOUBLE_COL);
+ dimensionSelector =
columnSelectorFactory.makeDimensionSelector(dimensionSpec);
+ Assert.assertNull(valueSelector.getObject());
+ Assert.assertEquals(1, dimensionSelector.getRow().size());
+
Assert.assertNull(dimensionSelector.lookupName(dimensionSelector.getRow().get(0)));
+ Assert.assertNull(dimensionSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(4).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(DOUBLE_COL);
+ dimensionSelector =
columnSelectorFactory.makeDimensionSelector(dimensionSpec);
+ Assert.assertNull(valueSelector.getObject());
+ Assert.assertEquals(1, dimensionSelector.getRow().size());
+
Assert.assertNull(dimensionSelector.lookupName(dimensionSelector.getRow().get(0)));
+ Assert.assertNull(dimensionSelector.getObject());
+ }
+
+ @Test
+ public void testNestedColumnIndexerSchemaDiscoveryRootStringArray() throws
IndexSizeExceededException
+ {
+ long minTimestamp = System.currentTimeMillis();
+ IncrementalIndex index = makeIncrementalIndex(minTimestamp);
+
+ index.add(makeInputRow(minTimestamp + 1, true, STRING_ARRAY_COL, new
String[]{"a"}));
+ index.add(makeInputRow(minTimestamp + 2, true, STRING_ARRAY_COL, new
Object[]{"b", "c"}));
+ index.add(makeInputRow(minTimestamp + 3, true, STRING_ARRAY_COL,
ImmutableList.of("d", "e")));
+ index.add(makeInputRow(minTimestamp + 4, true, STRING_ARRAY_COL, null));
+ index.add(makeInputRow(minTimestamp + 5, false, STRING_ARRAY_COL, null));
+
+ IncrementalIndexStorageAdapter storageAdapter = new
IncrementalIndexStorageAdapter(index);
+ Sequence<Cursor> cursorSequence = storageAdapter.makeCursors(
+ null,
+ Intervals.ETERNITY,
+ VirtualColumns.EMPTY,
+ Granularities.NONE,
+ false,
+ null
+ );
+ final DimensionSpec dimensionSpec = new
DefaultDimensionSpec(STRING_ARRAY_COL, STRING_ARRAY_COL, ColumnType.STRING);
+ List<Cursor> cursorList = cursorSequence.toList();
+ ColumnSelectorFactory columnSelectorFactory =
cursorList.get(0).getColumnSelectorFactory();
+
+ ColumnValueSelector valueSelector =
columnSelectorFactory.makeColumnValueSelector(STRING_ARRAY_COL);
+ Assert.assertThrows(
+ UnsupportedOperationException.class,
+ () ->
cursorList.get(0).getColumnSelectorFactory().makeDimensionSelector(dimensionSpec)
+ );
+ Assert.assertEquals(StructuredData.wrap(new Object[]{"a"}),
valueSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(1).getColumnSelectorFactory();
+ valueSelector =
columnSelectorFactory.makeColumnValueSelector(STRING_ARRAY_COL);
+ Assert.assertThrows(
+ UnsupportedOperationException.class,
+ () ->
cursorList.get(1).getColumnSelectorFactory().makeDimensionSelector(dimensionSpec)
+ );
+ Assert.assertEquals(StructuredData.wrap(new Object[]{"b", "c"}),
valueSelector.getObject());
+ Assert.assertFalse(valueSelector.isNull());
+
+ columnSelectorFactory = cursorList.get(2).getColumnSelectorFactory();
+ valueSelector =
columnSelectorFactory.makeColumnValueSelector(STRING_ARRAY_COL);
+ Assert.assertThrows(
+ UnsupportedOperationException.class,
+ () ->
cursorList.get(2).getColumnSelectorFactory().makeDimensionSelector(dimensionSpec)
+ );
+ // raw data is left as is, so is currently still a list while in
incremental index...
+ Assert.assertEquals(StructuredData.wrap(ImmutableList.of("d", "e")),
valueSelector.getObject());
+ Assert.assertFalse(valueSelector.isNull());
+
+ columnSelectorFactory = cursorList.get(3).getColumnSelectorFactory();
+ valueSelector =
columnSelectorFactory.makeColumnValueSelector(STRING_ARRAY_COL);
+ Assert.assertThrows(
+ UnsupportedOperationException.class,
+ () ->
cursorList.get(3).getColumnSelectorFactory().makeDimensionSelector(dimensionSpec)
+ );
+ Assert.assertNull(valueSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(4).getColumnSelectorFactory();
+ valueSelector =
columnSelectorFactory.makeColumnValueSelector(STRING_ARRAY_COL);
+ Assert.assertThrows(
+ UnsupportedOperationException.class,
+ () ->
cursorList.get(4).getColumnSelectorFactory().makeDimensionSelector(dimensionSpec)
+ );
+ Assert.assertNull(valueSelector.getObject());
+ }
+
+ @Test
+ public void testNestedColumnIndexerSchemaDiscoveryRootVariant() throws
IndexSizeExceededException
+ {
+ long minTimestamp = System.currentTimeMillis();
+ IncrementalIndex index = makeIncrementalIndex(minTimestamp);
+
+ index.add(makeInputRow(minTimestamp + 1, true, VARIANT_COL, "a"));
+ index.add(makeInputRow(minTimestamp + 2, true, VARIANT_COL, 2L));
+ index.add(makeInputRow(minTimestamp + 3, true, VARIANT_COL, 3.3));
+ index.add(makeInputRow(minTimestamp + 4, true, VARIANT_COL, null));
+ index.add(makeInputRow(minTimestamp + 5, false, VARIANT_COL, null));
+
+ IncrementalIndexStorageAdapter storageAdapter = new
IncrementalIndexStorageAdapter(index);
+ Sequence<Cursor> cursorSequence = storageAdapter.makeCursors(
+ null,
+ Intervals.ETERNITY,
+ VirtualColumns.EMPTY,
+ Granularities.NONE,
+ false,
+ null
+ );
+ final DimensionSpec dimensionSpec = new DefaultDimensionSpec(VARIANT_COL,
VARIANT_COL, ColumnType.STRING);
+ List<Cursor> cursorList = cursorSequence.toList();
+ ColumnSelectorFactory columnSelectorFactory =
cursorList.get(0).getColumnSelectorFactory();
+
+ ColumnValueSelector valueSelector =
columnSelectorFactory.makeColumnValueSelector(VARIANT_COL);
+ Assert.assertThrows(
+ UnsupportedOperationException.class,
+ () ->
cursorList.get(0).getColumnSelectorFactory().makeDimensionSelector(dimensionSpec)
+ );
+ Assert.assertEquals(StructuredData.wrap("a"), valueSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(1).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(VARIANT_COL);
+ Assert.assertThrows(
+ UnsupportedOperationException.class,
+ () ->
cursorList.get(1).getColumnSelectorFactory().makeDimensionSelector(dimensionSpec)
+ );
+ Assert.assertEquals(StructuredData.wrap(2L), valueSelector.getObject());
+ Assert.assertFalse(valueSelector.isNull());
+
+ columnSelectorFactory = cursorList.get(2).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(VARIANT_COL);
+ Assert.assertThrows(
+ UnsupportedOperationException.class,
+ () ->
cursorList.get(2).getColumnSelectorFactory().makeDimensionSelector(dimensionSpec)
+ );
+ Assert.assertEquals(StructuredData.wrap(3.3), valueSelector.getObject());
+ Assert.assertFalse(valueSelector.isNull());
+
+ columnSelectorFactory = cursorList.get(3).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(VARIANT_COL);
+ Assert.assertThrows(
+ UnsupportedOperationException.class,
+ () ->
cursorList.get(3).getColumnSelectorFactory().makeDimensionSelector(dimensionSpec)
+ );
+ Assert.assertNull(valueSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(4).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(VARIANT_COL);
+ Assert.assertThrows(
+ UnsupportedOperationException.class,
+ () ->
cursorList.get(4).getColumnSelectorFactory().makeDimensionSelector(dimensionSpec)
+ );
+ Assert.assertNull(valueSelector.getObject());
+ }
+
+ @Test
+ public void testNestedColumnIndexerSchemaDiscoveryNested() throws
IndexSizeExceededException
+ {
+ long minTimestamp = System.currentTimeMillis();
+ IncrementalIndex index = makeIncrementalIndex(minTimestamp);
+
+ index.add(makeInputRow(minTimestamp + 1, true, NESTED_COL, "a"));
+ index.add(makeInputRow(minTimestamp + 2, true, NESTED_COL, 2L));
+ index.add(makeInputRow(minTimestamp + 3, true, NESTED_COL,
ImmutableMap.of("x", 1.1, "y", 2L)));
+ index.add(makeInputRow(minTimestamp + 4, true, NESTED_COL, null));
+ index.add(makeInputRow(minTimestamp + 5, false, NESTED_COL, null));
+
+ IncrementalIndexStorageAdapter storageAdapter = new
IncrementalIndexStorageAdapter(index);
+ Sequence<Cursor> cursorSequence = storageAdapter.makeCursors(
+ null,
+ Intervals.ETERNITY,
+ VirtualColumns.EMPTY,
+ Granularities.NONE,
+ false,
+ null
+ );
+ final DimensionSpec dimensionSpec = new DefaultDimensionSpec(NESTED_COL,
NESTED_COL, ColumnType.STRING);
+ List<Cursor> cursorList = cursorSequence.toList();
+ ColumnSelectorFactory columnSelectorFactory =
cursorList.get(0).getColumnSelectorFactory();
+
+ ColumnValueSelector valueSelector =
columnSelectorFactory.makeColumnValueSelector(NESTED_COL);
+ Assert.assertThrows(
+ UnsupportedOperationException.class,
+ () ->
cursorList.get(0).getColumnSelectorFactory().makeDimensionSelector(dimensionSpec)
+ );
+ Assert.assertEquals(StructuredData.wrap("a"), valueSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(1).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(NESTED_COL);
+ Assert.assertThrows(
+ UnsupportedOperationException.class,
+ () ->
cursorList.get(1).getColumnSelectorFactory().makeDimensionSelector(dimensionSpec)
+ );
+ Assert.assertEquals(StructuredData.wrap(2L), valueSelector.getObject());
+ Assert.assertFalse(valueSelector.isNull());
+
+ columnSelectorFactory = cursorList.get(2).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(NESTED_COL);
+ Assert.assertThrows(
+ UnsupportedOperationException.class,
+ () ->
cursorList.get(2).getColumnSelectorFactory().makeDimensionSelector(dimensionSpec)
+ );
+ Assert.assertEquals(StructuredData.wrap(ImmutableMap.of("x", 1.1, "y",
2L)), valueSelector.getObject());
+ Assert.assertFalse(valueSelector.isNull());
+
+ columnSelectorFactory = cursorList.get(3).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(NESTED_COL);
+ Assert.assertThrows(
+ UnsupportedOperationException.class,
+ () ->
cursorList.get(3).getColumnSelectorFactory().makeDimensionSelector(dimensionSpec)
+ );
+ Assert.assertNull(valueSelector.getObject());
+
+ columnSelectorFactory = cursorList.get(4).getColumnSelectorFactory();
+ valueSelector = columnSelectorFactory.makeColumnValueSelector(NESTED_COL);
+ Assert.assertThrows(
+ UnsupportedOperationException.class,
+ () ->
cursorList.get(4).getColumnSelectorFactory().makeDimensionSelector(dimensionSpec)
+ );
+ Assert.assertNull(valueSelector.getObject());
+ }
+
+ @Nonnull
+ private static IncrementalIndex makeIncrementalIndex(long minTimestamp)
+ {
+ IncrementalIndex index = new OnheapIncrementalIndex.Builder()
+ .setIndexSchema(
+ new IncrementalIndexSchema(
+ minTimestamp,
+ new TimestampSpec(TIME_COL, "millis", null),
+ Granularities.NONE,
+ VirtualColumns.EMPTY,
+ new DimensionsSpec(Collections.emptyList()),
+ new AggregatorFactory[0],
+ false
+ )
+ )
+ .setMaxRowCount(1000)
+ .setUseNestedColumnIndexerForSchemaDiscovery(true)
+ .build();
+ return index;
+ }
+
+ private MapBasedInputRow makeInputRow(
Review Comment:
Perhaps you could make this `public static` on `MapBasedInputRow`? Seems
like a nice enough helper thing to have available.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]