Possibly fixes https://github.com/apache/incubator-druid/issues/4937
--------
There is currently a race condition in IncrementalIndexStorageAdapter that can
lead to exceptions like the following, when running queries with filters on
String dimensions that hit realtime tasks:
```
org.apache.druid.java.util.common.ISE: id[5] >= maxId[5]
at
org.apache.druid.segment.StringDimensionIndexer$1IndexerDimensionSelector.lookupName(StringDimensionIndexer.java:591)
at
org.apache.druid.segment.StringDimensionIndexer$1IndexerDimensionSelector$2.matches(StringDimensionIndexer.java:562)
at
org.apache.druid.segment.incremental.IncrementalIndexStorageAdapter$IncrementalIndexCursor.advance(IncrementalIndexStorageAdapter.java:284)
```
When the `filterMatcher` is created in the constructor of
`IncrementalIndexStorageAdapter.IncrementalIndexCursor`,
`StringDimensionIndexer.makeDimensionSelector` gets called eventually, which
calls:
```
final int maxId = getCardinality();
...
@Override
public int getCardinality()
{
return dimLookup.size();
}
```
So `maxId` is set to the size of the dictionary at the time that the
`filterMatcher` is created.
However, the `maxRowIndex` which is meant to prevent the Cursor from returning
rows that were added after the Cursor was created (see
https://github.com/apache/incubator-druid/pull/4049) is set after the
`filterMatcher` is created.
If rows with new dictionary values are added after the `filterMatcher` is
created but before `maxRowIndex` is set, then it is possible for the Cursor to
return rows that contain the new values, which will have `id >= maxId`.
This PR sets `maxRowIndex` before creating the `filterMatcher` to prevent rows
with unknown dictionary IDs from being passed to the `filterMatcher`.
-----------
The included test triggers the error with a custom Filter +
DruidPredicateFactory.
The DimensionSelector for predicate-based filter matching is created here in
`Filters.makeValueMatcher`:
```
public static ValueMatcher makeValueMatcher(
final ColumnSelectorFactory columnSelectorFactory,
final String columnName,
final DruidPredicateFactory predicateFactory
)
{
final ColumnCapabilities capabilities =
columnSelectorFactory.getColumnCapabilities(columnName);
// This should be folded into the ValueMatcherColumnSelectorStrategy once
that can handle LONG typed columns.
if (capabilities != null && capabilities.getType() == ValueType.LONG) {
return getLongPredicateMatcher(
columnSelectorFactory.makeColumnValueSelector(columnName),
predicateFactory.makeLongPredicate()
);
}
final ColumnSelectorPlus<ValueMatcherColumnSelectorStrategy> selector =
DimensionHandlerUtils.createColumnSelectorPlus(
ValueMatcherColumnSelectorStrategyFactory.instance(),
DefaultDimensionSpec.of(columnName),
columnSelectorFactory
);
return
selector.getColumnSelectorStrategy().makeValueMatcher(selector.getSelector(),
predicateFactory);
}
```
The test Filter adds a row to the IncrementalIndex in the test when the
predicateFactory creates a new String predicate, after the DimensionSelector is
created.
[ Full content available at:
https://github.com/apache/incubator-druid/pull/6340 ]
This message was relayed via gitbox.apache.org for [email protected]