clintropolis opened a new pull request #8822: optimize numeric column null value checking for low filter selectivity (more rows) URL: https://github.com/apache/incubator-druid/pull/8822 ### Description I would like to add documentation for Druid SQL compatible null handling added in #4349 to the website, in part because it has been merged for quite a while now, but also so that we can feel better about the changes #8566 causes in SQL behavior in Druids native mode (tl;dr some calcite optimizations lead to some query incompatibilities specifically around our treatment of `''` and `null` as equivalent). Part of like.. being responsible and stuff, before adding this documentation I first wanted to collect some data to determine if it _was a good idea to in fact document it at this time_. The place I was specifically worried about was the `isNull` check, so I added a benchmark, `NullHandlingBitmapGetVsIteratorBenchmark`, and ran some experiments to see what to expect as well as if we could do better. The nature of selectors lends itself to using a bitmap iterator as an alternative to calling bitmap.get for every `isNull` check, so I first tested `get` vs `iterator` with various null bitmap densities and filter selectivities to simulate the overhead of having null value handling selector with each approach. I have so far only collected information on roaring, because with `concise` the `get` method would not complete at low selectivity, so I'm assuming it's numbers are quite bad (and more on this later). To help make sense of the numbers collected from this, I created heatmaps to compare the differences between the approaches. With the raw output, these looked something like this which isn't so telling: ![raw](https://user-images.githubusercontent.com/1577461/68170194-b52bde00-ff23-11e9-9e84-2047cc67df72.gif) Interpolating the data to fill in the gaps yields something a bit prettier, but still not incredibly telling on it's own: ![difference-iterator-get](https://user-images.githubusercontent.com/1577461/68170242-dbea1480-ff23-11e9-923e-67872816a1a3.gif) But does start to offer that the iterator is better at dealing with a larger number of rows, diminishing as the density of the bitmap increases. Scaling the data points to estimate the _cost per row_ in nanoseconds provides a much more useful picture, and shows the abysmal performance of using the iterator at super high selectivity (e.g. 0.1% of rows match and are selected) with really dense bitmaps: ![difference-per-row-iterator-get](https://user-images.githubusercontent.com/1577461/68170352-426f3280-ff24-11e9-9782-7a47892f0f18.gif) However if we truncate the results and compare areas where the `iterator` is better: ![difference-per-row-iterator-better](https://user-images.githubusercontent.com/1577461/68170371-53b83f00-ff24-11e9-989e-41ec0219d29c.gif) to where `get` is better ![difference-per-row-get-better](https://user-images.githubusercontent.com/1577461/68170380-5b77e380-ff24-11e9-8f4f-538004046187.gif) it does look like the iterator does perform up to 15 `ns per row` better than using `get` when processing a lot of rows. Searching for a way around the limitation, I was unaware that the `IntIterator` for roaring was actually a `PeekableIntIterator` which is an iterator that offers an `advanceIfNeeded` method that allows skipping the iterator ahead to an index. `ConciseSet` actually has a similar method on it's iterator, `skipAllBefore`, which _is used by_ the `contains` method that the `get` of concise uses! This is likely why concise just flat out wouldn't complete the benchmarks when processing a higher number of rows, because every `get` is creating an iterator, skipping to the position, loop checking to see if the iterator contains the index or passes it, and then throwing it away. Adding the 'peekable' iterator to the benchmark had a similar outcome to using the plain iterator, ![difference-peekable-iterator-get](https://user-images.githubusercontent.com/1577461/68170721-c8d84400-ff25-11e9-9c10-47d5ebf40933.gif) but without nearly the overhead at high selectivity on dense bitmaps. ![difference-per-row-peekable-iterator-get](https://user-images.githubusercontent.com/1577461/68170735-d392d900-ff25-11e9-8476-7bb2728fe43b.gif) It's still worse, but not nearly as bad, no more than 60ns slower per row when processing a small number of rows than get, compared to over 2 microseconds for using the plain iterator. Peekable iterator better: ![difference-per-row-peekable-iterator-better](https://user-images.githubusercontent.com/1577461/68170780-f91fe280-ff25-11e9-86b8-6ff026d4813e.gif) Get better than peekable iterator: ![difference-per-row-peekable-iterator-get-better](https://user-images.githubusercontent.com/1577461/68170798-01781d80-ff26-11e9-9daf-ca95788bdb71.gif) Finally, to get a handle on the per row cost in nanoseconds of checking for nulls at all, I threw together these animations: get: ![get-per-row-breakdown](https://user-images.githubusercontent.com/1577461/68171443-a3006e80-ff28-11e9-87be-3894c0978e73.gif) iterator: ![iterator-per-row-breakdown](https://user-images.githubusercontent.com/1577461/68171452-a85db900-ff28-11e9-817a-833b1f64f823.gif) peekable iterator: ![peekable-iterator-per-row-breakdown](https://user-images.githubusercontent.com/1577461/68171456-ae539a00-ff28-11e9-999b-241ee39de895.gif) <hr> This PR has: - [x] been self-reviewed. - [x] added unit tests or modified existing tests to cover new code paths. <hr> ##### Key changed/added classes in this PR * `ImmutableBitmap` * `OurBar` * `TheirBaz`
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org