richardstartin opened a new pull request #6764: Consider using 
RoaringBitmapWriter for bitmap construction
URL: https://github.com/apache/incubator-druid/pull/6764
 
 
   I noticed you are investigating using RoaringBitmap 0.7.30. If you do so, it 
is worth considering a different mechanism to build your bitmaps which favours 
ordered insertions by buffering them into a container and appending to the 
bitmap as late as possible (just before the bitmap is queried, or when the next 
multiple of 2^16 is crossed). There is no need to manually run optimise bitmaps 
built this way, because they are run optimised at the container level whenever 
it is appended to the bitmap.
   
   I am not a druid user and am unlikely to become one soon, so this PR is 
intended as an FYI about the feature only. 
   
   I ran the `BitmapIterationBenchmark.constructAndIter` on JDK8 on Ubuntu 
16.0.4 at `7a09cde4de1953eee75c5033e863cfde8f94d6c1` and got:
   
   ```
   Benchmark                                     (bitmapAlgo)  (n)  (prob)   
(size)  Mode  Cnt          Score          Error  Units
   BitmapIterationBenchmark.constructAndIter           bitset  N/A     0.0  
1000000  avgt    5         11.958 ±        0.411  ns/op
   BitmapIterationBenchmark.constructAndIter           bitset  N/A   0.001  
1000000  avgt    5      55820.663 ±     4765.600  ns/op
   BitmapIterationBenchmark.constructAndIter           bitset  N/A     0.1  
1000000  avgt    5     853821.933 ±    10916.693  ns/op
   BitmapIterationBenchmark.constructAndIter           bitset  N/A     0.5  
1000000  avgt    5    3014089.931 ±    65283.409  ns/op
   BitmapIterationBenchmark.constructAndIter           bitset  N/A    0.99  
1000000  avgt    5    5628379.542 ±   227488.488  ns/op
   BitmapIterationBenchmark.constructAndIter           bitset  N/A     1.0  
1000000  avgt    5    5612304.605 ±    54199.692  ns/op
   BitmapIterationBenchmark.constructAndIter          concise  N/A     0.0  
1000000  avgt    5          8.073 ±        0.178  ns/op
   BitmapIterationBenchmark.constructAndIter          concise  N/A   0.001  
1000000  avgt    5      27473.710 ±      626.528  ns/op
   BitmapIterationBenchmark.constructAndIter          concise  N/A     0.1  
1000000  avgt    5    3635751.625 ±    56888.246  ns/op
   BitmapIterationBenchmark.constructAndIter          concise  N/A     0.5  
1000000  avgt    5    9798233.678 ±   237069.198  ns/op
   BitmapIterationBenchmark.constructAndIter          concise  N/A    0.99  
1000000  avgt    5    9588921.943 ±   214705.602  ns/op
   BitmapIterationBenchmark.constructAndIter          concise  N/A     1.0  
1000000  avgt    5    8077899.901 ±   118088.071  ns/op
   BitmapIterationBenchmark.constructAndIter          roaring  N/A     0.0  
1000000  avgt    5        131.791 ±        2.237  ns/op
   BitmapIterationBenchmark.constructAndIter          roaring  N/A   0.001  
1000000  avgt    5      46860.367 ±     1753.583  ns/op
   BitmapIterationBenchmark.constructAndIter          roaring  N/A     0.1  
1000000  avgt    5    1709465.928 ±    38854.875  ns/op
   BitmapIterationBenchmark.constructAndIter          roaring  N/A     0.5  
1000000  avgt    5    6898408.274 ±   210501.998  ns/op
   BitmapIterationBenchmark.constructAndIter          roaring  N/A    0.99  
1000000  avgt    5   13340397.558 ±   283832.841  ns/op
   BitmapIterationBenchmark.constructAndIter          roaring  N/A     1.0  
1000000  avgt    5   13415893.194 ±   170437.084  ns/op
   ```
   
   At `b313193c81ed868a9afe04c658306705f63daaef` I got:
   ```
   Benchmark                                  (bitmapAlgo)  (prob)   (size)  
Mode  Cnt         Score         Error  Units
   BitmapIterationBenchmark.constructAndIter        bitset     0.0  1000000  
avgt    5        12.665 ±       1.104  ns/op
   BitmapIterationBenchmark.constructAndIter        bitset   0.001  1000000  
avgt    5     74471.073 ±   54445.042  ns/op
   BitmapIterationBenchmark.constructAndIter        bitset     0.1  1000000  
avgt    5    887366.201 ±   35143.327  ns/op
   BitmapIterationBenchmark.constructAndIter        bitset     0.5  1000000  
avgt    5   3166248.403 ±  495669.632  ns/op
   BitmapIterationBenchmark.constructAndIter        bitset    0.99  1000000  
avgt    5   6324809.163 ± 1012027.080  ns/op
   BitmapIterationBenchmark.constructAndIter        bitset     1.0  1000000  
avgt    5   5913067.177 ±  132629.211  ns/op
   BitmapIterationBenchmark.constructAndIter       concise     0.0  1000000  
avgt    5         8.068 ±       0.115  ns/op
   BitmapIterationBenchmark.constructAndIter       concise   0.001  1000000  
avgt    5     27547.146 ±     546.018  ns/op
   BitmapIterationBenchmark.constructAndIter       concise     0.1  1000000  
avgt    5   3635772.683 ±   56079.798  ns/op
   BitmapIterationBenchmark.constructAndIter       concise     0.5  1000000  
avgt    5  10400194.474 ±  147495.368  ns/op
   BitmapIterationBenchmark.constructAndIter       concise    0.99  1000000  
avgt    5   9409295.891 ±  122748.484  ns/op
   BitmapIterationBenchmark.constructAndIter       concise     1.0  1000000  
avgt    5   8641773.847 ±  193212.416  ns/op
   BitmapIterationBenchmark.constructAndIter       roaring     0.0  1000000  
avgt    5        75.142 ±       1.408  ns/op
   BitmapIterationBenchmark.constructAndIter       roaring   0.001  1000000  
avgt    5     13348.323 ±     205.987  ns/op
   BitmapIterationBenchmark.constructAndIter       roaring     0.1  1000000  
avgt    5   1300962.745 ±   30107.110  ns/op
   BitmapIterationBenchmark.constructAndIter       roaring     0.5  1000000  
avgt    5   5255974.593 ±  133759.112  ns/op
   BitmapIterationBenchmark.constructAndIter       roaring    0.99  1000000  
avgt    5  11122438.742 ±  168793.197  ns/op
   BitmapIterationBenchmark.constructAndIter       roaring     1.0  1000000  
avgt    5  11555370.664 ±  141606.255  ns/op
   ```
   These quick results are quite noisy so require more careful consideration on 
your part. I am also unaware of how likely out of order insertions are in a 
typical druid workload, where there would be no penalty for using this 
abstraction, but there will be no benefit.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to