jihoonson opened a new pull request #11597:
URL: https://github.com/apache/druid/pull/11597


   ### Description
   
   This PR improves the performance of the segments table by 1) exploiting 
pre-sorted data structures and 2) caching results to avoid repeating 
expensive operations. For 1), when published segments caching is enabled, 
the cached published segments in `MetadataSegmentView` are sorted in the same 
order used to sort the available segments in `DruidSchema`. Because both sets 
are sorted in the same order, the query engine can now merge published and 
available segments with a sorted merge when a query scans the segments table. 
The previous hash-based merge is still used when caching is disabled. Also, 
`druid.sql.planner.forceHashBasedMergeForSegmentsTable` is added to force the 
hash-based merge even when caching is enabled. It is off by default, so the 
sorted merge is used by default when caching is enabled.
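
To illustrate the idea behind 1), here is a minimal sketch of a single-pass merge over two inputs sorted in the same order. It is not the code in this PR: `SortedSegmentMerge`, `Row`, and the use of plain `String` ids are hypothetical simplifications, and the sketch assumes each input is sorted by natural `String` order and has no duplicates.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: names and types here are hypothetical, not taken from this PR. */
final class SortedSegmentMerge
{
  /** One output row: a segment id plus flags for which input(s) contained it. */
  static final class Row
  {
    final String segmentId;
    final boolean published;
    final boolean available;

    Row(String segmentId, boolean published, boolean available)
    {
      this.segmentId = segmentId;
      this.published = published;
      this.available = available;
    }
  }

  /**
   * Merges two id lists in one pass. Assumes both lists are sorted by
   * natural String order and contain no duplicates.
   */
  static List<Row> merge(List<String> published, List<String> available)
  {
    List<Row> out = new ArrayList<>(published.size() + available.size());
    int i = 0;
    int j = 0;
    while (i < published.size() && j < available.size()) {
      int cmp = published.get(i).compareTo(available.get(j));
      if (cmp == 0) {
        // Same segment on both sides: emit one combined row.
        out.add(new Row(published.get(i), true, true));
        i++;
        j++;
      } else if (cmp < 0) {
        out.add(new Row(published.get(i++), true, false));
      } else {
        out.add(new Row(available.get(j++), false, true));
      }
    }
    // Drain whichever side still has elements.
    while (i < published.size()) {
      out.add(new Row(published.get(i++), true, false));
    }
    while (j < available.size()) {
      out.add(new Row(available.get(j++), false, true));
    }
    return out;
  }
}
```

Unlike a hash-based merge, this needs no intermediate map keyed by segment id. It only walks both inputs once, which is why it depends on the two sides sharing a sort order.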
   
   For 2), scanning the segments table requires converting timestamps to 
strings and serializing other non-primitive types to JSON strings. Since string 
conversion and JSON serialization are expensive, a cache is added to avoid 
repeating them for the same object. The cache can be GCed once the scan is 
finished.
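
The caching in 2) can be pictured as a small memoizing wrapper that lives only for the duration of one scan. This is a sketch under assumptions, not the class added in this PR: `ConversionCache` and its methods are hypothetical names, the converter is assumed to never return null, and keys are assumed to have value-based `equals`/`hashCode`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative only: a per-scan memoizer for an expensive object-to-string conversion. */
final class ConversionCache<T>
{
  private final Function<T, String> converter;
  private final Map<T, String> cache = new HashMap<>();
  private int conversions = 0;

  ConversionCache(Function<T, String> converter)
  {
    this.converter = converter;
  }

  /** Returns the cached string for {@code value}, converting only on the first request. */
  String get(T value)
  {
    String result = cache.get(value);
    if (result == null) {
      // Assumes the converter never returns null, so null means "not cached yet".
      result = converter.apply(value);
      conversions++;
      cache.put(value, result);
    }
    return result;
  }

  /** Number of times the underlying converter actually ran. */
  int conversions()
  {
    return conversions;
  }
}
```

One such cache would be created per converted column at the start of a scan, for example `new ConversionCache<>(java.time.Instant::toString)`, and dropped with the scan so that it becomes eligible for GC. The hit rate depends on how many rows share equal values.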
   
   The benchmark results are below. They might be a bit biased since all 
segments had the same metadata, which is the best case for the technique used 
for 2). In practice, since segments can have different segment granularities, 
dimensions, and metrics, the performance gain could be lower than what is 
shown here.
   
   ```
   master
   
    Benchmark                            (availableSegmentsInterval)  (forceHashBasedMerge)  (numSegmentsPerInterval)  (publishedSegmentsInterval)  (segmentGranularity)  (sql)  Mode  Cnt     Score    Error  Units
    SystemSchemaBenchmark.segmentsTable               2021-01-01/P3Y                   true                        10               2021-01-02/P3Y                   DAY      0  avgt   10    30.596 ±  0.399  ms/op
    SystemSchemaBenchmark.segmentsTable               2021-01-01/P3Y                   true                        10               2021-01-02/P3Y                   DAY      1  avgt   10    39.703 ±  0.238  ms/op
    SystemSchemaBenchmark.segmentsTable               2021-01-01/P3Y                   true                       100               2021-01-02/P3Y                   DAY      0  avgt   10   382.867 ±  6.577  ms/op
    SystemSchemaBenchmark.segmentsTable               2021-01-01/P3Y                   true                       100               2021-01-02/P3Y                   DAY      1  avgt   10   387.911 ±  3.867  ms/op
    SystemSchemaBenchmark.segmentsTable               2021-01-01/P3Y                   true                      1000               2021-01-02/P3Y                   DAY      0  avgt   10  3971.577 ± 70.304  ms/op
    SystemSchemaBenchmark.segmentsTable               2021-01-01/P3Y                   true                      1000               2021-01-02/P3Y                   DAY      1  avgt   10  5334.702 ± 44.773  ms/op
   ```
   
   ```
   PR
   
    Benchmark                             (availableSegmentsInterval)  (forceHashBasedMerge)  (numSegmentsPerInterval)  (publishedSegmentsInterval)  (segmentGranularity)  (sql)  Mode  Cnt     Score    Error  Units
    SegmentsTableBenchmark.segmentsTable               2021-01-01/P3Y                   true                        10               2021-01-02/P3Y                   DAY      0  avgt   10    14.914 ±  0.305  ms/op
    SegmentsTableBenchmark.segmentsTable               2021-01-01/P3Y                   true                        10               2021-01-02/P3Y                   DAY      1  avgt   10    21.591 ±  0.139  ms/op
    SegmentsTableBenchmark.segmentsTable               2021-01-01/P3Y                   true                       100               2021-01-02/P3Y                   DAY      0  avgt   10   199.416 ±  2.208  ms/op
    SegmentsTableBenchmark.segmentsTable               2021-01-01/P3Y                   true                       100               2021-01-02/P3Y                   DAY      1  avgt   10   203.366 ±  4.706  ms/op
    SegmentsTableBenchmark.segmentsTable               2021-01-01/P3Y                   true                      1000               2021-01-02/P3Y                   DAY      0  avgt   10  2110.735 ± 65.326  ms/op
    SegmentsTableBenchmark.segmentsTable               2021-01-01/P3Y                   true                      1000               2021-01-02/P3Y                   DAY      1  avgt   10  2615.910 ± 69.914  ms/op
    SegmentsTableBenchmark.segmentsTable               2021-01-01/P3Y                  false                        10               2021-01-02/P3Y                   DAY      0  avgt   10    11.967 ±  0.308  ms/op
    SegmentsTableBenchmark.segmentsTable               2021-01-01/P3Y                  false                        10               2021-01-02/P3Y                   DAY      1  avgt   10    18.917 ±  0.220  ms/op
    SegmentsTableBenchmark.segmentsTable               2021-01-01/P3Y                  false                       100               2021-01-02/P3Y                   DAY      0  avgt   10   123.630 ±  0.839  ms/op
    SegmentsTableBenchmark.segmentsTable               2021-01-01/P3Y                  false                       100               2021-01-02/P3Y                   DAY      1  avgt   10   132.013 ±  0.664  ms/op
    SegmentsTableBenchmark.segmentsTable               2021-01-01/P3Y                  false                      1000               2021-01-02/P3Y                   DAY      0  avgt   10  1303.304 ± 40.063  ms/op
    SegmentsTableBenchmark.segmentsTable               2021-01-01/P3Y                  false                      1000               2021-01-02/P3Y                   DAY      1  avgt   10  1799.447 ± 23.659  ms/op
   ```
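
For a quick read of the numbers above, the speedup per row is the master score divided by the PR score. The sketch below only does that division on the reported averages. `SegmentsTableSpeedup` is a hypothetical helper that is not part of this PR or the benchmark, and it ignores the reported error margins.

```java
/** Illustrative only: divides the reported master scores by the reported PR scores. */
final class SegmentsTableSpeedup
{
  // Average scores in ms/op, copied from the results above in row order
  // (numSegmentsPerInterval = 10, 10, 100, 100, 1000, 1000; sql = 0, 1 alternating).
  static final double[] MASTER = {30.596, 39.703, 382.867, 387.911, 3971.577, 5334.702};
  static final double[] PR_HASH = {14.914, 21.591, 199.416, 203.366, 2110.735, 2615.910};
  static final double[] PR_SORTED = {11.967, 18.917, 123.630, 132.013, 1303.304, 1799.447};

  /** Returns baseline[i] / candidate[i] for every row. */
  static double[] speedups(double[] baseline, double[] candidate)
  {
    if (baseline.length != candidate.length) {
      throw new IllegalArgumentException("row counts differ");
    }
    double[] result = new double[baseline.length];
    for (int i = 0; i < baseline.length; i++) {
      result[i] = baseline[i] / candidate[i];
    }
    return result;
  }

  public static void main(String[] args)
  {
    double[] hash = speedups(MASTER, PR_HASH);
    double[] sorted = speedups(MASTER, PR_SORTED);
    for (int i = 0; i < MASTER.length; i++) {
      System.out.printf("row %d: hash-based %.2fx, sorted merge %.2fx%n", i, hash[i], sorted[i]);
    }
  }
}
```

On these rows the ratios come out to about 1.8x to 2.1x with the hash-based merge forced (caching alone) and about 2.1x to 3.1x with the sorted merge, subject to the bias caveat noted earlier.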
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * `MetadataSegmentView`
    * `SegmentsTableRow`
    * `SystemSchema`
   
   <hr>
   
   <!-- Check the items by putting "x" in the brackets for the done things. Not 
all of these items apply to every PR. Remove the items which are not done or 
not relevant to the PR. None of the items from the checklist below are strictly 
necessary, but it would be very helpful if you at least self-review the PR. -->
   
   This PR has:
   - [x] been self-reviewed.
      - [ ] using the [concurrency 
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
 (Remove this item if the PR doesn't have any relation to concurrency.)
   - [x] added documentation for new or modified features or behaviors.
   - [x] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in 
[licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [x] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


