kfaraz opened a new issue #12022:
URL: https://github.com/apache/druid/issues/12022


   ### Motivation
   
   The existing implementation in `OnHeapIncrementalIndex` tends to 
over-estimate memory usage, leading to more persist cycles than necessary 
during ingestion. A more accurate estimate of memory usage would also free up 
heap for other purposes.
   
   The recent changes in #11950 made improvements by adding a 
`guessAggregatorHeapFootprint()` method to the `Aggregator` interface. This 
proposal addresses the same problem, but advocates replacing the guess 
mechanism with the actual incremental memory used by an aggregator.
   
   ### Proposed changes
   
   - Update the method `Aggregator.aggregate()` to return a `long` instead of 
`void`. The returned long would represent the incremental memory in bytes used 
by the aggregator in that particular invocation of the method.
   - Remove the method `DimensionIndexer.estimateEncodedKeyComponentSize()`
   - Update the method `DimensionIndexer.getUnsortedEncodedValueFromSorted()` 
to return objects of a new generic class `EncodedDimensionValue<EncodedType>` 
which contains:
      - `EncodedType value`: e.g. `int[]` for `StringDimIndexer`, `Long` for 
`LongDimIndexer`
      - `long incrementalSize`: The delta in size required for the `value`. For 
numeric values, e.g. in `LongDimensionIndexer`, it would just be the size of 
the datatype. But for `StringDimensionIndexer`, which returns an encoded 
`int[]`, it would represent the size of the array plus any _new_ dimension 
value encoded into the dictionary in this invocation. Simply put, 
`getUnsortedEncodedValueFromSorted()` would now return a payload along with the 
memory required for that payload.
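
   The proposed signatures could be sketched roughly as follows. The 
`EncodedDimensionValue` name comes from this proposal; everything else here is 
illustrative, not the actual Druid interfaces:

```java
// Sketch of the proposed contract changes (illustrative, not Druid's code).

/** Proposed: aggregate() reports the incremental heap bytes used by this call. */
interface Aggregator
{
  long aggregate();
}

/**
 * Proposed return type for getUnsortedEncodedValueFromSorted(): the encoded
 * value plus the heap delta required to store it.
 */
class EncodedDimensionValue<EncodedType>
{
  final EncodedType value;
  final long incrementalSize;

  EncodedDimensionValue(EncodedType value, long incrementalSize)
  {
    this.value = value;
    this.incrementalSize = incrementalSize;
  }
}
```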
   
   ### Rationale
   
   A row's size is estimated as:
   `row size = aggregator size + dims key size + overhead`
   
   __Aggregator Size__
   Currently, we compute the _max_ size for the aggregator and use it for every 
row, which overestimates the actual memory usage. With the proposed change, we 
would use the actual footprint of each row rather than the max, giving more 
accurate estimates.
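
   For instance, a long-sum aggregator has a fixed footprint, so under the 
proposed contract it could report that footprint on the first call and `0` 
afterwards. The class below is a hypothetical sketch (not Druid's actual 
`LongSumAggregator`) that ignores object-header overhead:

```java
// Hypothetical long-sum aggregator under the proposed contract: the fixed
// on-heap footprint is reported once, on the first aggregate() call; later
// calls return 0 because a long sum grows no further.
class LongSumAggregator
{
  private static final long FOOTPRINT = Long.BYTES; // object overhead omitted

  private long sum = 0;
  private boolean footprintReported = false;

  /** Aggregates the value and returns the incremental heap bytes used. */
  long aggregate(long value)
  {
    sum += value;
    if (footprintReported) {
      return 0L;
    }
    footprintReported = true;
    return FOOTPRINT;
  }

  long get()
  {
    return sum;
  }
}
```

   Summed over all rows, this yields the aggregator's true footprint instead of 
`numRows * maxSize`.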
   
   __Dims Key Size__
   The estimation of the dimension key size already takes into account the 
current row and not the max size. But here, the over-estimates are caused by 
repeatedly adding the footprint of the same String values (especially in the 
case of multi-valued dimensions), which are in fact stored _only once_ in the 
dictionary; only the integer codes are present in every row. With the proposed 
change, we would add the footprint of a String value only once, when it is 
newly added to the dictionary.
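
   A minimal sketch of this dictionary accounting for strings; the class name, 
the `encode()` helper, and the size constants are all illustrative (not Druid's 
actual dictionary), but they show how a repeated value is charged only for its 
integer code:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative per-row accounting: a String's footprint is charged only the
// first time it enters the dictionary; repeats cost only the int code.
class StringDictionary
{
  private final Map<String, Integer> codes = new HashMap<>();

  /** Returns the dictionary code and adds the incremental size to sizeHolder[0]. */
  int encode(String value, long[] sizeHolder)
  {
    Integer existing = codes.get(value);
    if (existing != null) {
      sizeHolder[0] += Integer.BYTES;  // only the code is stored in the row
      return existing;
    }
    int code = codes.size();
    codes.put(value, code);
    sizeHolder[0] += Integer.BYTES + estimateStringSize(value);
    return code;
  }

  private static long estimateStringSize(String s)
  {
    return 16 + 2L * s.length();       // rough heap estimate for a Java String
  }
}
```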
           
   ### Backwards Compatibility
   
   Neither of the changes mentioned above would be backwards compatible.
   
   __Aggregator Size__
   Any class implementing the `Aggregator` interface would need to be fixed to 
return a `long` instead of `void`.
   
   Workaround (Rejected):
   To retain compatibility, we could add a new `default long 
aggregateAndEstimateMemory()` method and leave the existing `aggregate()` 
method as is. The default implementation would return a negative value (say -1), 
in which case the `OnHeapIncrementalIndex` or any other estimator would fall 
back to the max estimated size for that invocation.
   
   But this approach would be hacky and problematic in the long run, as callers 
of the `Aggregator` would be free to call either of the two aggregate methods, 
producing widely different and erroneous memory estimates.
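
   For reference, the rejected workaround would look roughly like this sketch 
(the fallback wiring in the caller is omitted):

```java
// Rejected workaround, sketched: a default method keeps the old aggregate()
// signature intact and returns -1 to signal "no estimate available", in which
// case the caller falls back to the max estimated size.
interface Aggregator
{
  void aggregate();

  default long aggregateAndEstimateMemory()
  {
    aggregate();
    return -1L; // caller must fall back to the max estimated size
  }
}
```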
   
   __Dims Key Size__
   Any class implementing the `DimensionIndexer` would have to be fixed. (This 
change is less of a compatibility concern as compared to the `Aggregator` 
change as the `DimensionIndexer` has fewer implementations)
   
   Workaround (Rejected):
   To retain compatibility, we could retain 
`DimensionIndexer.estimateEncodedKeyComponentSize()`, which would account for a 
String value only when it is newly encountered. But this would require 
maintaining two dictionaries: the first used by 
`getUnsortedEncodedValueFromSorted()` to track encoded String values, and the 
second used by `estimateEncodedKeyComponentSize()` to track estimated String 
values.
   
   This workaround would introduce unnecessary complexity and overhead of 
maintaining two dictionaries.
   
   ### Operational impact
   
   - All aggregator implementations in extensions would have to be updated.
   - Rolling Upgrade: Not affected
   
   ### Future Work (optional)
   
   The memory estimate values returned by the updated `aggregate()` method 
could be used by other callers (such as `TimeseriesQueryEngine`) to estimate 
memory if required.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


