leventov opened a new issue #8354: Explain the processing buffer sizing in 
documentation
URL: https://github.com/apache/incubator-druid/issues/8354
 
 
   Currently, [the documentation just 
says](https://github.com/apache/incubator-druid/blob/d00747774208dbbfcb272ee7d1c30cf879887838/docs/operations/basic-cluster-tuning.md)
 "`druid.processing.buffer.sizeBytes` can be set to 500MB." and "A size between 
500MB and 1GB is a reasonable choice for general use." (BTW, this information 
is repeated three times on the same page; we should probably also restructure 
it somehow.)
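   
   For context, this is the property the guideline refers to, as it would appear in a service's `runtime.properties` (the value here is just the 500MB figure from the docs, not a recommendation):
   
   ```properties
   # Per-processing-thread buffer for intermediate query results
   # (Historical / MiddleManager runtime.properties)
   druid.processing.buffer.sizeBytes=500000000
   ```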
   
   This guideline lacks an explanation of why 500MB is desirable and when 
somebody would want to configure even larger buffers. For example, if somebody 
doesn't run groupBy queries at all, 500MB may be much more than is ever needed 
for other types of queries like topN and timeseries, *unless* they use large 
complex aggregators (like histograms).
   
   So the doc should explain the relation between the cardinalities of the 
columns involved in the query, the aggregation size, and the recommended 
processing buffer size, as well as what happens if the processing buffers 
are undersized or oversized.
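   
   To illustrate the kind of relation the doc could spell out, here is a back-of-envelope sketch (this is NOT Druid's actual allocation logic, just an illustration of why cardinality and aggregator size matter):
   
   ```python
   # Hypothetical rough estimate: a groupBy pass needs buffer space
   # proportional to the number of distinct grouping keys, where each
   # entry holds the key plus one slot per aggregator.
   def estimate_buffer_bytes(key_cardinality, key_bytes, agg_bytes_each, num_aggs):
       """Rough lower bound on processing buffer size for one groupBy pass."""
       row_bytes = key_bytes + num_aggs * agg_bytes_each
       return key_cardinality * row_bytes

   # Example: 1M distinct keys, 16-byte keys, two 8-byte long-sum aggregators
   # -> ~32MB, far below 500MB; a large sketch/histogram aggregator
   # (hundreds or thousands of bytes per entry) changes this dramatically.
   needed = estimate_buffer_bytes(1_000_000, 16, 8, 2)
   ```
   
   A worked example like this in the docs would let operators reason about their own cardinalities and aggregators instead of applying a flat 500MB rule.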

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
