[GitHub] leerho commented on issue #6743: IncrementalIndex generally overestimates theta sketch size

GitBox Wed, 19 Dec 2018 17:06:24 -0800

leerho commented on issue #6743: IncrementalIndex generally overestimates theta
sketch size
URL:
https://github.com/apache/incubator-druid/issues/6743#issuecomment-448816707

@gianm

I don't have a deep understanding of the inner workings of the memory
allocation strategy in Druid, but I should point out that the current model of
allocating equal sized slots in a Buffer where each slot is the maximum
possible size of a sketch Is very likely a horrible waste of memory space.

Sketches, are designed to start very small in their memory requirements and
then grow gradually as the size of the input stream grows. Unique counting
sketches (Theta, HLL, CPC) actually reach a maximum size, beyond which they do
not grow at all. Quantile sketches can continue to grow but very very slowly
(sub-logarithmically with the size of the input stream).

For example, a Theta Sketch configured with LgK = 14 (~16K) has a maximum
size of about 256KB. However, with only one entry it consumes only 16 Bytes.

Our experience in using sketches in large systems is that the size of the
input streams that feed the sketches tend to follow a power-law distribution.
Which means given a system total number of sketches of one million, the vast
majority of sketches will have very few entries, and just a few sketches will
be full size.

> Note: If the sketches ingested into Druid are pre-aggregated sectors, for
example, this power-law distribution in space required for the sketches is not
as strong, as most of the sketches will already be at max size. But in this
case you are dealing with far fewer sketches, so the memory management concern
may not be as severe.

To illustrate this power-law effect, I have created a toy model of space
utilization in the attached excel spreadsheet:

[DruidModel_pwrLaw.xlsx](https://github.com/apache/incubator-druid/files/2696971/DruidModel_pwrLaw.xlsx)

For this model I assumed a power-law distribution of stream sizes from one
unique to 134M uniques and a configured sketch size LgK = 14. I adjusted the
parameters so that the total number of sketches in the system is about one
million.

With these assumptions, the actual storage bytes required by all of the
sketches is about 440MB. However, allocating the full size of 256KB for every
sketch would require nearly 35GB. This results in a space utilization of about
1.27%! This is a huge waste of memory!

***
What I would recommend is that if we could work together, we could come up
with a much more efficient memory management model for sketches in Druid that
would allow you to recapture most if not all of that wasted space. This will
likely require some changes in Druid as well as a change in how sketches use
and allocate memory.

One idea I have is to allow each sketch to self-allocate the space it needs
as it grows. The sketch would be responsible for allocating new space and
deallocating old space (especially for off-heap).
What would live in the equal-sized slots of the Druid buffer would be a
small, fixed sized "wrapper" with a pointer and a few other values, where the
data structure of the sketch would be in an off-heap space managed by the
sketch itself. This means that Druid memory manager would have to allow for
some amount of off-heap space that it is not controlling. In the toy model
above, that would be about 440MB of off-heap space. The benefit is that Druid
would get back nearly 35GB of off-heap space for other uses. It seems like a
big win to me, but I don't understand what changes would have to happen in
Druid to make this work.

We can certainly do the sketch work, but you (or someone deep into Druid)
would have to make the appropriate changes in Druid. One important aspect of
the Druid work would be to provide some simple tools to analyze the actual
distributions of stream sizes, sketch sizes and space utilizations before and
after the change. This meta-data would be invaluable feedback for us to
understand specific Druid customers' situations, and help us tune the system
for even better performance going forward.

What do you think?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] leerho commented on issue #6743: IncrementalIndex generally overestimates theta sketch size

Reply via email to