[DISCUSSION] Distributed Index Cache Server

Kunal Kapoor Tue, 05 Feb 2019 03:14:25 -0800

Hi All,

Carbon currently caches all block/blocklet datamap index information in
the driver. For bloom-type datamaps, it can instead prune the splits in a
distributed way using distributed datamap pruning. The first approach is
limited by how far driver memory can scale up, and one driver's cache
cannot be reused by other drivers. The second approach gives no guarantee
that the next query goes to the same executor, so the cache may not be
reused.



Given these problems, there is a need for a centralised index cache
server.


Please find the link to the design document below.


https://docs.google.com/document/d/161NXxrKLPucIExkWip5mX00x2iOPH6bvsuQnCzzp47E/edit?ts=5c542ab4#heading=h.x0qaehgkncz5



Thanks

Kunal Kapoor
