Manish Gupta created CARBONDATA-2638:
----------------------------------------
Summary: Implement driver min max caching for specified columns
and segregate block and blocklet cache
Key: CARBONDATA-2638
URL: https://issues.apache.org/jira/browse/CARBONDATA-2638
Project: CarbonData
Issue Type: New Feature
Reporter: Manish Gupta
Assignee: Manish Gupta
Attachments: Driver_Block_Cache.docx
*Background*
The current implementation of Blocklet dataMap caching in the driver caches
the min and max values of all columns in the schema by default.
*Problem*
The problem with this implementation is that as the number of loads increases,
the memory required to hold the min and max values also increases considerably.
In most scenarios there is a single driver, and the memory configured for the
driver is less than that configured for an executor. As the memory requirement
keeps growing, the driver can eventually run out of memory, which makes the
situation even worse.
*Solution*
1. Cache only the required columns in the driver
2. Segregate the block level and Blocklet level caches
For more details, please check the attached document.
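
As a rough illustration of the two points above, here is a minimal sketch and not CarbonData code. The class MinMaxCacheSketch and all of its methods are hypothetical names invented for this example. It assumes that min/max values are plain longs, that the columns to cache are given by name, and that a block level entry is the element-wise min/max over the block's blocklets; the attached document may define the design differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch, not CarbonData code: min/max cache pruned to chosen columns. */
final class MinMaxCacheSketch {
    // Schema ordinals of the columns whose min/max values are kept in the driver.
    private final int[] cachedOrdinals;
    // Each entry is {min[], max[]} over the cached columns only.
    private final List<long[][]> entries = new ArrayList<>();

    MinMaxCacheSketch(List<String> schema, Set<String> columnsToCache) {
        List<Integer> ordinals = new ArrayList<>();
        for (int i = 0; i < schema.size(); i++) {
            if (columnsToCache.contains(schema.get(i))) {
                ordinals.add(i);
            }
        }
        cachedOrdinals = ordinals.stream().mapToInt(Integer::intValue).toArray();
    }

    /** Keeps only the cached columns of a full-width min or max array. */
    private long[] project(long[] full) {
        long[] pruned = new long[cachedOrdinals.length];
        for (int i = 0; i < cachedOrdinals.length; i++) {
            pruned[i] = full[cachedOrdinals[i]];
        }
        return pruned;
    }

    /** Blocklet level caching: one pruned entry per blocklet. */
    void addBlocklet(long[] fullMin, long[] fullMax) {
        entries.add(new long[][] {project(fullMin), project(fullMax)});
    }

    /** Block level caching: fold all blocklets of one block into a single entry. */
    void addBlock(List<long[][]> blocklets) {
        long[] min = project(blocklets.get(0)[0]);
        long[] max = project(blocklets.get(0)[1]);
        for (long[][] blocklet : blocklets.subList(1, blocklets.size())) {
            long[] bMin = project(blocklet[0]);
            long[] bMax = project(blocklet[1]);
            for (int i = 0; i < min.length; i++) {
                min[i] = Math.min(min[i], bMin[i]);
                max[i] = Math.max(max[i], bMax[i]);
            }
        }
        entries.add(new long[][] {min, max});
    }

    int entryCount() {
        return entries.size();
    }

    long[][] entry(int index) {
        return entries.get(index);
    }

    /** Number of long values held in memory: entries x cached columns x 2. */
    int cachedValueCount() {
        return entries.size() * cachedOrdinals.length * 2;
    }
}
```

In this sketch the memory held by the driver grows with the number of entries times the number of cached columns, so caching fewer columns and keeping one entry per block instead of one per blocklet both reduce it, at the cost of coarser pruning.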