Manish Gupta created CARBONDATA-2638:
----------------------------------------

             Summary: Implement driver min max caching for specified columns 
and segregate block and blocklet cache
                 Key: CARBONDATA-2638
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2638
             Project: CarbonData
          Issue Type: New Feature
            Reporter: Manish Gupta
            Assignee: Manish Gupta
         Attachments: Driver_Block_Cache.docx

*Background*

The current implementation of Blocklet dataMap caching in the driver caches the 
min and max values of all the columns in the schema by default. 

*Problem*
The problem with this implementation is that as the number of loads increases, 
the memory required to hold the min and max values also increases considerably. 
In most scenarios there is a single driver, and the memory configured for the 
driver is smaller than that configured for an executor. As the memory 
requirement keeps growing, the driver can eventually run out of memory, which 
makes the situation even worse.

*Solution*

1. Cache only the required columns in Driver

2. Segregation of block and Blocklet level cache

For more details, please check the attached document.
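
The two points of the proposed solution can be illustrated with a minimal 
sketch. This is not CarbonData code and does not reflect the design in the 
attached document: the names BlockletStats, build_blocklet_cache, 
build_block_cache and may_contain, as well as the sample data, are all 
hypothetical. It assumes integer columns and only shows the idea: keep min/max 
for the selected columns only, and keep either one entry per blocklet or one 
merged entry per block.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

MinMax = Tuple[int, int]


@dataclass
class BlockletStats:
    """Hypothetical min/max statistics read from one blocklet of one block."""
    block_id: str
    blocklet_id: int
    min_max: Dict[str, MinMax]  # column name -> (min, max), for ALL columns


def build_blocklet_cache(stats: Iterable[BlockletStats],
                         cached_columns: List[str]) -> Dict[Tuple[str, int], Dict[str, MinMax]]:
    """Blocklet-level cache: one entry per blocklet, selected columns only."""
    return {
        (s.block_id, s.blocklet_id): {
            c: s.min_max[c] for c in cached_columns if c in s.min_max
        }
        for s in stats
    }


def build_block_cache(stats: Iterable[BlockletStats],
                      cached_columns: List[str]) -> Dict[str, Dict[str, MinMax]]:
    """Block-level cache: one entry per block, merging its blocklets' ranges."""
    cache: Dict[str, Dict[str, MinMax]] = {}
    for s in stats:
        entry = cache.setdefault(s.block_id, {})
        for c in cached_columns:
            if c not in s.min_max:
                continue
            lo, hi = s.min_max[c]
            if c in entry:
                cur_lo, cur_hi = entry[c]
                lo, hi = min(lo, cur_lo), max(hi, cur_hi)
            entry[c] = (lo, hi)
    return cache


def may_contain(entry: Dict[str, MinMax], column: str, value: int) -> bool:
    """Pruning check. A column that is not cached can never be pruned on."""
    if column not in entry:
        return True
    lo, hi = entry[column]
    return lo <= value <= hi


# Hypothetical sample: two blocks, three blocklets, three columns each.
STATS = [
    BlockletStats("block_0", 0, {"id": (1, 100), "age": (18, 40), "score": (0, 50)}),
    BlockletStats("block_0", 1, {"id": (101, 200), "age": (25, 60), "score": (10, 90)}),
    BlockletStats("block_1", 0, {"id": (201, 300), "age": (30, 35), "score": (5, 20)}),
]

# Cache only the "id" column instead of all three columns.
block_cache = build_block_cache(STATS, ["id"])
blocklet_cache = build_blocklet_cache(STATS, ["id"])
```

In this sketch the block-level cache holds two entries where the blocklet-level 
cache holds three, and each entry stores one column instead of three. The 
trade-off is that a filter on a column that is not cached (here "age" or 
"score") cannot be pruned in the driver.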



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
