[ 
https://issues.apache.org/jira/browse/CARBONDATA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal updated CARBONDATA-3006:
-------------------------------------
    Summary: Carbon Store Size Optimization and Query Performance Improvement  
(was: Carbon Store Size Optimization and Scan Query Performance Improvement)

> Carbon Store Size Optimization and Query Performance Improvement
> ----------------------------------------------------------------
>
>                 Key: CARBONDATA-3006
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3006
>             Project: CarbonData
>          Issue Type: Improvement
>            Reporter: kumar vishal
>            Priority: Major
>
> *String/Varchar Datatype Store Size Optimization:*
> Currently the length of each String/Varchar value is stored as a Short/Int, 
> which increases the store size. To reduce the store size, adaptive encoding 
> is applied to the length part irrespective of String/Varchar type, so there 
> is no separate handling for the String/Varchar datatypes during processing.
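
A minimal sketch of the idea, not CarbonData's actual encoder: choose the narrowest integer width that can hold every length in a page and store the lengths at that width. The class name, the method names, and the one-byte width header are assumptions made for illustration only.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: adaptive encoding of the length part of a page.
final class AdaptiveLengthSketch {

  // Smallest width in bytes (1, 2 or 4) that can hold every non-negative length.
  static int widthFor(int[] lengths) {
    int max = 0;
    for (int len : lengths) {
      max = Math.max(max, len);
    }
    if (max <= Byte.MAX_VALUE) {
      return 1;
    }
    if (max <= Short.MAX_VALUE) {
      return 2;
    }
    return 4;
  }

  // Assumed layout: [1-byte width][length 0][length 1]... at the chosen width.
  static byte[] encode(int[] lengths) {
    int width = widthFor(lengths);
    ByteBuffer buf = ByteBuffer.allocate(1 + width * lengths.length);
    buf.put((byte) width);
    for (int len : lengths) {
      if (width == 1) {
        buf.put((byte) len);
      } else if (width == 2) {
        buf.putShort((short) len);
      } else {
        buf.putInt(len);
      }
    }
    return buf.array();
  }

  // Reads the width header, then widens every stored length back to int.
  static int[] decode(byte[] encoded) {
    ByteBuffer buf = ByteBuffer.wrap(encoded);
    int width = buf.get();
    int count = (encoded.length - 1) / width;
    int[] out = new int[count];
    for (int i = 0; i < count; i++) {
      if (width == 1) {
        out[i] = buf.get();
      } else if (width == 2) {
        out[i] = buf.getShort();
      } else {
        out[i] = buf.getInt();
      }
    }
    return out;
  }
}
```

With this layout a page whose longest value is 100 bytes spends one byte per length, regardless of whether the column is declared String or Varchar.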
> *String/Varchar datatype query processing optimization:*
> Currently, to process String/Varchar data during a query, the offset 
> (position of the data) is calculated and the data is fetched based on that 
> position. This causes many cache line misses, which degrades query 
> performance.
> To handle this, for full scan queries with no inverted index the data is 
> fetched linearly to avoid cache line misses.
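
The two access patterns can be sketched as follows. This is an illustration under assumptions, not the code from the PR: the page layout (a one-byte length followed by the value bytes, back to back) and all names are hypothetical. The offset-based reader first materializes a position per row and then jumps to it; the linear reader walks the page once with a single cursor.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: offset-based versus linear reading of length-value data.
final class LinearScanSketch {

  // Assumed layout: [1-byte length][value bytes] repeated for every row.
  static byte[] pack(String[] values) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (String value : values) {
      byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
      if (bytes.length > 255) {
        throw new IllegalArgumentException("value too long for this sketch");
      }
      out.write(bytes.length);
      out.write(bytes, 0, bytes.length);
    }
    return out.toByteArray();
  }

  // Offset-based: compute every row position first, then fetch by position.
  static String[] readViaOffsets(byte[] page, int rowCount) {
    int[] offsets = new int[rowCount];
    int pos = 0;
    for (int i = 0; i < rowCount; i++) {
      offsets[i] = pos;
      pos += 1 + (page[pos] & 0xFF);
    }
    String[] out = new String[rowCount];
    for (int i = 0; i < rowCount; i++) {
      int off = offsets[i];
      int len = page[off] & 0xFF;
      out[i] = new String(page, off + 1, len, StandardCharsets.UTF_8);
    }
    return out;
  }

  // Linear: one pass, one cursor, no intermediate offset array.
  static String[] readLinear(byte[] page, int rowCount) {
    String[] out = new String[rowCount];
    int pos = 0;
    for (int i = 0; i < rowCount; i++) {
      int len = page[pos] & 0xFF;
      out[i] = new String(page, pos + 1, len, StandardCharsets.UTF_8);
      pos += 1 + len;
    }
    return out;
  }
}
```

Both readers return the same rows. The sketch only contrasts the access patterns and does not measure cache behaviour; the linear form applies when rows are consumed in stored order, which is the full scan case with no inverted index.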
> *Adaptive encoding for Global/Direct/Local dictionary columns*
> Currently Global/Direct/Local dictionary values are stored in binary format 
> and only snappy is applied for compression. As Global/Direct/Local dictionary 
> values are of Integer data type, they can be stored adaptively with a data 
> type smaller than Integer.
> Adaptive encoding is added for global/direct dictionary columns to reduce the 
> store size.
> *Method In-lining Optimization*
> The JIT will inline a method if its size is less than 325 bytes of bytecode 
> and it is called more than 10K times (default values). If a method is private 
> or static, it is easier for the JIT to inline because no type check is 
> required; for a protected/public method a type check adds overhead, and 
> because of this the call does not behave like an inlined one.
> Because of this, some refactoring is done for primitive no-dictionary data 
> type columns. Earlier, ColumnPageWrapper.java handled query processing for 
> all primitive no-dictionary data type columns. In this PR, separate classes 
> are created to handle each data type: all the hot methods are kept private, 
> protected methods are overridden, and the other methods are added in the 
> super classes.
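
The class shape described above might look roughly like the following. This is a sketch under assumptions, not the classes added by the PR: both class names and the sum operation are hypothetical. The overridable entry point stays protected in the super class, while the small per-row accessor that runs in the inner loop is private, so its call target is fixed at compile time.

```java
// Hypothetical super class: shared, overridable entry points live here.
abstract class PrimitiveColumnReaderSketch {
  protected abstract long sum(int rowCount);
}

// Hypothetical per-data-type subclass for int columns.
final class IntColumnReaderSketch extends PrimitiveColumnReaderSketch {
  private final int[] data;

  IntColumnReaderSketch(int[] data) {
    this.data = data;
  }

  // Overridden protected method: called once per page, not once per row.
  @Override
  protected long sum(int rowCount) {
    long total = 0;
    for (int i = 0; i < rowCount; i++) {
      total += valueAt(i);
    }
    return total;
  }

  // Hot method: private and tiny, so no receiver type check is needed to call it.
  private int valueAt(int row) {
    return data[row];
  }
}
```

A caller in the same package would use it as `new IntColumnReaderSketch(new int[] {1, 2, 3}).sum(3)`; the per-row work never goes through an overridable method.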



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
