GitHub user manishgupta88 opened a pull request:

    https://github.com/apache/carbondata/pull/2531

    [HOTFIX] Improved BlockDataMap caching performance during first time query

    Things done as part of this PR
    1. Created the taskSummary and FileFooterEntry schemas once and stored them 
in member variables. Creating the schemas on every call was a costly operation 
that increased the time taken to prune dataMaps.
    2. Used TreeMap instead of HashMap when adding the complete file path and 
data to the map during the merge file read. Using TreeMap reduced the map 
filling time by 10 sec for 1200 entries.
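
A minimal sketch of the two changes above, for illustration only and not the 
code in this PR. The class BlockDataMapSketch, its methods, the column names 
and the build counter are all hypothetical stand-ins. It assumes the schema is 
immutable once built and that the map is keyed by the file path string.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a data map that needs a row schema on every prune call.
final class BlockDataMapSketch {
    // Counts schema builds so the effect of caching is observable (illustration only).
    static int schemaBuilds = 0;

    // Item 1: built on first use, then reused from the member variable.
    private List<String> taskSummarySchema;

    // Stand-in for an expensive schema construction; column names are made up.
    private static List<String> buildTaskSummarySchema() {
        schemaBuilds++;
        List<String> columns = new ArrayList<>();
        columns.add("minValues");
        columns.add("maxValues");
        columns.add("rowCount");
        return Collections.unmodifiableList(columns);
    }

    // Returns the cached schema, building it only if it does not exist yet.
    // Not thread-safe as written; a real implementation may need synchronization.
    List<String> getTaskSummarySchema() {
        if (taskSummarySchema == null) {
            taskSummarySchema = buildTaskSummarySchema();
        }
        return taskSummarySchema;
    }

    // Item 2: collect "complete file path -> data" pairs in a TreeMap.
    // The empty byte array is a placeholder for data read from the merge file.
    static Map<String, byte[]> readMergeFileEntries(List<String> filePaths) {
        Map<String, byte[]> entries = new TreeMap<>();
        for (String path : filePaths) {
            entries.put(path, new byte[0]);
        }
        return entries;
    }
}
```

Calling getTaskSummarySchema() repeatedly on one instance builds the schema 
only once. The TreeMap also keeps its entries sorted by path, which a HashMap 
does not guarantee.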
    
     - [ ] Any interfaces changed?
     No
     - [ ] Any backward compatibility impacted?
     No
     - [ ] Document update required?
     No
     - [ ] Testing done
     Verified manually
     - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
     NA


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/carbondata query_perf

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2531.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2531
    
----
commit 26954b88d606535349f83f80a3e00f9b2db4fd66
Author: manishgupta88 <tomanishgupta18@...>
Date:   2018-07-19T13:45:12Z

    Code modification done to improve query performance

----

