GitHub user ajantha-bhat opened a pull request:

    https://github.com/apache/carbondata/pull/2345

    [wip] Improve Carbon Reader Schema reading performance on S3

    Problem: Currently the Carbon reader reads the schema from the carbondata file. 
On S3 this causes multiple IO operations, because the buffer size is small and the data file is large.
    
    Solution: Read the schema from the index file instead, and read the index file in a 
single IO by using a buffer size equal to the index file size.
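
A minimal sketch of the single-read idea, not the code in this PR: it sizes the read buffer to the file length so the whole file is fetched in one pass. The class name `SingleReadSketch` and the method `readWholeFile` are hypothetical, and a local `java.nio.file.Path` stands in for the S3 object, since the actual CarbonData file abstraction is not shown here.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; names are not taken from CarbonData.
final class SingleReadSketch {

    // Reads the entire file using a buffer as large as the file itself,
    // so the underlying stream is asked for all bytes at once instead of
    // in many small chunks.
    static byte[] readWholeFile(Path indexFile) throws IOException {
        long size = Files.size(indexFile);
        if (size > Integer.MAX_VALUE - 8) {
            throw new IOException("File too large for a single buffer: " + size);
        }
        // BufferedInputStream requires a buffer size greater than zero.
        int bufferSize = (int) Math.max(size, 1L);
        byte[] content = new byte[(int) size];
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(indexFile), bufferSize))) {
            // readFully blocks until the array is completely filled.
            in.readFully(content);
        }
        return content;
    }

    public static void main(String[] args) throws IOException {
        // Assumed stand-in for an index file; the content is arbitrary.
        Path tmp = Files.createTempFile("sample", ".carbonindex");
        Files.write(tmp, new byte[] {10, 20, 30});
        byte[] bytes = readWholeFile(tmp);
        System.out.println("Read " + bytes.length + " bytes in one pass");
        Files.delete(tmp);
    }
}
```

This only pays off when the file is small enough to hold in memory, which is the case the description assumes for index files as opposed to data files.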
    
    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     - [ ] Any interfaces changed? NA
     
     - [ ] Any backward compatibility impacted? NA
     
     - [ ] Document update required? NA
    
     - [ ] Testing done
            Added UT
           
     - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.  NA
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ajantha-bhat/carbondata master_new

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2345.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2345
    
----
commit 59b248bbef97010dc2f5dc697400bb2f85799425
Author: ajantha-bhat <ajanthabhat@...>
Date:   2018-05-27T17:19:23Z

    Improve Carbon Reader Schema reading on S3

----

