GitHub user ravipesala opened a pull request:

    https://github.com/apache/carbondata/pull/2789

    [HOTFIX] Fixed S3 metrics issue.

    Problem: When data is read from S3, the reported amount of data read is 
larger than the total size of the carbon data.
    Reason: CarbonData uses `dataInputStream.skip`, which the S3 interface 
does not handle properly: it reads in a loop and ends up reading more data 
than required.
    Solution: Use `FSDataInputStream.seek` instead of `skip`.
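
To illustrate the reasoning above, here is a minimal, self-contained sketch using only `java.io`. It is not the code changed in this pull request and does not use the Hadoop or S3 client classes. `CountingRemoteStream`, its `seek` method, and the two helper methods are hypothetical names introduced for illustration. The sketch assumes a stream whose `skip` falls back to the default `InputStream.skip` behaviour of reading and discarding bytes, and compares the bytes fetched with those fetched when the position is moved directly.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class SkipVsSeekDemo {

    /** Hypothetical stand-in for a remote stream that counts every byte it fetches. */
    static class CountingRemoteStream extends InputStream {
        private final byte[] data;
        private int pos = 0;
        long bytesFetched = 0;

        CountingRemoteStream(byte[] data) {
            this.data = data;
        }

        @Override
        public int read() {
            if (pos >= data.length) {
                return -1;
            }
            bytesFetched++; // every byte handed out counts as "read" in the metrics
            return data[pos++] & 0xFF;
        }

        // skip() is deliberately NOT overridden: the default InputStream.skip
        // reads into a scratch buffer and discards it, so skipped bytes are fetched.

        /** Hypothetical seek: moves the position without fetching anything. */
        void seek(int newPos) {
            pos = newPos;
        }
    }

    /** Reads exactly {@code length} bytes (or until end of stream). */
    private static void readFully(InputStream in, int length) throws IOException {
        byte[] buf = new byte[length];
        int done = 0;
        while (done < length) {
            int n = in.read(buf, done, length - done);
            if (n < 0) {
                break;
            }
            done += n;
        }
    }

    /** Bytes fetched when reaching {@code offset} via skip() before reading. */
    static long bytesFetchedWithSkip(int totalSize, int offset, int length) {
        CountingRemoteStream in = new CountingRemoteStream(new byte[totalSize]);
        try {
            long remaining = offset;
            while (remaining > 0) {
                long skipped = in.skip(remaining);
                if (skipped <= 0) {
                    break;
                }
                remaining -= skipped;
            }
            readFully(in, length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return in.bytesFetched;
    }

    /** Bytes fetched when reaching {@code offset} via seek() before reading. */
    static long bytesFetchedWithSeek(int totalSize, int offset, int length) {
        CountingRemoteStream in = new CountingRemoteStream(new byte[totalSize]);
        try {
            in.seek(offset);
            readFully(in, length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return in.bytesFetched;
    }

    public static void main(String[] args) {
        // Read the last 100 bytes of a 1000-byte object.
        System.out.println("skip: " + bytesFetchedWithSkip(1000, 900, 100)); // prints "skip: 1000"
        System.out.println("seek: " + bytesFetchedWithSeek(1000, 900, 100)); // prints "seek: 100"
    }
}
```

In this sketch the skip-based path counts the discarded bytes as well as the requested ones, which is the kind of inflated read metric the problem statement describes, while the seek-based path counts only the bytes actually requested.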
    
    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
    
     - [ ] Testing done
            Please provide details on 
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravipesala/incubator-carbondata s3-metrics

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2789.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2789
    
----
commit 59eded7ecdc1d0e0536dbd0e1f77373c9866bf83
Author: ravipesala <ravi.pesala@...>
Date:   2018-09-28T12:59:08Z

    Fixed S3 metrics cache.

----

