GitHub user xuchuanyin opened a pull request:

    https://github.com/apache/carbondata/pull/2833

    [CARBONDATA-3026] clear expired property that may cause GC problem

    During data loading, we write temp files (sort temp files and temp
    fact data files) to some locations. In the current implementation,
    we add these locations to CarbonProperties and associate them with
    a special key that refers to the data loading.
    
    After data loading, the temp locations are cleared, but the added
    property remains in CarbonProperties and is never cleared.
    
    This causes the CarbonProperties object to grow bigger and bigger
    and leads to OOM problems if the thrift-server is a long-running
    service. A local test shows that the OOM happens after adding
    different properties 11 billion times.
    
    In this commit, I clear the property for the locations when we clear
    the locations.
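
The leak and the fix described above can be illustrated with a minimal sketch. It is not the actual CarbonData code: `TempLocationRegistry`, its method names, and the key format are hypothetical, and a plain map stands in for the process-wide CarbonProperties object.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the process-wide CarbonProperties object.
final class TempLocationRegistry {
    private final Map<String, String> properties = new ConcurrentHashMap<>();

    // Registers the temp locations of one data load under a load-specific key.
    void addTempLocations(String loadKey, String locations) {
        properties.put(loadKey, locations);
    }

    // Old behaviour: the temp files are deleted (elided here),
    // but the property entry stays in the map forever.
    void clearLocationsOnly(String loadKey) {
        // file deletion elided; nothing is removed from the map
    }

    // Fixed behaviour: drop the property together with the temp files.
    void clearLocationsAndProperty(String loadKey) {
        // file deletion elided
        properties.remove(loadKey);
    }

    int size() {
        return properties.size();
    }

    public static void main(String[] args) {
        TempLocationRegistry leaky = new TempLocationRegistry();
        TempLocationRegistry fixed = new TempLocationRegistry();
        for (int load = 0; load < 1000; load++) {
            String key = "load_" + load; // hypothetical key format
            leaky.addTempLocations(key, "/tmp/load_" + load);
            leaky.clearLocationsOnly(key);
            fixed.addTempLocations(key, "/tmp/load_" + load);
            fixed.clearLocationsAndProperty(key);
        }
        // The leaky registry keeps one entry per load; the fixed one stays empty.
        System.out.println("leaky entries: " + leaky.size());
        System.out.println("fixed entries: " + fixed.size());
    }
}
```

In a long-running service the first variant grows by one entry per load, which matches the growth described in this pull request; the second keeps the map size bounded by the number of loads in flight.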
    
    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     - [x] Any interfaces changed?
     `NO`
     - [x] Any backward compatibility impacted?
     `NO`
     - [x] Document update required?
     `NO`
     - [x] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            `NO`
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            `NO`
            - Any additional information to help reviewers in testing this change.
            `NO`

     - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
     `NA`


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xuchuanyin/carbondata 181018_bug_remove_property

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2833.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2833
    
----
commit 736de746250ab33d8c1e4c02158739edde28664e
Author: xuchuanyin <xuchuanyin@...>
Date:   2018-10-18T06:12:30Z

    clear expired property that may cause GC problem
    
    During data loading, we write temp files (sort temp files and temp
    fact data files) to some locations. In the current implementation,
    we add these locations to CarbonProperties and associate them with
    a special key that refers to the data loading.
    
    After data loading, the temp locations are cleared, but the added
    property remains in CarbonProperties and is never cleared.
    
    This causes the CarbonProperties object to grow bigger and bigger
    and leads to OOM problems if the thrift-server is a long-running
    service. A local test shows that the OOM happens after adding
    different properties 11 billion times.
    
    In this commit, I clear the property for the locations when we clear
    the locations.

----

