GitHub user xuchuanyin opened a pull request:
https://github.com/apache/carbondata/pull/2833
[CARBONDATA-3026] clear expired property that may cause GC problem
During data loading, we write temp files (sort temp files and temp fact
data files) to several locations. In the current implementation, we add
these locations to CarbonProperties and associate them with a special key
that refers to the data load. After data loading, the temp locations are
cleared, but the added property remains in CarbonProperties and is never
removed.
This causes the CarbonProperties object to grow without bound and can
lead to OOM problems if the thrift-server is a long-running service. A
local test shows that OOM occurs after adding different properties
11 billion times.
In this commit, I clear the property for the locations when we clear the
locations.
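As a rough illustration of the idea (a minimal sketch, not the actual patch): the classes below are hypothetical stand-ins. `PropertyStore` plays the role of a process-wide store such as CarbonProperties, `LoadTempLocations` models one data load, and the key prefix `temp.location.` and the `/tmp/carbon/` paths are made up for the example. The point is only that the per-load key is removed in the same step that clears the temp locations.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a process-wide property store such as
// CarbonProperties; this is NOT the real CarbonData class or its API.
final class PropertyStore {
  private final Map<String, String> props = new ConcurrentHashMap<>();

  void addProperty(String key, String value) { props.put(key, value); }

  String getProperty(String key) { return props.get(key); }

  void removeProperty(String key) { props.remove(key); }

  int size() { return props.size(); }
}

// Hypothetical helper modelling the temp-location bookkeeping of one load.
final class LoadTempLocations {
  private final PropertyStore store;
  private final String key;

  LoadTempLocations(PropertyStore store, String loadId, String locations) {
    this.store = store;
    // The key is unique per load (assumed naming), so entries pile up
    // in a long-running process unless each one is removed again.
    this.key = "temp.location." + loadId;
    store.addProperty(key, locations);
  }

  // Called after the load finishes.
  void clear() {
    // Deleting the temp directories themselves is omitted in this sketch.
    // Dropping the property here keeps the store from growing per load.
    store.removeProperty(key);
  }
}

final class TempLocationDemo {
  public static void main(String[] args) {
    PropertyStore store = new PropertyStore();
    for (int i = 0; i < 1000; i++) {
      LoadTempLocations load =
          new LoadTempLocations(store, "load-" + i, "/tmp/carbon/" + i);
      load.clear();
    }
    System.out.println("properties left: " + store.size());
  }
}
```

Without the `removeProperty` call in `clear()`, the store in this sketch would hold one entry per completed load, which is the growth pattern described above.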
Be sure to complete the following checklist to help us incorporate
your contribution quickly and easily:
- [x] Any interfaces changed?
`NO`
- [x] Any backward compatibility impacted?
`NO`
- [x] Document update required?
`NO`
- [x] Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests
are required?
`NO`
- How was it tested? Please attach the test report.
- Is it a performance related change? Please attach the performance
test report.
`NO`
- Any additional information to help reviewers in testing this
change.
`NO`
- [x] For large changes, please consider breaking it into sub-tasks under
an umbrella JIRA.
`NA`
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/xuchuanyin/carbondata 181018_bug_remove_property
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2833.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2833
----
commit 736de746250ab33d8c1e4c02158739edde28664e
Author: xuchuanyin <xuchuanyin@...>
Date: 2018-10-18T06:12:30Z
clear expired property that may cause GC problem
----