[
https://issues.apache.org/jira/browse/GEODE-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16100919#comment-16100919
]
ASF GitHub Bot commented on GEODE-3286:
---------------------------------------
GitHub user WireBaron opened a pull request:
https://github.com/apache/geode/pull/657
GEODE-3286: Failing to cleanup connections from ConnectionTable receiver table
@kohlmu-pivotal @galen-pivotal @pivotal-amurmann @bschuchardt @hiteshk25
- prevent adding a closed connection to the connection table's receivers
- add a new unit test for connection table
- adding connection factory object for creating receiving connections
- have the idle connection timeout ensure connections are removed from
connection table receivers
- modify tcpConduit stat accesses to allow easier mocking
Signed-off-by: Hitesh Khamesra <[email protected]>
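The first and fourth bullets above can be illustrated with a minimal sketch. This is not the code from the pull request: `SketchConnection` and `SketchReceiverTable` are hypothetical stand-ins invented for illustration, and the real `Connection` and `ConnectionTable` classes differ. It only shows the general idea of refusing to register an already-closed connection and of having the idle sweep remove what it closes.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a receiving connection (not Geode's Connection class).
final class SketchConnection {
    private volatile boolean stopped;
    private final long lastUsedMillis;

    SketchConnection(long lastUsedMillis) { this.lastUsedMillis = lastUsedMillis; }

    boolean isStopped() { return stopped; }
    long lastUsedMillis() { return lastUsedMillis; }
    void close() { stopped = true; }
}

// Hypothetical stand-in for the table that tracks receivers.
final class SketchReceiverTable {
    private final Set<SketchConnection> receivers = ConcurrentHashMap.newKeySet();

    /** Registers a receiver unless it is already closed; returns true if it was kept. */
    boolean addReceiver(SketchConnection conn) {
        if (conn.isStopped()) {
            return false; // never track a connection that is already closed
        }
        receivers.add(conn);
        // The connection may have been closed between the check and the add.
        if (conn.isStopped()) {
            receivers.remove(conn);
            return false;
        }
        return true;
    }

    /** Closes and removes receivers that are stopped or idle too long; returns the number removed. */
    int closeIdleReceivers(long nowMillis, long idleTimeoutMillis) {
        int removed = 0;
        for (SketchConnection conn : receivers) {
            boolean idle = nowMillis - conn.lastUsedMillis() > idleTimeoutMillis;
            if (conn.isStopped() || idle) {
                conn.close();
                if (receivers.remove(conn)) {
                    removed++;
                }
            }
        }
        return removed;
    }

    int size() { return receivers.size(); }
}
```

In this sketch the re-check after `add` covers a close that races with registration, and the sweep removes every connection it closes, so a stopped connection cannot linger in the set.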
Thank you for submitting a contribution to Apache Geode.
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
### For all changes:
- [x] Is there a JIRA ticket associated with this PR? Is it referenced in
the commit message?
- [x] Has your PR been rebased against the latest commit within the target
branch (typically `develop`)?
- [x] Is your initial contribution a single, squashed commit?
- [x] Does `gradlew build` run cleanly?
- [x] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
### Note:
Please ensure that once the PR is submitted, you check travis-ci for build
issues and submit an update to your PR as soon as possible. If you need help,
please send an email to [email protected].
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/WireBaron/geode feature/GEODE-3286
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/geode/pull/657.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #657
----
commit 8aed26846de6e9ff1c123acae98a7b5ce6d82a83
Author: Brian Rowe <[email protected]>
Date: 2017-07-25T22:43:35Z
GEODE-3286: Failing to cleanup connections from ConnectionTable receiver
table
- prevent adding a closed connection to the connection table's receivers
- add a new unit test for connection table
- adding connection factory object for creating receiving connections
- have the idle connection timeout ensure connections are removed from
connection table receivers
- modify tcpConduit stat accesses to allow easier mocking
Signed-off-by: Hitesh Khamesra <[email protected]>
----
> Failing to cleanup connections from ConnectionTable receiver table
> ------------------------------------------------------------------
>
> Key: GEODE-3286
> URL: https://issues.apache.org/jira/browse/GEODE-3286
> Project: Geode
> Issue Type: Bug
> Components: membership
> Reporter: Brian Rowe
>
> This bug tracks gemfire issue 1554
> (https://jira-pivotal.atlassian.net/browse/GEM-1544).
> Hello team,
> A customer (VMware) is experiencing several {{OutOfMemoryError}}s on
> production servers, and they believe there's a memory leak within GemFire.
> Apparently 9.5GB of the heap is occupied by 487,828 instances of
> {{sun.security.ssl.SSLSocketImpl}}, and 7.7GB of the heap is occupied by
> 487,804 instances of {{sun.security.ssl.AppOutputStream}}, both referenced
> from the {{receivers}} attribute within the {{ConnectionTable}} class. I got
> this information from the Eclipse Memory Analyzer plugin; the images are
> attached.
> Below are some OQL queries that I was able to run within the plugin. It is
> odd that the collection of receivers is composed of 486,368 elements...
> {code}
> SELECT * FROM com.gemstone.gemfire.internal.tcp.ConnectionTable
> -> 1
> SELECT receivers.size FROM com.gemstone.gemfire.internal.tcp.ConnectionTable
> -> 486,368
> SELECT * FROM com.gemstone.gemfire.internal.tcp.Connection
> -> 487,758
> SELECT * FROM com.gemstone.gemfire.internal.tcp.Connection con WHERE
> con.stopped = true
> -> 486,461
> SELECT * FROM com.gemstone.gemfire.internal.tcp.Connection con WHERE
> con.stopped = false
> -> 1,297
> {code}
> That said, nothing in the statistics (maybe there's something, but I can't
> find it...) points to a spike in the number of entries within the regions
> or in the current number of connections, or to anything else that would
> explain the continuous drop in available heap over time
> (chart#freeMemory).
> The heap dump (approximately 20GB) and the statistics have been uploaded to
> [Google
> Drive|https://drive.google.com/drive/folders/0BxDMZZTfEL4WUFZjbjhLMXptbEk?usp=sharing].
> I don't have logs yet, but they might not be needed given the heap dump and
> the statistics.
> Just for the record, apparently we delivered 8.2.0.6 to them a year and a half
> ago as a fix to [GEM-94|https://jira-pivotal.atlassian.net/browse/GEM-94] /
> [GEODE-332|https://issues.apache.org/jira/browse/GEODE-332]. They had been
> running fine since then, until now. The last change in the
> {{ConnectionTable}} was done to fix these issues, so if there's actually a
> bug within the class, it will also exist on 8.2.5 (just a reminder to change
> the affected version field if required).
> The issue is not reproducible at will, but it happens in several of their
> environments. I haven't been able to reproduce it in my lab environment so
> far.
> Please let me know if you need anything else to proceed.
> Best regards.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)