[ https://issues.apache.org/jira/browse/CLOUDSTACK-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247746#comment-16247746 ]
ASF subversion and git services commented on CLOUDSTACK-10136:
--------------------------------------------------------------

Commit 3ee8d83621c23f976413fdce6d9245197497d504 in cloudstack's branch refs/heads/master from [~rohit.ya...@shapeblue.com]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=3ee8d83 ]

CLOUDSTACK-10136: Fix RemoteHostEndPoint thread growth

This fixes the following:
- Unchecked thread growth in RemoteHostEndPoint
- Potential NPE while finding EP for a storage/scope

Unbounded thread growth can be reproduced, with the following findings:
- Every unreachable template produces 6 new threads (in a single
  ScheduledExecutorService instance), spaced 10 seconds apart.
- Every reachable template URL that does not serve the template produces
  1 new thread (and one ScheduledExecutorService instance); it errors out
  quickly without causing more thread growth.
- Every valid URL produces up to 10 threads, as the same ep (endpoint
  instance) is reused to query upload/download (async callback) progress.

Every RemoteHostEndPoint instance creates its own ScheduledExecutorService
instance, which is why the jstack dump shows several threads that share the
prefix RemoteHostEndPoint-{1..10} (the pool size is defined as 10, so it
uses suffixes 1-10).

This fixes the discovered thread leakage, with the following notes:
- Instead of a per-instance ScheduledExecutorService, a cached thread pool
  is now used, with `static` scope so that it is shared among all
  RemoteHostEndPoint instances.
- It was not clear why we would want to wait when we already have Answers
  returned from the remote EP, so a scheduled/delayed Runnable was not
  required at all for processing answers. ScheduledExecutorService was
  therefore not really required, and the code moved to ExecutorService
  instead.
- Another benefit of using a cached pool is that it shuts down threads that
  have not been used for 60 seconds, and it reuses idle threads for future
  runnable submissions.
- Caveat: the executor service is still unbounded; however, this method is
  only used for short jobs that check upload/download progress, and that
  use case fits an unbounded pool.
- Refactored CmdRunner to not use/reference objects from parent class.

Signed-off-by: Rohit Yadav <rohit.ya...@shapeblue.com>


> Fix thread growth/leak issue
> ----------------------------
>
>                 Key: CLOUDSTACK-10136
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-10136
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>   Affects Versions: 4.5.2, 4.6.2, 4.7.1, 4.10.0.0, 4.9.2.0, 4.8.1.1, 4.9.3.0
>            Reporter: Rohit Yadav
>            Assignee: Rohit Yadav
>             Fix For: 4.11.0.0
>
>
> For a long-running mgmt server with a large number of templates etc., large
> numbers of waiting threads are seen that start with the 'RemoteHostEndPoint-'
> prefix. These async threads are mostly responsible for checking
> template/volume upload/download progress/states. They kick in every time a
> template is being checked, downloaded, set up, etc.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
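
As a rough illustration of the executor change described in the commit message above, the following sketch shows a single `static` cached pool shared by all endpoint instances, with answers processed immediately rather than on a schedule. `EndPointSketch`, `processAnswer`, `sharedPool` and `threadsCreated` are hypothetical names chosen for this example and are not taken from the CloudStack source; marking the threads as daemon is likewise an assumption made only so the demo exits promptly.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for RemoteHostEndPoint; not the actual CloudStack class.
final class EndPointSketch {
    // Counts how many worker threads the shared pool has created so far.
    private static final AtomicInteger THREAD_SEQ = new AtomicInteger();

    // One cached pool shared by every instance. Executors.newCachedThreadPool
    // reuses idle threads and terminates threads that stay idle for 60 seconds.
    private static final ExecutorService SHARED_POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "RemoteHostEndPoint-" + THREAD_SEQ.incrementAndGet());
        t.setDaemon(true); // assumption for this sketch only
        return t;
    });

    // Per the commit message, each instance previously owned its own
    // ScheduledExecutorService with a pool size of 10, so the number of
    // threads grew with the number of instances.

    static ExecutorService sharedPool() {
        return SHARED_POOL;
    }

    static int threadsCreated() {
        return THREAD_SEQ.get();
    }

    // Hands the job to the shared pool right away; no delay is scheduled.
    void processAnswer(Runnable job) {
        SHARED_POOL.execute(job);
    }

    public static void main(String[] args) throws InterruptedException {
        int jobs = 50;
        CountDownLatch done = new CountDownLatch(jobs);
        for (int i = 0; i < jobs; i++) {
            // A fresh endpoint per job, mimicking one instance per progress check.
            new EndPointSketch().processAnswer(done::countDown);
        }
        boolean finished = done.await(5, TimeUnit.SECONDS);
        System.out.println("finished=" + finished
                + ", threads created=" + threadsCreated());
    }
}
```

Because the pool is shared, the number of threads created is bounded by the number of jobs in flight at once rather than by the number of endpoint instances. As the caveat in the commit message notes, the pool itself remains unbounded.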