[
https://issues.apache.org/jira/browse/HADOOP-18800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18030201#comment-18030201
]
ASF GitHub Bot commented on HADOOP-18800:
-----------------------------------------
github-actions[bot] closed pull request #6039: HADOOP-18800. Bad
ipc.client.connection.idle-scan-interval.ms cause resource leaks
URL: https://github.com/apache/hadoop/pull/6039
> Bad ipc.client.connection.idle-scan-interval.ms cause resource leaks
> --------------------------------------------------------------------
>
> Key: HADOOP-18800
> URL: https://issues.apache.org/jira/browse/HADOOP-18800
> Project: Hadoop Common
> Issue Type: Bug
> Components: common, conf, ipc
> Reporter: ConfX
> Priority: Critical
> Labels: pull-request-available
> Attachments: reproduce.sh
>
>
> When ipc.client.connection.idle-scan-interval.ms is set to a bad value (e.g.
> a negative value), the Hadoop server fails to schedule the idle connection
> scan task, which causes resource leaks.
> h2. Buggy code:
> {code:java}
> private void scheduleIdleScanTask() {
>   ...
>   TimerTask idleScanTask = new TimerTask(){
>     @Override
>     public void run() {
>       ...
>       try {
>         closeIdle(false);
>       } finally {
>         // explicitly reschedule so next execution occurs relative
>         // to the end of this scan, not the beginning
>         scheduleIdleScanTask();
>       }
>     }
>   };
>   idleScanTimer.schedule(idleScanTask, idleScanInterval); // <--- idleScanInterval is a negative value
> }
> {code}
> Timer.schedule rejects a negative delay by throwing IllegalArgumentException
> before the task is queued, so idleScanTask is never scheduled and the
> resources it would have released are leaked.
> {code:java}
> public void schedule(TimerTask task, long delay) {
>   if (delay < 0)
>     throw new IllegalArgumentException("Negative delay.");
>   sched(task, System.currentTimeMillis()+delay, 0); // <-- the task will not be scheduled when delay is negative
> }
> {code}
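
The java.util.Timer behaviour quoted above can be checked in isolation, without Hadoop. The following is a minimal sketch; the class name NegativeDelayDemo and the helper scheduleRejects are made up for illustration and do not appear in the Hadoop code base.

```java
import java.util.Timer;
import java.util.TimerTask;

public class NegativeDelayDemo {

    /**
     * Hypothetical helper: returns true if Timer.schedule rejected the
     * given delay with IllegalArgumentException, false if it accepted it.
     */
    static boolean scheduleRejects(long delayMs) {
        // Daemon timer so the demo never keeps the JVM alive.
        Timer timer = new Timer("idle-scan-demo", true);
        try {
            timer.schedule(new TimerTask() {
                @Override
                public void run() {
                    // Intentionally empty: only the scheduling call matters here.
                }
            }, delayMs);
            return false;
        } catch (IllegalArgumentException e) {
            // Thrown before the task is queued, so the task never runs.
            return true;
        } finally {
            timer.cancel();
        }
    }

    public static void main(String[] args) {
        System.out.println("delay=-1 rejected: " + scheduleRejects(-1L));
        System.out.println("delay=10 rejected: " + scheduleRejects(10L));
    }
}
```

With a delay of -1 the call is rejected and nothing is queued; with a delay of 10 it is accepted.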
> h2. How to reproduce:
> We can use the test org.apache.hadoop.ipc.TestIPC#testSocketLeak to check the
> resource leaks.
> (1) Set ipc.client.connection.idle-scan-interval.ms to -1;
> (2) Run test org.apache.hadoop.ipc.TestIPC#testSocketLeak
> (3) You will see the following message (note that the number of leaked
> descriptors can vary from run to run):
> {code}
> java.lang.AssertionError: Leaked 142 file descriptors
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.apache.hadoop.ipc.TestIPC.testSocketLeak(TestIPC.java:1155)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at java.base/java.lang.Thread.run(Thread.java:829)
> {code}
> You can use the attached reproduce.sh to reproduce the bug easily.
> We are happy to provide a patch if this issue is confirmed.
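
One way to avoid the failure described in the report would be to validate the configured interval before it reaches Timer.schedule. The following is only a sketch of that idea, not the change proposed in pull request #6039 and not existing Hadoop code: the class IdleScanIntervalGuard, the method sanitize, and the fallback value are all hypothetical.

```java
public class IdleScanIntervalGuard {

    // Hypothetical fallback for this sketch only; not taken from the issue.
    static final long FALLBACK_INTERVAL_MS = 10_000L;

    /**
     * Returns the configured interval if it is usable as a Timer delay,
     * otherwise the fallback. Only negative values are treated as bad here,
     * because those are the values the report describes.
     */
    static long sanitize(long configuredMs) {
        if (configuredMs < 0) {
            return FALLBACK_INTERVAL_MS;
        }
        return configuredMs;
    }

    public static void main(String[] args) {
        System.out.println("configured=-1   -> " + sanitize(-1L));
        System.out.println("configured=5000 -> " + sanitize(5_000L));
    }
}
```

Failing fast with a clear configuration error instead of silently falling back would be an equally reasonable design; which one fits Hadoop better is for the maintainers to decide.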