[jira] [Work logged] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version upgrade from 3.x to 4.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-7329?focusedWorklogId=566690=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566690 ]

ASF GitHub Bot logged work on MAPREDUCE-7329:

Author: ASF GitHub Bot
Created on: 16/Mar/21 03:01
Start Date: 16/Mar/21 03:01
Worklog Time Spent: 10m
Work Description: liuml07 commented on pull request #2775:
URL: https://github.com/apache/hadoop/pull/2775#issuecomment-799906569

I did not touch this pipe class or package. I can take a look later this week, but @aajisaka and @wangdatan will have more context. Thanks,

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 566690)
Time Spent: 50m (was: 40m)

> HadoopPipes task may fail when linux kernel version upgrade from 3.x to 4.x
> ---
>
> Key: MAPREDUCE-7329
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7329
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: chaoli
> Priority: Major
> Labels: patch, pull-request-available
> Fix For: 2.6.0, 3.0.0
> Attachments: 0001-MAPREDUCE-7329-HadoopPipes-task-may-fail-when-linux-.patch, image-2021-03-15-14-29-49-475.png, image-2021-03-15-14-37-32-184.png
> Time Spent: 50m
> Remaining Estimate: 0h
>
> {color:#FF}*The Hadoop Pipes ping implementation has a bug*{color}. We recently upgraded the Linux kernel from 3.x to 4.x and found that Hadoop Pipes tasks exit with a connect timeout from the ping logic implemented by PingThread in HadoopPipes.cc.
> !image-2021-03-15-14-37-32-184.png!
> After investigating, we found that the current ping server never accepts the sockets created by the ping client, which can cause critical problems:
> * The TCP accept queue fills up (default 50).
> * When the client closes its socket, the server side never calls close, which leaves too many socket fds in CLOSE_WAIT (default 2h), and the accept queue is never cleared.
> * Even worse, on 4.x Linux kernels a full accept queue makes TCP drop the packet directly, so the ping client's connect times out. On 3.x Linux kernels, when the accept queue is full the client can still create half-open connections until the SYN queue is full (default 2048), so from the client side ping also keeps working until the SYN queue is full. After 3 hours, the task also exits with a connect timeout exception.
> To fix this bug, we introduced a PingSocketCleaner thread, which continuously accepts ping socket connections from the ping client. When the client closes a socket, the cleaner thread detects the close while reading the inputStream and then closes the socket on the server side.
> Reference: Linux kernel patch
> [https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ea8ea2cb7]

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
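The accept-and-close behaviour described in the issue can be illustrated with a minimal Java sketch. This is not the code from the attached patch or from pull request #2775: the class name `PingSocketCleanerSketch`, the helper `ping`, the counter `closedCount`, and the 5-second read timeout are all hypothetical and chosen only for illustration. The sketch assumes each ping connection is short-lived (the client connects and closes), so a single thread draining the accept queue serially is enough.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a cleaner thread; not the class from the patch. */
class PingSocketCleanerSketch extends Thread {
  // Assumed bound on how long one idle ping connection may block the cleaner.
  private static final int READ_TIMEOUT_MS = 5000;

  private final ServerSocket serverSocket;
  private final AtomicInteger closed = new AtomicInteger();

  PingSocketCleanerSketch(ServerSocket serverSocket) {
    super("ping-socket-cleaner-sketch");
    this.serverSocket = serverSocket;
    setDaemon(true);
  }

  @Override
  public void run() {
    while (!serverSocket.isClosed()) {
      Socket accepted;
      try {
        // Taking the connection off the accept queue keeps the queue from filling.
        accepted = serverSocket.accept();
      } catch (IOException e) {
        // If the server socket was closed, the loop condition ends the thread.
        // A real implementation would log and back off on other errors.
        continue;
      }
      try (Socket client = accepted) {
        client.setSoTimeout(READ_TIMEOUT_MS);
        InputStream in = client.getInputStream();
        // read() returns -1 once the client has closed its end.
        while (in.read() != -1) {
          // Ping connections carry no payload we care about; discard it.
        }
      } catch (IOException e) {
        // Reset or read timeout: try-with-resources has already closed the socket.
      } finally {
        // Runs after the server-side socket is closed, so nothing stays in CLOSE_WAIT.
        closed.incrementAndGet();
      }
    }
  }

  int closedCount() {
    return closed.get();
  }

  /** Mimics the ping client: connect, then close immediately. */
  static void ping(int port, int times) throws IOException {
    for (int i = 0; i < times; i++) {
      try (Socket s = new Socket()) {
        s.connect(new InetSocketAddress(InetAddress.getLoopbackAddress(), port), 5000);
      }
    }
  }

  public static void main(String[] args) throws Exception {
    // Backlog of 50 matches the default mentioned in the issue description.
    try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
      PingSocketCleanerSketch cleaner = new PingSocketCleanerSketch(server);
      cleaner.start();
      ping(server.getLocalPort(), 100); // more pings than the backlog holds
      Thread.sleep(500);
      System.out.println("server-side sockets closed: " + cleaner.closedCount());
    }
  }
}
```

Handling connections serially keeps the sketch short; the read timeout is there so that one client that never closes cannot stall the cleaner indefinitely.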
[jira] [Work logged] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version upgrade from 3.x to 4.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-7329?focusedWorklogId=54=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-54 ]

ASF GitHub Bot logged work on MAPREDUCE-7329:

Author: ASF GitHub Bot
Created on: 16/Mar/21 01:14
Start Date: 16/Mar/21 01:14
Worklog Time Spent: 10m
Work Description: lichaojacobs edited a comment on pull request #2775:
URL: https://github.com/apache/hadoop/pull/2775#issuecomment-799868579

@liuml07 @steveloughran Could you please help review the patch? It aims to fix Hadoop Pipes bug. Thanks.

Issue Time Tracking
---
Worklog Id: (was: 54)
Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version upgrade from 3.x to 4.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-7329?focusedWorklogId=53=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-53 ]

ASF GitHub Bot logged work on MAPREDUCE-7329:

Author: ASF GitHub Bot
Created on: 16/Mar/21 01:14
Start Date: 16/Mar/21 01:14
Worklog Time Spent: 10m
Work Description: lichaojacobs commented on pull request #2775:
URL: https://github.com/apache/hadoop/pull/2775#issuecomment-799868579

@liuml07 Could you please help review the patch? It aims to fix Hadoop Pipes bug. Thanks.

Issue Time Tracking
---
Worklog Id: (was: 53)
Time Spent: 0.5h (was: 20m)
[jira] [Updated] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version upgrade from 3.x to 4.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chaoli updated MAPREDUCE-7329:

Description:
{color:#FF}*The Hadoop Pipes ping implementation has a bug*{color}. We recently upgraded the Linux kernel from 3.x to 4.x and found that Hadoop Pipes tasks exit with a connect timeout from the ping logic implemented by PingThread in HadoopPipes.cc.
!image-2021-03-15-14-37-32-184.png!
After investigating, we found that the current ping server never accepts the sockets created by the ping client, which can cause critical problems:
* The TCP accept queue fills up (default 50).
* When the client closes its socket, the server side never calls close, which leaves too many socket fds in CLOSE_WAIT (default 2h), and the accept queue is never cleared.
* Even worse, on 4.x Linux kernels a full accept queue makes TCP drop the packet directly, so the ping client's connect times out. On 3.x Linux kernels, when the accept queue is full the client can still create half-open connections until the SYN queue is full (default 2048), so from the client side ping also keeps working until the SYN queue is full. After 3 hours, the task also exits with a connect timeout exception.
To fix this bug, we introduced a PingSocketCleaner thread, which continuously accepts ping socket connections from the ping client. When the client closes a socket, the cleaner thread detects the close while reading the inputStream and then closes the socket on the server side.
Reference: Linux kernel patch
[https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ea8ea2cb7]

was:
We recently upgraded the Linux kernel from 3.x to 4.x and found that Hadoop Pipes tasks exit with a connect timeout from the ping logic implemented by PingThread in HadoopPipes.cc.
!image-2021-03-15-14-37-32-184.png!
After investigating, we found that the current ping server never accepts the sockets created by the ping client, which is a hidden danger:
# The TCP accept queue fills up (*default 50*).
# When the client closes its socket, the server side never calls close, which leaves too many socket fds in CLOSE_WAIT (*default 2h*), and the accept queue is never cleared.
# Even worse, on 4.x Linux kernels a full accept queue makes TCP drop the packet directly, so the ping client's connect times out. On 3.x Linux kernels, when the accept queue is full the client can still create half-open connections until the SYN queue is full (*default 2048*), so from the client side ping also keeps working until the SYN queue is full. After 3 hours, the task also exits with a connect timeout exception.
To fix this problem, we introduced a *PingSocketCleaner* thread, which continuously accepts ping socket connections from the ping client. When the client closes a socket, the cleaner thread detects the close while reading the inputStream and then closes the socket on the server side.
Reference: Linux kernel patch
[https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ea8ea2cb7]
[jira] [Updated] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version upgrade from 3.x to 4.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chaoli updated MAPREDUCE-7329:

Fix Version/s: 2.6.0, 3.0.0
[jira] [Updated] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version upgrade from 3.x to 4.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chaoli updated MAPREDUCE-7329:

Summary: HadoopPipes task may fail when linux kernel version upgrade from 3.x to 4.x (was: HadoopPipes task may fail when linux kernel version change from 3.x to 4.x)
[jira] [Commented] (MAPREDUCE-7322) revisiting TestMRIntermediateDataEncryption
[ https://issues.apache.org/jira/browse/MAPREDUCE-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302092#comment-17302092 ]

Ahmed Hussein commented on MAPREDUCE-7322:

Thanks [~Jim_Brennan] for the review and for merging the patch.
In order to port this into branch-3.2 and branch-2.10, we will first need to port MAPREDUCE-7320 to pull in the {{GenericTestUtils}} changes.
I can provide MAPREDUCE-7320 patches for both branch-3.2 and branch-2.10 if they do not pull cleanly.

> revisiting TestMRIntermediateDataEncryption
> ---
>
> Key: MAPREDUCE-7322
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7322
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: job submission, security, test
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Major
> Labels: patch-available
> Fix For: 3.4.0, 3.3.1
> Attachments: MAPREDUCE-7322.001.patch, MAPREDUCE-7322.002.patch, MAPREDUCE-7322.003.patch, MAPREDUCE-7322.004.patch, MAPREDUCE-7322.005.patch, MAPREDUCE-7322.006.patch, MAPREDUCE-7322.007.patch, MAPREDUCE-7322.008.patch, MAPREDUCE-7322.009.patch
>
> I was reviewing {{TestMRIntermediateDataEncryption}}. The unit test actually has little to do with encryption.
> I reached the following conclusions:
> * Enabling/disabling {{MRJobConfig.MR_ENCRYPTED_INTERMEDIATE_DATA}} does not change the behavior of the unit test.
> * No spill files are generated by either the mappers or the reducers.
> * Wrapping I/O streams with Crypto never happens during the execution of the unit test.
> Unless I misunderstand the purpose of that unit test, I suggest that it be re-implemented so that it validates encryption in spilled intermediate data.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7322) revisiting TestMRIntermediateDataEncryption
[ https://issues.apache.org/jira/browse/MAPREDUCE-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302010#comment-17302010 ]

Jim Brennan commented on MAPREDUCE-7322:

Thanks for the contribution [~ahussein]! I have committed this to trunk and branch-3.3, but the patch does not apply to branch-3.2. Can you please provide a patch for 3.2 and any earlier branches you want this pulled back to?
[jira] [Updated] (MAPREDUCE-7322) revisiting TestMRIntermediateDataEncryption
[ https://issues.apache.org/jira/browse/MAPREDUCE-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Brennan updated MAPREDUCE-7322:

Fix Version/s: 3.3.1, 3.4.0
[jira] [Commented] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301480#comment-17301480 ]

Hadoop QA commented on MAPREDUCE-7329:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 0m 36s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| 0 | codespell | 0m 0s | | codespell was not available. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 34m 9s | | trunk passed |
| +1 | compile | 0m 44s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | compile | 0m 35s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | checkstyle | 0m 34s | | trunk passed |
| +1 | mvnsite | 0m 39s | | trunk passed |
| +1 | javadoc | 0m 22s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javadoc | 0m 20s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | spotbugs | 1m 19s | | trunk passed |
| +1 | shadedclient | 14m 50s | | branch has no errors when building and testing our client artifacts. |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 0m 32s | | the patch passed |
| +1 | compile | 0m 34s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javac | 0m 34s | | the patch passed |
| +1 | compile | 0m 28s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | javac | 0m 28s | | the patch passed |
| +1 | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 | checkstyle | 0m 27s | [/results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2775/1/artifact/out/results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt] | hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: The patch generated 1 new + 7 unchanged - 0 fixed = 8 total (was 7) |
| +1 | mvnsite | 0m 32s | | the patch passed |
| +1 | javadoc | 0m 17s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javadoc | 0m 14s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | spotbugs | 1m 20s | |
[jira] [Work logged] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-7329?focusedWorklogId=566061=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566061 ]

ASF GitHub Bot logged work on MAPREDUCE-7329:

Author: ASF GitHub Bot
Created on: 15/Mar/21 08:15
Start Date: 15/Mar/21 08:15
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on pull request #2775:
URL: https://github.com/apache/hadoop/pull/2775#issuecomment-799212430

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:---:|---:|---:|:---:|:---:|
| +0 :ok: | reexec | 0m 36s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 34m 9s | | trunk passed |
| +1 :green_heart: | compile | 0m 44s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | compile | 0m 35s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 :green_heart: | checkstyle | 0m 34s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 39s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 22s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 0m 20s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 :green_heart: | spotbugs | 1m 19s | | trunk passed |
| +1 :green_heart: | shadedclient | 14m 50s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 32s | | the patch passed |
| +1 :green_heart: | compile | 0m 34s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javac | 0m 34s | | the patch passed |
| +1 :green_heart: | compile | 0m 28s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 :green_heart: | javac | 0m 28s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 27s | [/results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2775/1/artifact/out/results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt) | hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: The patch generated 1 new + 7 unchanged - 0 fixed = 8 total (was 7) |
| +1 :green_heart: | mvnsite | 0m 32s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 17s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 0m 14s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 :green_heart: | spotbugs | 1m 20s | | the patch passed |
| +1 :green_heart: | shadedclient | 14m 28s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 7m 12s | | hadoop-mapreduce-client-core in the patch passed. |
| +1 :green_heart: | asflicense | 0m 33s | | The patch does not generate ASF License warnings. |
| | | 81m 16s | | |

| Subsystem | Report/Notes |
|---:|:---|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2775/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2775 |
| JIRA Issue | MAPREDUCE-7329 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux bc05a6db4771 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 393b71c6182645f660d87b74eb6e2d745200f5b5 |
| Default Java | Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| Multi-JDK versions |
[jira] [Updated] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chaoli updated MAPREDUCE-7329:

Description:
Recently, we upgraded the Linux kernel from 3.x to 4.x and found that Hadoop Pipes tasks exit with a connect timeout in the ping check, which is implemented by PingThread in HadoopPipes.cc.

!image-2021-03-15-14-37-32-184.png!

After deeper investigation, we found that the current ping server never accepts the sockets created by the ping client, which is a hidden danger:

# The TCP accept queue fills up (*default 50*).
# When the client closes its socket, the server side never calls close, which leaves many socket fds in CLOSE_WAIT (*default 2h*), and the accept queue is never cleared.
# Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the packet directly, so the ping client's connect times out. On a 3.x Linux kernel, when the accept queue is full the client can still create half-open connections until the SYN queue is full (*default 2048*), so from the client side ping keeps working until the SYN queue fills up. After 3 hours, the task also exits with a connect timeout exception.

To fix this problem, we introduced a *PingSocketCleaner* thread, which continuously accepts ping socket connections from the ping client. When the client closes a socket, the cleaner thread detects the close while reading the inputStream and then finally closes the socket from the server side.

Referenced Linux kernel patch: [https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ea8ea2cb7]

was:
Recently, we upgraded the Linux kernel from 3.x to 4.x and found that Hadoop Pipes tasks exit with a connect timeout in the ping check, which is implemented by PingThread in HadoopPipes.cc.

!image-2021-03-15-14-37-32-184.png!

After deeper investigation, we found that the current ping server never accepts the sockets created by the ping client, which is a hidden danger:

# The TCP accept queue fills up (*default 50*).
# When the client closes its socket, the server side never calls close, which leaves many socket fds in CLOSE_WAIT (*default 2h*), and the accept queue is never cleared.
# Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the packet directly, so the ping client's connect times out. On a 3.x Linux kernel, when the accept queue is full the client can still create half-open connections until the SYN queue is full (*default 2048*), so from the client side ping keeps working until the SYN queue fills up. After 3 hours, the task also exits with a connect timeout exception.

To fix this problem, we introduced a *PingSocketCleaner* thread, which continuously accepts ping socket connections from the ping client. When the client closes a socket, the cleaner thread detects the close while reading the inputStream and will then finally close the socket from the server side.

Referenced Linux kernel patch: [https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ea8ea2cb7]

> HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
>
> Key: MAPREDUCE-7329
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7329
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: chaoli
> Priority: Major
> Labels: patch, pull-request-available
> Attachments: 0001-MAPREDUCE-7329-HadoopPipes-task-may-fail-when-linux-.patch, image-2021-03-15-14-29-49-475.png, image-2021-03-15-14-37-32-184.png
> Time Spent: 10m
> Remaining Estimate: 0h
[jira] [Updated] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chaoli updated MAPREDUCE-7329:

Description:
Recently, we upgraded the Linux kernel from 3.x to 4.x and found that Hadoop Pipes tasks exit with a connect timeout in the ping check, which is implemented by PingThread in HadoopPipes.cc.

!image-2021-03-15-14-37-32-184.png!

After deeper investigation, we found that the current ping server never accepts the sockets created by the ping client, which is a hidden danger:

# The TCP accept queue fills up (*default 50*).
# When the client closes its socket, the server side never calls close, which leaves many socket fds in CLOSE_WAIT (*default 2h*), and the accept queue is never cleared.
# Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the packet directly, so the ping client's connect times out. On a 3.x Linux kernel, when the accept queue is full the client can still create half-open connections until the SYN queue is full (*default 2048*), so from the client side ping keeps working until the SYN queue fills up. After 3 hours, the task also exits with a connect timeout exception.

To fix this problem, we introduced a *PingSocketCleaner* thread, which continuously accepts ping socket connections from the ping client. When the client closes a socket, the cleaner thread detects the close while reading the inputStream and will then finally close the socket from the server side.

Referenced Linux kernel patch: [https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ea8ea2cb7]

was:
Recently, we upgraded the Linux kernel from 3.x to 4.x and found that Hadoop Pipes tasks exit with a connect timeout in the ping check, which is implemented by PingThread in HadoopPipes.cc.

!image-2021-03-15-14-37-32-184.png!

After deeper investigation, we found that the current ping server never accepts the sockets created by the ping client, which is a hidden danger:

# The TCP accept queue fills up (*default 50*).
# When the client closes its socket, the server side never calls close, which leaves many socket fds in CLOSE_WAIT (*default 2h*), and the accept queue is never cleared.
# Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the packet directly, so the ping client's connect times out. On a 3.x Linux kernel, when the accept queue is full the client can still create half-open connections until the SYN queue is full (*default 2048*), so from the client side ping keeps working until the SYN queue fills up. After 3 hours, the task also exits with a connect timeout exception.

To fix this problem, we introduced a *PingSocketCleaner* thread, which continuously accepts ping socket connections from the ping client. When the client closes a socket, the cleaner thread detects it through an inputStream read and will then finally close the socket from the server side.

Referenced Linux kernel patch: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ea8ea2cb7

> HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
>
> Key: MAPREDUCE-7329
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7329
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: chaoli
> Priority: Major
> Labels: patch, pull-request-available
> Attachments: 0001-MAPREDUCE-7329-HadoopPipes-task-may-fail-when-linux-.patch, image-2021-03-15-14-29-49-475.png, image-2021-03-15-14-37-32-184.png
> Time Spent: 10m
> Remaining Estimate: 0h
[jira] [Updated] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chaoli updated MAPREDUCE-7329:

Description:
Recently, we upgraded the Linux kernel from 3.x to 4.x and found that Hadoop Pipes tasks exit with a connect timeout in the ping check, which is implemented by PingThread in HadoopPipes.cc.

!image-2021-03-15-14-37-32-184.png!

After deeper investigation, we found that the current ping server never accepts the sockets created by the ping client, which is a hidden danger:

# The TCP accept queue fills up (*default 50*).
# When the client closes its socket, the server side never calls close, which leaves many socket fds in CLOSE_WAIT (*default 2h*), and the accept queue is never cleared.
# Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the packet directly, so the ping client's connect times out. On a 3.x Linux kernel, when the accept queue is full the client can still create half-open connections until the SYN queue is full (*default 2048*), so from the client side ping keeps working until the SYN queue fills up. After 3 hours, the task also exits with a connect timeout exception.

To fix this problem, we introduced a *PingSocketCleaner* thread, which continuously accepts ping socket connections from the ping client. When the client closes a socket, the cleaner thread detects it through an inputStream read and will then finally close the socket from the server side.

Referenced Linux kernel patch: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ea8ea2cb7

was:
Recently, we upgraded the Linux kernel from 3.x to 4.x and found that Hadoop Pipes tasks exit with a connect timeout in the ping check, which is implemented by PingThread in HadoopPipes.cc.

!image-2021-03-15-14-37-32-184.png!

After deeper investigation, we found that the current ping server never accepts the sockets created by the ping client, which is a hidden danger:

# The TCP accept queue fills up (*default 50*).
# When the client closes its socket, the server side never calls close, which leaves many socket fds in CLOSE_WAIT (*default 2h*), and the accept queue is never cleared.
# Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the packet directly, so the ping client's connect times out. On a 3.x Linux kernel, when the accept queue is full the client can still create half-open connections until the SYN queue is full (*default 2048*), so from the client side ping keeps working until the SYN queue fills up. After 3 hours, the task also exits with a connect timeout exception.

To fix this problem, we introduced a *PingSocketCleaner* thread, which continuously accepts ping socket connections from the ping client. When the client closes a socket, the cleaner thread detects it through an inputStream read and will then finally close the socket from the server side.

> HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
>
> Key: MAPREDUCE-7329
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7329
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: chaoli
> Priority: Major
> Labels: patch, pull-request-available
> Attachments: 0001-MAPREDUCE-7329-HadoopPipes-task-may-fail-when-linux-.patch, image-2021-03-15-14-29-49-475.png, image-2021-03-15-14-37-32-184.png
> Time Spent: 10m
> Remaining Estimate: 0h

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-7329?focusedWorklogId=566036=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566036 ]

ASF GitHub Bot logged work on MAPREDUCE-7329:

Author: ASF GitHub Bot
Created on: 15/Mar/21 06:53
Start Date: 15/Mar/21 06:53
Worklog Time Spent: 10m
Work Description: lichaojacobs opened a new pull request #2775:
URL: https://github.com/apache/hadoop/pull/2775

jira: https://issues.apache.org/jira/browse/MAPREDUCE-7329

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking

Worklog Id: (was: 566036)
Remaining Estimate: 0h
Time Spent: 10m

> HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
>
> Key: MAPREDUCE-7329
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7329
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: chaoli
> Priority: Major
> Labels: patch
> Attachments: 0001-MAPREDUCE-7329-HadoopPipes-task-may-fail-when-linux-.patch, image-2021-03-15-14-29-49-475.png, image-2021-03-15-14-37-32-184.png
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Recently, we upgraded the Linux kernel from 3.x to 4.x and found that Hadoop Pipes tasks exit with a connect timeout in the ping check, which is implemented by PingThread in HadoopPipes.cc.
>
> !image-2021-03-15-14-37-32-184.png!
>
> After deeper investigation, we found that the current ping server never accepts the sockets created by the ping client, which is a hidden danger:
> # The TCP accept queue fills up (*default 50*).
> # When the client closes its socket, the server side never calls close, which leaves many socket fds in CLOSE_WAIT (*default 2h*), and the accept queue is never cleared.
> # Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the packet directly, so the ping client's connect times out. On a 3.x Linux kernel, when the accept queue is full the client can still create half-open connections until the SYN queue is full (*default 2048*), so from the client side ping keeps working until the SYN queue fills up. After 3 hours, the task also exits with a connect timeout exception.
>
> To fix this problem, we introduced a *PingSocketCleaner* thread, which continuously accepts ping socket connections from the ping client. When the client closes a socket, the cleaner thread detects it through an inputStream read and will then finally close the socket from the server side.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated MAPREDUCE-7329:

Labels: patch pull-request-available (was: patch)

> HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
>
> Key: MAPREDUCE-7329
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7329
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: chaoli
> Priority: Major
> Labels: patch, pull-request-available
> Attachments: 0001-MAPREDUCE-7329-HadoopPipes-task-may-fail-when-linux-.patch, image-2021-03-15-14-29-49-475.png, image-2021-03-15-14-37-32-184.png
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Recently, we upgraded the Linux kernel from 3.x to 4.x and found that Hadoop Pipes tasks exit with a connect timeout in the ping check, which is implemented by PingThread in HadoopPipes.cc.
>
> !image-2021-03-15-14-37-32-184.png!
>
> After deeper investigation, we found that the current ping server never accepts the sockets created by the ping client, which is a hidden danger:
> # The TCP accept queue fills up (*default 50*).
> # When the client closes its socket, the server side never calls close, which leaves many socket fds in CLOSE_WAIT (*default 2h*), and the accept queue is never cleared.
> # Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the packet directly, so the ping client's connect times out. On a 3.x Linux kernel, when the accept queue is full the client can still create half-open connections until the SYN queue is full (*default 2048*), so from the client side ping keeps working until the SYN queue fills up. After 3 hours, the task also exits with a connect timeout exception.
>
> To fix this problem, we introduced a *PingSocketCleaner* thread, which continuously accepts ping socket connections from the ping client. When the client closes a socket, the cleaner thread detects it through an inputStream read and will then finally close the socket from the server side.
[jira] [Updated] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chaoli updated MAPREDUCE-7329:

Attachment: 0001-MAPREDUCE-7329-HadoopPipes-task-may-fail-when-linux-.patch

> HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
>
> Key: MAPREDUCE-7329
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7329
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: chaoli
> Priority: Major
> Labels: patch
> Attachments: 0001-MAPREDUCE-7329-HadoopPipes-task-may-fail-when-linux-.patch, image-2021-03-15-14-29-49-475.png, image-2021-03-15-14-37-32-184.png
>
> Recently, we upgraded the Linux kernel from 3.x to 4.x and found that Hadoop Pipes tasks exit with a connect timeout in the ping check, which is implemented by PingThread in HadoopPipes.cc.
>
> !image-2021-03-15-14-37-32-184.png!
>
> After deeper investigation, we found that the current ping server never accepts the sockets created by the ping client, which is a hidden danger:
> # The TCP accept queue fills up (*default 50*).
> # When the client closes its socket, the server side never calls close, which leaves many socket fds in CLOSE_WAIT (*default 2h*), and the accept queue is never cleared.
> # Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the packet directly, so the ping client's connect times out. On a 3.x Linux kernel, when the accept queue is full the client can still create half-open connections until the SYN queue is full (*default 2048*), so from the client side ping keeps working until the SYN queue fills up. After 3 hours, the task also exits with a connect timeout exception.
>
> To fix this problem, we introduced a *PingSocketCleaner* thread, which continuously accepts ping socket connections from the ping client. When the client closes a socket, the cleaner thread detects it through an inputStream read and will then finally close the socket from the server side.
[jira] [Created] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
chaoli created MAPREDUCE-7329:

Summary: HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
Key: MAPREDUCE-7329
URL: https://issues.apache.org/jira/browse/MAPREDUCE-7329
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.6.0
Reporter: chaoli
Attachments: image-2021-03-15-14-29-49-475.png, image-2021-03-15-14-37-32-184.png

Recently, we upgraded the Linux kernel from 3.x to 4.x and found that Hadoop Pipes tasks exit with a connect timeout in the ping check, which is implemented by PingThread in HadoopPipes.cc.

!image-2021-03-15-14-37-32-184.png!

After deeper investigation, we found that the current ping server never accepts the sockets created by the ping client, which is a hidden danger:

# The TCP accept queue fills up (*default 50*).
# When the client closes its socket, the server side never calls close, which leaves many socket fds in CLOSE_WAIT (*default 2h*), and the accept queue is never cleared.
# Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the packet directly, so the ping client's connect times out. On a 3.x Linux kernel, when the accept queue is full the client can still create half-open connections until the SYN queue is full (*default 2048*), so from the client side ping keeps working until the SYN queue fills up. After 3 hours, the task also exits with a connect timeout exception.

To fix this problem, we introduced a *PingSocketCleaner* thread, which continuously accepts ping socket connections from the ping client. When the client closes a socket, the cleaner thread detects it through an inputStream read and will then finally close the socket from the server side.
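
The cleaner idea from the issue description can be sketched in a few lines. This is a minimal Python illustration of the technique only, not the code from the attached patch: the function name `run_ping_socket_cleaner`, the `on_closed` callback, and the 1024-byte read size are assumptions made for the example. It accepts each ping connection so that it leaves the accept queue, reads until the client closes its end, and then closes the server-side socket so that it does not stay in CLOSE_WAIT.

```python
import socket
import threading


def run_ping_socket_cleaner(server_sock, on_closed=None):
    """Accept ping connections one at a time and close each after the client hangs up.

    `on_closed` is a hypothetical hook for this sketch; it is called after
    each server-side close so callers can observe progress.
    """
    while True:
        try:
            # accept() removes the connection from the listen (accept) queue.
            conn, _addr = server_sock.accept()
        except OSError:
            return  # listening socket was closed: stop the cleaner
        with conn:  # guarantees the server-side socket is closed on exit
            try:
                # recv() returns b"" once the client has closed its end.
                while conn.recv(1024):
                    pass
            except OSError:
                pass  # treat a connection reset like a normal close
        if on_closed is not None:
            on_closed()


if __name__ == "__main__":
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(50)  # backlog of 50, the default quoted in the issue
    threading.Thread(
        target=run_ping_socket_cleaner, args=(server,), daemon=True
    ).start()
    # One "ping": connect, then close the client socket right away.
    socket.create_connection(server.getsockname(), timeout=5).close()
```

Running the loop in a daemon thread keeps it from blocking process exit. Handling connections one at a time is enough here because each ping connection is short-lived; a server with long-lived clients would need a different design.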