[jira] [Work logged] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version upgrade from 3.x to 4.x

2021-03-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7329?focusedWorklogId=566690=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566690
 ]

ASF GitHub Bot logged work on MAPREDUCE-7329:
-

Author: ASF GitHub Bot
Created on: 16/Mar/21 03:01
Start Date: 16/Mar/21 03:01
Worklog Time Spent: 10m 
  Work Description: liuml07 commented on pull request #2775:
URL: https://github.com/apache/hadoop/pull/2775#issuecomment-799906569


   I did not touch this pipe class or package. I can take a look later this 
week, but @aajisaka and @wangdatan will have more context. Thanks,



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 566690)
Time Spent: 50m  (was: 40m)

> HadoopPipes task may fail when linux kernel version upgrade from 3.x to 4.x
> ---
>
> Key: MAPREDUCE-7329
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7329
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: chaoli
>Priority: Major
>  Labels: patch, pull-request-available
> Fix For: 2.6.0, 3.0.0
>
> Attachments: 
> 0001-MAPREDUCE-7329-HadoopPipes-task-may-fail-when-linux-.patch, 
> image-2021-03-15-14-29-49-475.png, image-2021-03-15-14-37-32-184.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {color:#FF}*The Hadoop Pipes ping implementation has a bug*{color}. We recently
> upgraded our Linux kernel from 3.x to 4.x and found that Hadoop Pipes tasks
> exit with a connect timeout from the ping check implemented by PingThread in
> HadoopPipes.cc.
> !image-2021-03-15-14-37-32-184.png!
> After investigating, we found that the current ping server never accepts the
> sockets created by the ping client, which causes several critical problems:
>  * The TCP accept queue fills up (default 50).
>  * When the client closes its socket, the server never calls close on its
> side, which leaves many socket fds in CLOSE_WAIT (default 2h), and the accept
> queue is never cleared.
>  * Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the
> incoming packets directly, so the ping client's connect times out. On a 3.x
> Linux kernel, a client can still establish half-open connections when the
> accept queue is full, until the SYN queue is also full (default 2048), so from
> the client side the ping keeps working until then. After 3 hours, the task
> also exits with a connect timeout exception.
> To fix this bug, we introduced a PingSocketCleaner thread that continuously
> accepts ping connections from the ping client. When the client closes its
> socket, the cleaner thread detects the closed connection while reading from
> the inputStream and then closes the socket on the server side.
> Referenced Linux kernel patch:
> [https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ea8ea2cb7]
>
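
The cleaner-thread idea described above can be sketched in Java roughly as
follows. This is only an illustration under stated assumptions, not the code
from the attached patch: the class name PingSocketCleanerSketch, the
cleanedCount() accessor and the runDemo() helper are hypothetical, and the
sketch handles one ping connection at a time on the assumption that a ping
client connects and closes almost immediately.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a "ping socket cleaner"; not the class from the patch.
class PingSocketCleanerSketch extends Thread {
    private final ServerSocket server;
    private final AtomicInteger cleaned = new AtomicInteger();

    PingSocketCleanerSketch(ServerSocket server) {
        super("ping-socket-cleaner");
        this.server = server;
        setDaemon(true);
    }

    int cleanedCount() {
        return cleaned.get();
    }

    @Override
    public void run() {
        while (!server.isClosed()) {
            // accept() takes the connection out of the kernel accept queue.
            try (Socket client = server.accept();
                 InputStream in = client.getInputStream()) {
                // read() returns -1 once the peer has closed its end.
                while (in.read() != -1) {
                    // Ping connections carry no payload; discard stray bytes.
                }
                cleaned.incrementAndGet();
            } catch (IOException e) {
                // Either the server socket was closed or one connection
                // failed; the loop condition decides whether to continue.
            }
            // try-with-resources has closed the server-side socket here,
            // so it does not linger in CLOSE_WAIT.
        }
    }

    // Demo helper (hypothetical): opens and closes `pings` client sockets and
    // returns how many connections the cleaner accepted and closed.
    static int runDemo(int pings) {
        try (ServerSocket server =
                 new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            PingSocketCleanerSketch cleaner = new PingSocketCleanerSketch(server);
            cleaner.start();
            for (int i = 0; i < pings; i++) {
                // Connect and close right away, as a ping client would.
                new Socket(server.getInetAddress(), server.getLocalPort()).close();
            }
            long deadline = System.currentTimeMillis() + 5000;
            while (cleaner.cleanedCount() < pings
                    && System.currentTimeMillis() < deadline) {
                Thread.sleep(10);
            }
            return cleaner.cleanedCount();
        } catch (IOException | InterruptedException e) {
            return -1;
        }
    }
}
```

Without such a loop, every ping connection would stay in the accept queue of
the listening socket; with it, each connection is accepted and closed on the
server side as soon as the client closes its end.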



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Work logged] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version upgrade from 3.x to 4.x

2021-03-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7329?focusedWorklogId=54=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-54
 ]

ASF GitHub Bot logged work on MAPREDUCE-7329:
-

Author: ASF GitHub Bot
Created on: 16/Mar/21 01:14
Start Date: 16/Mar/21 01:14
Worklog Time Spent: 10m 
  Work Description: lichaojacobs edited a comment on pull request #2775:
URL: https://github.com/apache/hadoop/pull/2775#issuecomment-799868579


   @liuml07 @steveloughran  Could you please help review the patch?  It aims to 
fix Hadoop Pipes bug. Thanks.





Issue Time Tracking
---

Worklog Id: (was: 54)
Time Spent: 40m  (was: 0.5h)




[jira] [Work logged] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version upgrade from 3.x to 4.x

2021-03-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7329?focusedWorklogId=53=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-53
 ]

ASF GitHub Bot logged work on MAPREDUCE-7329:
-

Author: ASF GitHub Bot
Created on: 16/Mar/21 01:14
Start Date: 16/Mar/21 01:14
Worklog Time Spent: 10m 
  Work Description: lichaojacobs commented on pull request #2775:
URL: https://github.com/apache/hadoop/pull/2775#issuecomment-799868579


   @liuml07  Could you please help review the patch?  It aims to fix Hadoop 
Pipes bug. Thanks.





Issue Time Tracking
---

Worklog Id: (was: 53)
Time Spent: 0.5h  (was: 20m)




[jira] [Updated] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version upgrade from 3.x to 4.x

2021-03-15 Thread chaoli (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chaoli updated MAPREDUCE-7329:
--
Description: 
{color:#FF}*The Hadoop Pipes ping implementation has a bug*{color}. We recently
upgraded our Linux kernel from 3.x to 4.x and found that Hadoop Pipes tasks exit
with a connect timeout from the ping check implemented by PingThread in
HadoopPipes.cc.

!image-2021-03-15-14-37-32-184.png!

After investigating, we found that the current ping server never accepts the
sockets created by the ping client, which causes several critical problems:
 * The TCP accept queue fills up (default 50).
 * When the client closes its socket, the server never calls close on its side,
which leaves many socket fds in CLOSE_WAIT (default 2h), and the accept queue is
never cleared.
 * Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the
incoming packets directly, so the ping client's connect times out. On a 3.x
Linux kernel, a client can still establish half-open connections when the accept
queue is full, until the SYN queue is also full (default 2048), so from the
client side the ping keeps working until then. After 3 hours, the task also
exits with a connect timeout exception.

To fix this bug, we introduced a PingSocketCleaner thread that continuously
accepts ping connections from the ping client. When the client closes its
socket, the cleaner thread detects the closed connection while reading from the
inputStream and then closes the socket on the server side.

Referenced Linux kernel patch:
[https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ea8ea2cb7]

 

  was:
We recently upgraded our Linux kernel from 3.x to 4.x and found that Hadoop
Pipes tasks exit with a connect timeout from the ping check implemented by
PingThread in HadoopPipes.cc.

!image-2021-03-15-14-37-32-184.png!
 After investigating, we found that the current ping server never accepts the
sockets created by the ping client, which is a hidden danger:
 # The TCP accept queue fills up (*default 50*).
 # When the client closes its socket, the server never calls close on its side,
which leaves many socket fds in CLOSE_WAIT (*default 2h*), and the accept queue
is never cleared.
 # Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the
incoming packets directly, so the ping client's connect times out. On a 3.x
Linux kernel, a client can still establish half-open connections when the accept
queue is full, until the SYN queue is also full (*default 2048*), so from the
client side the ping keeps working until then. After 3 hours, the task also
exits with a connect timeout exception.

To fix this problem, we introduced a *PingSocketCleaner* thread that
continuously accepts ping connections from the ping client. When the client
closes its socket, the cleaner thread detects the closed connection while
reading from the inputStream and then closes the socket on the server side.

Referenced Linux kernel patch:
[https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ea8ea2cb7]

 



[jira] [Updated] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version upgrade from 3.x to 4.x

2021-03-15 Thread chaoli (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chaoli updated MAPREDUCE-7329:
--
Fix Version/s: 2.6.0
   3.0.0




[jira] [Updated] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version upgrade from 3.x to 4.x

2021-03-15 Thread chaoli (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chaoli updated MAPREDUCE-7329:
--
Summary: HadoopPipes task may fail when linux kernel version upgrade from 
3.x to 4.x  (was: HadoopPipes task may fail when linux kernel version change 
from 3.x to 4.x)




[jira] [Commented] (MAPREDUCE-7322) revisiting TestMRIntermediateDataEncryption

2021-03-15 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302092#comment-17302092
 ] 

Ahmed Hussein commented on MAPREDUCE-7322:
--

Thanks [~Jim_Brennan] for the review and merging the patch.
To port this into branch-3.2 and branch-2.10, we will first need to port
MAPREDUCE-7320, because this patch depends on its {{GenericTestUtils}} changes.
I can provide MAPREDUCE-7320 patches for both branch-3.2 and branch-2.10 if
it does not apply cleanly.

> revisiting TestMRIntermediateDataEncryption 
> 
>
> Key: MAPREDUCE-7322
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7322
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security, test
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: patch-available
> Fix For: 3.4.0, 3.3.1
>
> Attachments: MAPREDUCE-7322.001.patch, MAPREDUCE-7322.002.patch, 
> MAPREDUCE-7322.003.patch, MAPREDUCE-7322.004.patch, MAPREDUCE-7322.005.patch, 
> MAPREDUCE-7322.006.patch, MAPREDUCE-7322.007.patch, MAPREDUCE-7322.008.patch, 
> MAPREDUCE-7322.009.patch
>
>
> I was reviewing {{TestMRIntermediateDataEncryption}}. The unit test actually
> has little to do with encryption.
> I reached the following conclusions:
> * Enabling/Disabling {{MRJobConfig.MR_ENCRYPTED_INTERMEDIATE_DATA}} does not 
> change the behavior of the unit test.
> * There are no spill files generated by either mappers/reducers
> * Wrapping I/O streams with Crypto never happens during the execution of the 
> unit test.
> Unless I misunderstand the purpose of that unit test, I suggest that it gets 
> re-implemented so that it validates encryption in spilled intermediate data.






[jira] [Commented] (MAPREDUCE-7322) revisiting TestMRIntermediateDataEncryption

2021-03-15 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302010#comment-17302010
 ] 

Jim Brennan commented on MAPREDUCE-7322:


Thanks for the contribution [~ahussein]!  I have committed this to trunk and 
branch-3.3, but the patch does not apply to branch-3.2.    Can you please 
provide a patch for 3.2 and any earlier branches you want this pulled back to?

 




[jira] [Updated] (MAPREDUCE-7322) revisiting TestMRIntermediateDataEncryption

2021-03-15 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7322:
---
Fix Version/s: 3.3.1
   3.4.0




[jira] [Commented] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x

2021-03-15 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301480#comment-17301480
 ] 

Hadoop QA commented on MAPREDUCE-7329:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 36s{color} |  | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  0s{color} |  | {color:green} No case conflicting files found. {color} |
| {color:blue}0{color} | {color:blue} codespell {color} | {color:blue}  0m  0s{color} |  | {color:blue} codespell was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} |  | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} |  | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 34m  9s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 44s{color} |  | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 35s{color} |  | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 34s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 39s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 22s{color} |  | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 20s{color} |  | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green}  1m 19s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 50s{color} |  | {color:green} branch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 32s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 34s{color} |  | {color:green} the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 34s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 28s{color} |  | {color:green} the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 28s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} blanks {color} | {color:green}  0m  0s{color} |  | {color:green} The patch has no blanks issues. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 27s{color} | [/results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2775/1/artifact/out/results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt] | {color:orange} hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: The patch generated 1 new + 7 unchanged - 0 fixed = 8 total (was 7) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 32s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 17s{color} |  | {color:green} the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 14s{color} |  | {color:green} the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green}  1m 20s{color} |  | 

[jira] [Work logged] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x

2021-03-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7329?focusedWorklogId=566061=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566061
 ]

ASF GitHub Bot logged work on MAPREDUCE-7329:
-

Author: ASF GitHub Bot
Created on: 15/Mar/21 08:15
Start Date: 15/Mar/21 08:15
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2775:
URL: https://github.com/apache/hadoop/pull/2775#issuecomment-799212430


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 36s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m  9s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 44s |  |  trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   0m 35s |  |  trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   0m 34s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 39s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 22s |  |  trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 20s |  |  trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   1m 19s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  14m 50s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 34s |  |  the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 28s |  |  the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  javac  |   0m 28s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 27s | [/results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2775/1/artifact/out/results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt) |  hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: The patch generated 1 new + 7 unchanged - 0 fixed = 8 total (was 7)  |
   | +1 :green_heart: |  mvnsite  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 17s |  |  the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 14s |  |  the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  14m 28s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   7m 12s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 33s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  81m 16s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2775/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2775 |
   | JIRA Issue | MAPREDUCE-7329 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux bc05a6db4771 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 393b71c6182645f660d87b74eb6e2d745200f5b5 |
   | Default Java | Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
   | Multi-JDK versions | 

[jira] [Updated] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x

2021-03-15 Thread chaoli (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chaoli updated MAPREDUCE-7329:
--
Description: 
Recently we upgraded the Linux kernel from 3.x to 4.x and found that Hadoop 
Pipes tasks exit with a connect timeout, which comes from PingThread in 
HadoopPipes.cc.

!image-2021-03-15-14-37-32-184.png!
 After further investigation, we found that the current ping server never 
accepts the sockets created by the ping client, which has hidden dangers: 
 # The TCP accept queue fills up (*default 50*).
 # When the client closes its socket, the server never calls close on its 
side, which leaves many socket fds in CLOSE_WAIT (*default 2h*), and the 
accept queue is never cleared.
 # Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the 
packet directly, so the ping client's connect times out. On a 3.x Linux 
kernel, when the accept queue is full the client can still complete a half 
connection until the SYN queue is full (*default 2048*), so from the client 
side the ping keeps working until then. After 3 hours, the task also exits 
with a connect timeout exception.

To fix this problem, we introduced a *PingSocketCleaner* thread, which 
continuously accepts ping connections from the ping client. When the client 
closes a socket, the cleaner thread detects the close while reading from the 
inputStream and then closes the socket on the server side.

See the Linux kernel patch: 
[https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ea8ea2cb7]
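
As an illustration only, the cleaner thread described above could look roughly like the following Java sketch. It is not the code from the attached patch: the class name PingSocketCleanerSketch, the closedCount counter, and the one-connection-at-a-time loop are all assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a ping-socket cleaner; not the patch itself. */
final class PingSocketCleanerSketch extends Thread {
  private final ServerSocket serverSocket;
  // Counts connections that were drained and closed (illustration only).
  private final AtomicInteger closedCount = new AtomicInteger();

  PingSocketCleanerSketch(ServerSocket serverSocket) {
    this.serverSocket = serverSocket;
    setDaemon(true);
  }

  int getClosedCount() {
    return closedCount.get();
  }

  @Override
  public void run() {
    while (!serverSocket.isClosed()) {
      // accept() takes the connection out of the kernel accept queue.
      try (Socket client = serverSocket.accept();
           InputStream in = client.getInputStream()) {
        // read() returns -1 once the ping client has closed its end.
        while (in.read() != -1) {
          // Discard any bytes; only the end-of-stream matters here.
        }
      } catch (IOException e) {
        // The peer reset the connection or the listener was closed;
        // the loop condition decides whether to keep going.
        continue;
      }
      // try-with-resources has closed the server-side socket by now,
      // so it does not linger in CLOSE_WAIT.
      closedCount.incrementAndGet();
    }
  }
}
```

The sketch handles one connection at a time, which assumes the ping client opens its connections one after another. A server with concurrent ping clients would need a different design.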

 

  was:
Recently, we upgrade linux kernel version from 3.x to 4.x. And we find hadoop 
pipe task exit with connect timeout which is implemented by PingThread in 
HadoopPipes.cc.

!image-2021-03-15-14-37-32-184.png!
 After a deep research, we finally find that current ping server won't accept 
ping client created socket, which has hidden danger: 
 # it will cause tcp accept queue full(*default 50*)
 # when client close socket, server socket won't call close method, which will 
leave too many CLOSE_WAIT socket fd existed(*default 2h*), and accept queue 
never cleared.
 # Even worse, in 4.x linux kernel version, it will cause tcp drop packet 
directly which makes ping client connect time out. While In 3.x linux kernel 
version, when accept queue full, client can also make half connection till sync 
queue full (*default 2048*), so from client side, ping will aslo work till sync 
queue full. And after 3 hours, task will also exit with connect timeout 
exception.

To fix this problem, we introduced a *PingSocketCleaner* thread, which will 
continuously accept ping socket connect from ping client. When socket close 
from client,  cleaner thread will detecte closed inputStream reading, then  
will finally close socket from sever side.

Refrenced by linux kernel patch: 
[https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ea8ea2cb7]

 


> HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
> --
>
> Key: MAPREDUCE-7329
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7329
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: chaoli
>Priority: Major
>  Labels: patch, pull-request-available
> Attachments: 
> 0001-MAPREDUCE-7329-HadoopPipes-task-may-fail-when-linux-.patch, 
> image-2021-03-15-14-29-49-475.png, image-2021-03-15-14-37-32-184.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>

[jira] [Updated] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x

2021-03-15 Thread chaoli (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chaoli updated MAPREDUCE-7329:
--
Description: 
Recently we upgraded the Linux kernel from 3.x to 4.x and found that Hadoop 
Pipes tasks exit with a connect timeout, which comes from PingThread in 
HadoopPipes.cc.

!image-2021-03-15-14-37-32-184.png!
 After further investigation, we found that the current ping server never 
accepts the sockets created by the ping client, which has hidden dangers: 
 # The TCP accept queue fills up (*default 50*).
 # When the client closes its socket, the server never calls close on its 
side, which leaves many socket fds in CLOSE_WAIT (*default 2h*), and the 
accept queue is never cleared.
 # Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the 
packet directly, so the ping client's connect times out. On a 3.x Linux 
kernel, when the accept queue is full the client can still complete a half 
connection until the SYN queue is full (*default 2048*), so from the client 
side the ping keeps working until then. After 3 hours, the task also exits 
with a connect timeout exception.

To fix this problem, we introduced a *PingSocketCleaner* thread, which 
continuously accepts ping connections from the ping client. When the client 
closes a socket, the cleaner thread detects the close while reading from the 
inputStream and then closes the socket on the server side.

See the Linux kernel patch: 
[https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ea8ea2cb7]

 

  was:
Recently, we upgrade linux kernel version from 3.x to 4.x. And we find hadoop 
pipe task exit with connect timeout which is implemented by PingThread in 
HadoopPipes.cc.

!image-2021-03-15-14-37-32-184.png!
 After a deep research, we finally find that current ping server won't accept 
ping client created socket, which has hidden danger: 
 # it will cause tcp accept queue full(*default 50*)
 # when client close socket, server socket won't call close method, which will 
leave too many CLOSE_WAIT socket fd existed(*default 2h*), and accept queue 
never cleared.
 # Even worse, in 4.x linux kernel version, it will cause tcp drop packet 
directly which makes ping client connect time out. While In 3.x linux kernel 
version, when accept queue full, client can also make half connection till sync 
queue full (*default 2048*), so from client side, ping will aslo work till sync 
queue full. And after 3 hours, task will also exit with connect timeout 
exception.

To fix this problem, we introduced a *PingSocketCleaner* thread, which will 
continuously accept ping socket connect from ping client. When socket close 
from client,  cleaner thread will detected by inputStream read, then  will 
finally close socket from sever side.

Refrenced by linux kernel patch: 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ea8ea2cb7

 


> HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
> --
>
> Key: MAPREDUCE-7329
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7329
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: chaoli
>Priority: Major
>  Labels: patch, pull-request-available
> Attachments: 
> 0001-MAPREDUCE-7329-HadoopPipes-task-may-fail-when-linux-.patch, 
> image-2021-03-15-14-29-49-475.png, image-2021-03-15-14-37-32-184.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>

[jira] [Updated] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x

2021-03-15 Thread chaoli (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chaoli updated MAPREDUCE-7329:
--
Description: 
Recently we upgraded the Linux kernel from 3.x to 4.x and found that Hadoop 
Pipes tasks exit with a connect timeout, which comes from PingThread in 
HadoopPipes.cc.

!image-2021-03-15-14-37-32-184.png!
 After further investigation, we found that the current ping server never 
accepts the sockets created by the ping client, which has hidden dangers: 
 # The TCP accept queue fills up (*default 50*).
 # When the client closes its socket, the server never calls close on its 
side, which leaves many socket fds in CLOSE_WAIT (*default 2h*), and the 
accept queue is never cleared.
 # Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the 
packet directly, so the ping client's connect times out. On a 3.x Linux 
kernel, when the accept queue is full the client can still complete a half 
connection until the SYN queue is full (*default 2048*), so from the client 
side the ping keeps working until then. After 3 hours, the task also exits 
with a connect timeout exception.

To fix this problem, we introduced a *PingSocketCleaner* thread, which 
continuously accepts ping connections from the ping client. When the client 
closes a socket, the cleaner thread detects the close while reading from the 
inputStream and then closes the socket on the server side.

See the Linux kernel patch: 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ea8ea2cb7

 

  was:
Recently, we upgrade linux kernel version from 3.x to 4.x. And we find hadoop 
pipe task exit with connect timeout which is implemented by PingThread in 
HadoopPipes.cc.

!image-2021-03-15-14-37-32-184.png!
After a deep research, we finally find that current ping server won't accept 
ping client created socket, which has hidden danger: 
 # it will cause tcp accept queue full(*default 50*)
 # when client close socket, server socket won't call close method, which will 
leave too many CLOSE_WAIT socket fd existed(*default 2h*), and accept queue 
never cleared.
 # Even worse, in 4.x linux kernel version, it will cause tcp drop packet 
directly which makes ping client connect time out. While In 3.x linux kernel 
version, when accept queue full, client can also make half connection till sync 
queue full (*default 2048*), so from client side, ping will aslo work till sync 
queue full. And after 3 hours, task will also exit with connect timeout 
exception.

To fix this problem, we introduced a *PingSocketCleaner* thread, which will 
continuously accept ping socket connect from ping client. When socket close 
from client,  cleaner thread will detected by inputStream read, then  will 
finally close socket from sever side.

 


> HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
> --
>
> Key: MAPREDUCE-7329
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7329
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: chaoli
>Priority: Major
>  Labels: patch, pull-request-available
> Attachments: 
> 0001-MAPREDUCE-7329-HadoopPipes-task-may-fail-when-linux-.patch, 
> image-2021-03-15-14-29-49-475.png, image-2021-03-15-14-37-32-184.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x

2021-03-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7329?focusedWorklogId=566036&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566036
 ]

ASF GitHub Bot logged work on MAPREDUCE-7329:
-

Author: ASF GitHub Bot
Created on: 15/Mar/21 06:53
Start Date: 15/Mar/21 06:53
Worklog Time Spent: 10m 
  Work Description: lichaojacobs opened a new pull request #2775:
URL: https://github.com/apache/hadoop/pull/2775


   jira: https://issues.apache.org/jira/browse/MAPREDUCE-7329
   





Issue Time Tracking
---

Worklog Id: (was: 566036)
Remaining Estimate: 0h
Time Spent: 10m

> HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
> --
>
> Key: MAPREDUCE-7329
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7329
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: chaoli
>Priority: Major
>  Labels: patch
> Attachments: 
> 0001-MAPREDUCE-7329-HadoopPipes-task-may-fail-when-linux-.patch, 
> image-2021-03-15-14-29-49-475.png, image-2021-03-15-14-37-32-184.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x

2021-03-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated MAPREDUCE-7329:
--
Labels: patch pull-request-available  (was: patch)

> HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
> --
>
> Key: MAPREDUCE-7329
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7329
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: chaoli
>Priority: Major
>  Labels: patch, pull-request-available
> Attachments: 
> 0001-MAPREDUCE-7329-HadoopPipes-task-may-fail-when-linux-.patch, 
> image-2021-03-15-14-29-49-475.png, image-2021-03-15-14-37-32-184.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x

2021-03-15 Thread chaoli (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chaoli updated MAPREDUCE-7329:
--
Attachment: 0001-MAPREDUCE-7329-HadoopPipes-task-may-fail-when-linux-.patch

> HadoopPipes task may fail when linux kernel version change from 3.x to 4.x
> --
>
> Key: MAPREDUCE-7329
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7329
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: chaoli
>Priority: Major
>  Labels: patch
> Attachments: 
> 0001-MAPREDUCE-7329-HadoopPipes-task-may-fail-when-linux-.patch, 
> image-2021-03-15-14-29-49-475.png, image-2021-03-15-14-37-32-184.png
>
>






[jira] [Created] (MAPREDUCE-7329) HadoopPipes task may fail when linux kernel version change from 3.x to 4.x

2021-03-15 Thread chaoli (Jira)
chaoli created MAPREDUCE-7329:
-

 Summary: HadoopPipes task may fail when linux kernel version 
change from 3.x to 4.x
 Key: MAPREDUCE-7329
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7329
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: chaoli
 Attachments: image-2021-03-15-14-29-49-475.png, 
image-2021-03-15-14-37-32-184.png

Recently we upgraded the Linux kernel from 3.x to 4.x and found that Hadoop 
Pipes tasks exit with a connect timeout, which comes from PingThread in 
HadoopPipes.cc.

!image-2021-03-15-14-37-32-184.png!
After further investigation, we found that the current ping server never 
accepts the sockets created by the ping client, which has hidden dangers: 
 # The TCP accept queue fills up (*default 50*).
 # When the client closes its socket, the server never calls close on its 
side, which leaves many socket fds in CLOSE_WAIT (*default 2h*), and the 
accept queue is never cleared.
 # Even worse, on a 4.x Linux kernel a full accept queue makes TCP drop the 
packet directly, so the ping client's connect times out. On a 3.x Linux 
kernel, when the accept queue is full the client can still complete a half 
connection until the SYN queue is full (*default 2048*), so from the client 
side the ping keeps working until then. After 3 hours, the task also exits 
with a connect timeout exception.
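
The queue behaviour in the list above can be probed with a small Java experiment: bind a listener, never call accept(), and count how many client connects still succeed. This is only a sketch under stated assumptions. The class AcceptQueueDemo and its method are hypothetical names, and the count it returns depends on the kernel version and its settings, so no particular number should be expected.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical probe for a listener that never accepts. */
final class AcceptQueueDemo {
  /** Returns how many of the attempted connects completed within the timeout. */
  static int countSuccessfulConnects(int backlog, int attempts, int timeoutMillis)
      throws IOException {
    List<Socket> open = new ArrayList<>();
    int succeeded = 0;
    // The listener is bound, but accept() is never called on it.
    try (ServerSocket listener =
             new ServerSocket(0, backlog, InetAddress.getLoopbackAddress())) {
      InetSocketAddress target = new InetSocketAddress(
          InetAddress.getLoopbackAddress(), listener.getLocalPort());
      for (int i = 0; i < attempts; i++) {
        Socket s = new Socket();
        try {
          s.connect(target, timeoutMillis);
          open.add(s); // keep it open so it keeps its slot in the queue
          succeeded++;
        } catch (IOException e) {
          // Typically a SocketTimeoutException once the kernel stops
          // completing handshakes for this listener.
          s.close();
        }
      }
    } finally {
      for (Socket s : open) {
        try {
          s.close();
        } catch (IOException ignored) {
          // Best-effort cleanup.
        }
      }
    }
    return succeeded;
  }
}
```

For example, `AcceptQueueDemo.countSuccessfulConnects(2, 6, 300)` asks for a backlog of 2 and tries six connects with a 300 ms timeout each. Comparing the result on a 3.x and a 4.x kernel is one way to check the difference described in the report.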

To fix this problem, we introduced a *PingSocketCleaner* thread, which 
continuously accepts ping connections from the ping client. When the client 
closes a socket, the cleaner thread detects the close while reading from the 
inputStream and then closes the socket on the server side.

 


