[ 
https://issues.apache.org/jira/browse/TEZ-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098937#comment-17098937
 ] 

László Bodor edited comment on TEZ-4157 at 5/5/20, 1:25 PM:
------------------------------------------------------------

 [^TEZ-4157.02.patch]  is the first successful refactor to netty4. Most of the 
unit tests pass, except testKeepAlive, which has started to drive me crazy, 
but I'll give it another chance.

[~jeagles]: do you have any pointers regarding testKeepAlive? Maybe you're 
familiar with that test case. I'm 99% sure that my netty upgrade is correct in 
[^TEZ-4157.02.patch], and all of the test cases pass except testKeepAlive. In 
testKeepAlive, there are 2 consecutive keepalive connections from the client, 
and the 
[second|https://github.com/apache/tez/blob/master/tez-plugins/tez-aux-services/src/test/java/org/apache/tez/auxservices/TestShuffleHandler.java#L474]
 fails with an invalid http response after my patch.
Could you please clarify the expected behavior of this test case [regarding 
broken 
pipe|https://github.com/apache/tez/blob/master/tez-plugins/tez-aux-services/src/test/java/org/apache/tez/auxservices/TestShuffleHandler.java#L403]?
 I've been playing with this test case for 8-10 hours, but I haven't been able 
to solve it. Basically:
1. if I insert a Thread.sleep(1000) before the second getInputStream, the 
connection is successful, but then it fails because the second socket 
address is not the same, so I think it's not a keepalive connection anymore
2. without the sleep, I get an invalid http response no matter how I change the 
payload from the fake shuffle handler
What exactly is the point of this very [long cycle and big 
payload|https://github.com/apache/tez/blob/master/tez-plugins/tez-aux-services/src/test/java/org/apache/tez/auxservices/TestShuffleHandler.java#L410]?
 Do we expect the buffer fill cycle itself to take longer than the keepalive 
timeout?
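
For reference, one way to check whether two consecutive requests really share a keepalive connection, independent of ShuffleHandler, is to compare the client port that the server sees for each request. Below is a minimal sketch of that idea. It is not the Tez test: it assumes the JDK's built-in com.sun.net.httpserver.HttpServer as a stand-in server, and KeepAliveProbe and remotePortsSeenByServer are hypothetical names used only for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Tez: probes connection reuse against a local stand-in server.
class KeepAliveProbe {

  /** Sends sequential GET requests and returns the client port the server saw for each one. */
  static List<Integer> remotePortsSeenByServer(int requests) throws Exception {
    List<Integer> ports = Collections.synchronizedList(new ArrayList<>());

    // Stand-in server on an ephemeral local port (assumption: replaces the fake shuffle handler).
    HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
    server.createContext("/", exchange -> {
      // Record the remote (client) port of the socket this request arrived on.
      ports.add(exchange.getRemoteAddress().getPort());
      byte[] body = "ok".getBytes(StandardCharsets.UTF_8);
      exchange.sendResponseHeaders(200, body.length);
      try (OutputStream out = exchange.getResponseBody()) {
        out.write(body);
      }
    });
    server.start();

    try {
      URL url = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/").toURL();
      for (int i = 0; i < requests; i++) {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // Drain the body completely so the client can return the socket to its keepalive cache.
        try (InputStream in = conn.getInputStream()) {
          while (in.read() != -1) {
            // discard
          }
        }
      }
    } finally {
      server.stop(0);
    }
    return new ArrayList<>(ports);
  }

  public static void main(String[] args) throws Exception {
    List<Integer> ports = remotePortsSeenByServer(2);
    System.out.println("client ports seen by server: " + ports);
    System.out.println("connection reused: " + ports.get(0).equals(ports.get(1)));
  }
}
```

If both ports match, the second request went over the same socket. HttpURLConnection typically reuses a connection only after the previous response body has been read to the end, which is why the loop drains the stream before issuing the next request.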

cc: [~rizhang]


was (Author: abstractdog):
 [^TEZ-4157.02.patch]  is about the first successful refactor to netty4, most 
of the unit tests pass, except testKeepAlive, which has started to drive me 
crazy, but I'll give another chance to it

> ShuffleHandler: upgrade to netty4
> ---------------------------------
>
>                 Key: TEZ-4157
>                 URL: https://issues.apache.org/jira/browse/TEZ-4157
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>         Attachments: TEZ-4157.01.patch, TEZ-4157.02.patch
>
>
> -In the dependency tree, there are 2 occurrences of compile scope direct 
> netty dependencies, however, they're not used at all. I compiled locally 
> successfully without them. E.g. when investigating blackduck alerts 
> (complaining about netty deps for current 3.10.5.Final), it would be cleaner 
> to start from a dependency tree where Tez doesn't depend on netty directly in 
> order to eliminate its responsibility (and move the focus to underlying 
> hadoop for instance).-
> Tez depends on netty3 almost only in ShuffleHandler and some related classes. 
> We can eliminate netty3 by upgrading it, but this effort might involve some 
> testing due to fundamental [changes from 
> netty3->netty4|https://netty.io/wiki/new-and-noteworthy-in-4.0.html] + we 
> don't have a reference yet, as [hadoop's 
> ShuffleHandler|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java]
>  is still on netty3.
> As per the netty documentation, we can also expect some performance 
> improvement (e.g. Pooled buffers).



