[ 
https://issues.apache.org/jira/browse/TEZ-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638118#comment-14638118
 ] 

TezQA commented on TEZ-2630:
----------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12746705/TEZ-2630.2.patch
  against master revision 19fb440.

    {color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
                        Please justify why no new tests are needed for this 
patch.
                        Also please list what manual steps were performed to 
verify this patch.

    {color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/910//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/910//console

This message is automatically generated.

> TezChild receives IP address instead of FQDN 
> ---------------------------------------------
>
>                 Key: TEZ-2630
>                 URL: https://issues.apache.org/jira/browse/TEZ-2630
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Rajat Jain
>            Assignee: Hitesh Shah
>            Priority: Critical
>         Attachments: TEZ-2630.2.patch, TEZ-2630.patch
>
>
> I am running a YARN cluster on AWS. The slave nodes (NMs) are all configured 
> to listen on their private DNS names. For example, a sample node manager 
> listens on ip-10-16-141-168.ec2.internal:8042.
> When I try to run a Tez job (even a simple one like select count(*) from 
> nation), it fails because the child tasks are unable to connect to the AM. 
> The issue is that they try to connect to the IP address instead of the 
> private DNS name. Here are some sample log lines (a couple of them were 
> added by me for debugging):
> {code}
> 2015-07-21 17:08:21,919 INFO [main] task.TezChild: TezChild starting
> 2015-07-21 17:08:22,310 INFO [main] task.TezChild: Using socket factory 
> class: org.apache.hadoop.net.StandardSocketFactory
> 2015-07-21 17:08:22,336 INFO [main] task.TezChild: PID, containerIdentifier:  
> 3699, container_1437498369268_0001_01_000002
> 2015-07-21 17:08:22,418 INFO [main] Configuration.deprecation: 
> fs.default.name is deprecated. Instead, use fs.defaultFS
> 2015-07-21 17:08:23,025 INFO [main] task.TezChild: Got host:port: 
> 10.16.141.168:37949
> 2015-07-21 17:08:23,035 INFO [main] task.TezChild: address variables: 
> 10.16.141.168:37949
> 2015-07-21 17:08:23,143 INFO [TezChild] task.ContainerReporter: Attempting to 
> fetch new task
> 2015-07-21 17:08:24,201 INFO [TezChild] ipc.Client: Retrying connect to 
> server: 10.16.141.168/10.16.141.168:37949. Already tried 0 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 
> MILLISECONDS)
> 2015-07-21 17:08:25,202 INFO [TezChild] ipc.Client: Retrying connect to 
> server: 10.16.141.168/10.16.141.168:37949. Already tried 1 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 
> MILLISECONDS)
> 2015-07-21 17:08:26,757 INFO [TezChild] ipc.Client: Retrying connect to 
> server: 10.16.141.168/10.16.141.168:37949. Already tried 2 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 
> MILLISECONDS)
> 2015-07-21 17:08:27,758 INFO [TezChild] ipc.Client: Retrying connect to 
> server: 10.16.141.168/10.16.141.168:37949. Already tried 3 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 
> MILLISECONDS)
> {code}
> The AM is listening at the right address, but TezChild receives the IP 
> address instead of the private DNS name.
> AM logs:
> {code}
> 2015-07-21 18:09:27,906 INFO 
> [ServiceThread:org.apache.tez.dag.app.TaskAttemptListenerImpTezDag] 
> app.TaskAttemptListenerImpTezDag: Listening at address: 
> ip-10-234-2-80.ec2.internal:49967
> {code}
> TezChild logs:
> {code}
> 2015-07-21 18:09:35,353 INFO [main] task.TezChild: TezChild starting
> 2015-07-21 18:09:35,379 INFO [main] task.TezChild: Args: 
> 10.234.2.80,49967,container_1437501941642_0001_01_000002,application_1437501941642_0001,1
> 2015-07-21 18:09:35,770 INFO [main] task.TezChild: Using socket factory 
> class: org.apache.hadoop.net.StandardSocketFactory
> 2015-07-21 18:09:35,785 INFO [main] task.TezChild: PID, containerIdentifier:  
> 8670, container_1437501941642_0001_01_000002
> 2015-07-21 18:09:35,864 INFO [main] Configuration.deprecation: 
> fs.default.name is deprecated. Instead, use fs.defaultFS
> 2015-07-21 18:09:36,403 INFO [main] task.TezChild: Got host:port: 
> 10.234.2.80:49967
> 2015-07-21 18:09:36,413 INFO [main] task.TezChild: address variables: 
> 10.234.2.80:49967
> {code}
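
For readers following the logs above, here is a minimal Java sketch of how one socket address can render either as a host name or as an IP literal, depending on which accessor is used. It is not the Tez code path and does not claim to show where TezChild's arguments are built. The class name AmAddressSketch and its helper methods are hypothetical. The host name, IP address, and port are copied from the AM and TezChild logs in the description. The address is constructed with InetAddress.getByAddress(String, byte[]), which does not consult a name service, so the sketch runs without DNS.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical illustration only; none of these names come from Tez.
public class AmAddressSketch {

    // Renders the numeric form, e.g. what the TezChild "Args:" line shows.
    static String asIpPort(InetSocketAddress addr) {
        return addr.getAddress().getHostAddress() + ":" + addr.getPort();
    }

    // Renders the host-name form, e.g. what the AM "Listening at address" line shows.
    static String asHostPort(InetSocketAddress addr) {
        return addr.getHostName() + ":" + addr.getPort();
    }

    // Builds the AM address from the logs with a fixed name and IP (no DNS lookup).
    static InetSocketAddress sampleAmAddress() {
        byte[] ip = {10, (byte) 234, 2, 80};
        try {
            InetAddress host =
                InetAddress.getByAddress("ip-10-234-2-80.ec2.internal", ip);
            return new InetSocketAddress(host, 49967);
        } catch (UnknownHostException e) {
            // Only thrown for an IP array of illegal length, which cannot happen here.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        InetSocketAddress am = sampleAmAddress();
        System.out.println("host form: " + asHostPort(am));
        System.out.println("ip form:   " + asIpPort(am));
    }
}
```

In an environment like the one described, where the server side is reachable only through the private DNS name, passing the numeric form to a client would be consistent with the connection retries shown in the TezChild log.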



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
