Hey guys,
I'm back on the Tez user group just because I think my problem is more
focused on Tez -- but lemme know if you think otherwise.
Thanks to Gopal again for getting me to think about the Hadoop classpath
when starting HS2. I changed the classpath for HS2 in the same way I had to
change it for the Hive client. So now I've got much further down the Tez
path via HS2, but not quite all the way.
I get to the stage where the Tez job is ACCEPTED by YARN, but then HS2
runs into connection problems trying to reach the Application Master on
one of the data nodes (I think).
Here's a snippet from the HS2 log now.
{code}
2016-02-18 18:40:12,259 INFO [HiveServer2-Background-Pool: Thread-157]:
tez.TezSessionState (TezSessionState.java:open(180)) - Opening new Tez
Session (id: 5d94006e-8a85-4ed0-a8ce-e9c86d1f3d28, scratch dir: hdfs://
dwrdevnn1.sv2.trulia.com:8020/tmp/hive/dwr/_tez_session_dir/5d94006e-8a85-4ed0-a8ce-e9c86d1f3d28
)
2016-02-18 18:40:12,292 INFO [HiveServer2-Background-Pool: Thread-157]:
client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to
ResourceManager at dwrdevnn1.sv2.trulia.com/172.19.103.136:8032
2016-02-18 18:40:12,293 INFO [HiveServer2-Background-Pool: Thread-157]:
client.TezClient (TezClient.java:start(394)) - Session mode. Starting
session.
2016-02-18 18:40:12,294 INFO [HiveServer2-Background-Pool: Thread-157]:
client.TezClientUtils (TezClientUtils.java:setupTezJarsLocalResources(176))
- Using tez.lib.uris value from configuration: hdfs://
dwrdevnn1.sv2.trulia.com:8020/apps/tez-0.8.2,hdfs://dwrdevnn1.sv2.trulia.com:8020/apps/tez-0.8.2/lib
2016-02-18 18:40:12,354 INFO [HiveServer2-Background-Pool: Thread-157]:
client.TezClient (TezCommonUtils.java:createTezSystemStagingPath(122)) -
Tez system stage directory hdfs://
dwrdevnn1.sv2.trulia.com:8020/tmp/hive/dwr/_tez_session_dir/5d94006e-8a85-4ed0-a8ce-e9c86d1f3d28/.tez/application_1455811467110_0307
doesn't exist and is created
2016-02-18 18:40:12,360 INFO [HiveServer2-Background-Pool: Thread-157]:
Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1087)) -
fs.default.name is deprecated. Instead, use fs.defaultFS
2016-02-18 18:40:12,525 INFO [HiveServer2-Background-Pool: Thread-157]:
impl.YarnClientImpl (YarnClientImpl.java:submitApplication(251)) -
Submitted application application_1455811467110_0307
2016-02-18 18:40:12,528 INFO [HiveServer2-Background-Pool: Thread-157]:
client.TezClient (TezClient.java:start(428)) - The url to track the Tez
Session:
http://dwrdevnn1.sv2.trulia.com:8088/proxy/application_1455811467110_0307/
2016-02-18 18:40:18,122 INFO [HiveServer2-Background-Pool: Thread-157]:
ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to
server: dwrdevdn13.sv2.trulia.com/172.19.79.129:44618. Already tried 0
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2016-02-18 18:40:19,123 INFO [HiveServer2-Background-Pool: Thread-157]:
ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to
server: dwrdevdn13.sv2.trulia.com/172.19.79.129:44618. Already tried 1
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2016-02-18 18:40:20,124 INFO [HiveServer2-Background-Pool: Thread-157]:
ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to
server: dwrdevdn13.sv2.trulia.com/172.19.79.129:44618. Already tried 2
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2016-02-18 18:40:21,125 INFO [HiveServer2-Background-Pool: Thread-157]:
ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to
server: dwrdevdn13.sv2.trulia.com/172.19.79.129:44618. Already tried 3
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2016-02-18 18:40:22,126 INFO [HiveServer2-Background-Pool: Thread-157]:
ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to
server: dwrdevdn13.sv2.trulia.com/172.19.79.129:44618. Already tried 4
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2016-02-18 18:40:23,128 INFO [HiveServer2-Background-Pool: Thread-157]:
ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to
server: dwrdevdn13.sv2.trulia.com/172.19.79.129:44618. Already tried 5
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2016-02-18 18:40:24,129 INFO [HiveServer2-Background-Pool: Thread-157]:
ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to
server: dwrdevdn13.sv2.trulia.com/172.19.79.129:44618. Already tried 6
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2016-02-18 18:40:25,130 INFO [HiveServer2-Background-Pool: Thread-157]:
ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to
server: dwrdevdn13.sv2.trulia.com/172.19.79.129:44618. Already tried 7
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2016-02-18 18:40:26,131 INFO [HiveServer2-Background-Pool: Thread-157]:
ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to
server: dwrdevdn13.sv2.trulia.com/172.19.79.129:44618. Already tried 8
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2016-02-18 18:40:27,132 INFO [HiveServer2-Background-Pool: Thread-157]:
ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to
server: dwrdevdn13.sv2.trulia.com/172.19.79.129:44618. Already tried 9
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS)
2016-02-18 18:40:27,138 INFO [HiveServer2-Background-Pool: Thread-157]:
client.TezClient (TezClient.java:getAppMasterStatus(710)) - Failed to
retrieve AM Status via proxy
com.google.protobuf.ServiceException: java.net.ConnectException: Call From dwrdevnn1.sv2.trulia.com/172.19.103.136 to dwrdevdn13.sv2.trulia.com:44618 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:246)
at com.sun.proxy.$Proxy31.getAMStatus(Unknown Source)
at org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:703)
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:782)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:205)
at org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(TezTask.java:239)
at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:137)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1653)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1412)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
{code}
So my question is: what do you suppose is causing this? I'm pretty darn sure
the classpath is legit now.
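To rule out plain network trouble between the HS2 host and the data node, something like the following could be run on the HS2 host while the application is still running. It's only a sketch: `port_open` is a made-up helper (not a Hadoop or Tez tool), and it assumes bash (for its /dev/tcp pseudo-device) and coreutils `timeout` are available. The AM port is assigned per application, so the host and port would have to come from the current "Retrying connect to server" lines rather than the old ones above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop or Tez): returns 0 if a TCP
# connection to host:port can be opened within 3 seconds, non-zero otherwise.
port_open() {
  local host="$1" port="$2"
  # bash's /dev/tcp pseudo-device opens a TCP socket; timeout bounds the wait.
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Example, using the host and port from the log snippet above:
#   port_open dwrdevdn13.sv2.trulia.com 44618 && echo "AM port reachable"
```

A failure here can't tell "refused" apart from "filtered" or "nothing listening any more", so it only narrows things down a little.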
Cheers,
Stephen.
PS fwiw, this is how I'm setting the classpath.
# needed to get around that jansi class error
export HADOOP_USER_CLASSPATH_FIRST=true
export HADOOP_CLASSPATH=$HIVE_HOME/lib/hive-exec-1.2.1.jar:/home/dwr/jansi-1.11.jar:${TEZ_CONF_DIR}:${TEZ_JARS}/*:${TEZ_JARS}/lib/*
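
Since "pretty darn sure" isn't quite the same as checked, here is a rough sketch of how the entries could be verified on the HS2 host. `missing_classpath_entries` is a made-up helper name, and the check is deliberately simple: it only tests whether each colon-separated entry exists locally, treating a trailing `/*` as "the directory must exist". It says nothing about whether the right jars are inside.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every entry of a colon-separated classpath
# that does not exist on the local filesystem.
# The body runs in a subshell so the IFS and globbing changes stay local.
missing_classpath_entries() (
  IFS=':'
  # Disable globbing so a literal "dir/*" entry is not expanded by the shell.
  set -f
  for entry in $1; do
    if [ -z "$entry" ]; then
      continue
    fi
    # Java reads "dir/*" as "all jars in dir", so check the directory itself.
    path="${entry%/\*}"
    if [ ! -e "$path" ]; then
      echo "$entry"
    fi
  done
)

# Example: list anything in the classpath set above that is missing.
#   missing_classpath_entries "$HADOOP_CLASSPATH"
```

No output means every entry resolved to something. An unset variable such as TEZ_JARS would show up as a missing `/*` or `/lib/*` entry.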