[ https://issues.apache.org/jira/browse/HIVE-15102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16261509#comment-16261509 ]
Ashutosh Chauhan commented on HIVE-15102:
-----------------------------------------
In some recent test runs, a few batches have been timing out (randomly, of
course, and rarely). I looked into the log of one such failure and found that
it contains the following:
{code}
2017-11-18T10:30:01,231 WARN [Fetcher_O {Map_1} #0] orderedgrouped.FetcherOrderedGrouped: Failed to connect to hive-ptest-slaves-aff.c.gcp-hive-upstream.internal:0 with 1 inputs
java.io.IOException: Failed to connect to http://hive-ptest-slaves-aff.c.gcp-hive-upstream.internal:0/mapOutput?job=job_1511029513075_0001&dag=203&reduce=0&map=attempt_1511029513075_0001_203_00_000000_0_11576, #connectionFailures=3
	at org.apache.tez.http.HttpConnection.connect(HttpConnection.java:168) ~[tez-runtime-library-0.9.1-SNAPSHOT.jar:0.9.1-SNAPSHOT]
{code}
This suggests to me that some slaves went away in the middle of test
execution, resulting in those timeouts.
> Hiveptest is killing nodes where IP is reused after previous node termination
> -----------------------------------------------------------------------------
>
> Key: HIVE-15102
> URL: https://issues.apache.org/jira/browse/HIVE-15102
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 2.2.0
> Reporter: Sergio Peña
> Assignee: Sergio Peña
> Attachments: HIVE-15102.1.patch
>
>
> NO PRECOMMIT TESTS
> The Hiveptest framework has a background thread that runs every hour and
> attempts to kill zombie nodes that are no longer being used by the test
> execution.
> These killed nodes are kept in a list of terminated nodes, and the next time
> the background thread runs, it attempts to kill all of those nodes again
> because Hiveptest considers them zombie nodes.
> The problem is that cloud providers can assign the same IP addresses to new
> nodes, so when the background thread runs, it kills those nodes, which may
> still be in use by Hiveptest.
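
The IP-reuse scenario in the description above can be sketched roughly as
follows. This is a toy model only, not the Hiveptest code and not what
HIVE-15102.1.patch does: the class ZombieReaperSketch, its methods, the node
names and the IP address are all hypothetical. It assumes each node has a
unique instance id in addition to its IP, and contrasts a terminated list
keyed by IP with one keyed by instance id.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical toy model of a zombie-node reaper; not the Hiveptest implementation. */
class ZombieReaperSketch {
    /** Simulated cloud: IP address -> instance id of the node currently holding that IP. */
    final Map<String, String> runningByIp = new HashMap<>();
    /** Terminated nodes remembered by IP only (the behaviour the issue describes). */
    final Set<String> terminatedIps = new HashSet<>();
    /** Terminated nodes remembered by instance id (an assumed alternative). */
    final Set<String> terminatedIds = new HashSet<>();

    /** Starts a node; the cloud is free to hand out a previously used IP. */
    void launch(String instanceId, String ip) {
        runningByIp.put(ip, instanceId);
    }

    /** Kills the node at this IP and records it in both terminated lists. */
    void terminate(String ip) {
        String instanceId = runningByIp.remove(ip);
        if (instanceId != null) {
            terminatedIps.add(ip);
            terminatedIds.add(instanceId);
        }
    }

    /** Re-kills every running node whose IP is in the terminated list; returns the kill count. */
    int reapByIp() {
        int before = runningByIp.size();
        runningByIp.keySet().removeAll(terminatedIps);
        return before - runningByIp.size();
    }

    /** Re-kills only running nodes whose instance id was terminated; returns the kill count. */
    int reapById() {
        int before = runningByIp.size();
        runningByIp.values().removeIf(terminatedIds::contains);
        return before - runningByIp.size();
    }

    public static void main(String[] args) {
        ZombieReaperSketch cloud = new ZombieReaperSketch();
        cloud.launch("node-a", "10.0.0.5");
        cloud.terminate("10.0.0.5");          // node-a is reaped as a zombie
        cloud.launch("node-b", "10.0.0.5");   // a new node receives the same IP
        System.out.println("killed by id: " + cloud.reapById()); // prints 0
        System.out.println("killed by IP: " + cloud.reapByIp()); // prints 1
    }
}
```

In this model the IP-keyed pass kills node-b even though it is a live node,
while the id-keyed pass leaves it alone. Whether the real cloud API exposes a
suitable stable identifier is an assumption here.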
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)