[
https://issues.apache.org/jira/browse/HADOOP-11252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15466497#comment-15466497
]
Akira Ajisaka commented on HADOOP-11252:
----------------------------------------
Thanks [~chiwanpark] for the report.
bq. Is there any reason to change visibility? This change breaks *source-level*
compatibility.
Client is annotated as {{@InterfaceStability.Evolving}}, so we can change the
API between minor releases, but not between point releases. I'll file a jira
to fix it.
bq. Giraph is one of broken examples for this changes.
(https://github.com/apache/giraph/blob/release-1.0/giraph-core/src/main/java/org/apache/giraph/job/GiraphJob.java#L202)
You can use the following code instead.
{code}
giraphConfiguration.setInt("ipc.ping.interval", 60000 * 5);
{code}
> RPC client does not time out by default
> ---------------------------------------
>
> Key: HADOOP-11252
> URL: https://issues.apache.org/jira/browse/HADOOP-11252
> Project: Hadoop Common
> Issue Type: Bug
> Components: ipc
> Affects Versions: 2.5.0
> Reporter: Wilfred Spiegelenburg
> Assignee: Masatake Iwasaki
> Priority: Critical
> Fix For: 2.8.0, 2.7.3, 2.6.4, 3.0.0-alpha1
>
> Attachments: HADOOP-11252.002.patch, HADOOP-11252.003.patch,
> HADOOP-11252.004.patch, HADOOP-11252.patch
>
>
> The RPC client has a default timeout set to 0 when no timeout is passed in.
> This means that the network connection created will not time out when used to
> write data. The issue has shown up in YARN-2578 and HDFS-4858. Timeouts for
> writes then fall back to the TCP-level retry (configured via tcp_retries2),
> which takes 15-30 minutes to give up. That is too long for a default
> behaviour.
> Using 0 as the default value for timeout is incorrect. We should use a sane
> value for the timeout and the "ipc.ping.interval" configuration value is a
> logical choice for it. The default behaviour should be changed from 0 to the
> value read for the ping interval from the Configuration.
> Fixing it in common makes more sense than finding and changing all other
> points in the code that do not pass in a timeout.
> Offending code lines:
> https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/RPC.java#L488
> and
> https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/RPC.java#L350
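
A minimal sketch of the fallback proposed in the description above, not the actual patch: when a caller passes no timeout (0), the effective timeout is taken from the "ipc.ping.interval" setting. The class name, the Map-based stand-in for Hadoop's Configuration, and the 60000 ms value used when the key is unset are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class RpcTimeoutSketch {
    // Configuration key named in the issue description.
    static final String PING_INTERVAL_KEY = "ipc.ping.interval";
    // Assumed value (in milliseconds) used when the key is not set; illustrative only.
    static final int ASSUMED_PING_INTERVAL_DEFAULT_MS = 60000;

    /**
     * Returns the caller's timeout if it is positive; otherwise falls back
     * to the configured ping interval instead of "never time out".
     * The Map is a hypothetical stand-in for a Hadoop Configuration.
     */
    static int effectiveTimeout(int requestedTimeoutMs, Map<String, Integer> conf) {
        if (requestedTimeoutMs > 0) {
            return requestedTimeoutMs;
        }
        return conf.getOrDefault(PING_INTERVAL_KEY, ASSUMED_PING_INTERVAL_DEFAULT_MS);
    }

    public static void main(String[] args) {
        Map<String, Integer> conf = new HashMap<>();
        conf.put(PING_INTERVAL_KEY, 60000 * 5);

        // No timeout passed in: the ping interval is used.
        System.out.println(effectiveTimeout(0, conf));
        // Explicit timeout passed in: it is kept as-is.
        System.out.println(effectiveTimeout(1000, conf));
    }
}
```

With this shape, call sites that pass 0 get a bounded timeout without each of them having to change, which is the argument the description makes for fixing it in common.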