[
https://issues.apache.org/jira/browse/HDFS-15865?focusedWorklogId=579895&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-579895
]
ASF GitHub Bot logged work on HDFS-15865:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 09/Apr/21 10:46
Start Date: 09/Apr/21 10:46
Worklog Time Spent: 10m
Work Description: sodonnel commented on a change in pull request #2728:
URL: https://github.com/apache/hadoop/pull/2728#discussion_r610529393
##########
File path:
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
##########
@@ -905,6 +907,14 @@ void waitForAckedSeqno(long seqno) throws IOException {
}
try {
dataQueue.wait(1000); // when we receive an ack, we notify on
+ long duration = Time.monotonicNow() - begin;
+ if (duration > writeTimeout) {
+ LOG.error("No ack received, took {}ms (threshold={}ms). "
+ + "File being written: {}, block: {}, "
+ + "Write pipeline datanodes: {}.",
+ duration, writeTimeout, src, block, nodes);
+ throw new InterruptedIOException("No ack received. ");
Review comment:
I think it would be good to include the time waited, and perhaps the
timeout, in the exception message, e.g.:
```
throw new InterruptedIOException("No ack received after " + duration / 1000
+ "s and a timeout of " + writeTimeout / 1000 + "s");
```
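For context, a minimal self-contained sketch of the same idea outside of
DataStreamer. The class `AckWaitSketch`, its fields, and `ackReceived` are
hypothetical stand-ins, not the real HDFS code, and the timeout is assumed to
be in milliseconds. It reports milliseconds rather than seconds because the
integer division by 1000 truncates, so a sub-second wait would be printed as 0s:

```java
import java.io.InterruptedIOException;
import java.util.LinkedList;

/** Hypothetical stand-in for the ack wait loop; not the real DataStreamer. */
class AckWaitSketch {
  private final LinkedList<Long> dataQueue = new LinkedList<>();
  private final long writeTimeoutMs; // assumption: timeout is in milliseconds
  private long lastAckedSeqno = -1;

  AckWaitSketch(long writeTimeoutMs) {
    this.writeTimeoutMs = writeTimeoutMs;
  }

  /** Called by a (hypothetical) responder thread when an ack arrives. */
  void ackReceived(long seqno) {
    synchronized (dataQueue) {
      lastAckedSeqno = seqno;
      dataQueue.notifyAll();
    }
  }

  /** Blocks until seqno is acked, or fails once the timeout is exceeded. */
  void waitForAckedSeqno(long seqno) throws InterruptedIOException {
    long begin = System.nanoTime();
    synchronized (dataQueue) {
      while (lastAckedSeqno < seqno) {
        long durationMs = (System.nanoTime() - begin) / 1_000_000;
        if (durationMs > writeTimeoutMs) {
          // Put both the time waited and the threshold in the message.
          throw new InterruptedIOException("No ack received after "
              + durationMs + "ms and a timeout of " + writeTimeoutMs + "ms");
        }
        try {
          // Wake at least once a second; never pass 0, which waits forever.
          dataQueue.wait(Math.min(1000L, writeTimeoutMs - durationMs + 1));
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new InterruptedIOException("Interrupted while waiting for ack");
        }
      }
    }
  }
}
```

With this shape, a caller that catches the exception can see how long the
wait lasted and which threshold applied without having to find the matching
log line.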
Issue Time Tracking
-------------------
Worklog Id: (was: 579895)
Time Spent: 1h 10m (was: 1h)
> Interrupt DataStreamer thread
> -----------------------------
>
> Key: HDFS-15865
> URL: https://issues.apache.org/jira/browse/HDFS-15865
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: Karthik Palanisamy
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> We have noticed HiveServer2 hanging in DataStreamer#waitForAckedSeqno.
> I think we have to interrupt the DataStreamer thread if no packet ack is
> received from the datanodes.
> This is likely to happen when there are infrastructure or network issues.
> {code:java}
> "HiveServer2-Background-Pool: Thread-35977576" #35977576 prio=5 os_prio=0 cpu=797.65ms elapsed=3406.28s tid=0x00007fc0c6c29800 nid=0x4198 in Object.wait() [0x00007fc1079f3000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>     at java.lang.Object.wait(java.base@11.0.5/Native Method)
>     - waiting on <no object reference available>
>     at org.apache.hadoop.hdfs.DataStreamer.waitForAckedSeqno(DataStreamer.java:886)
>     - waiting to re-lock in wait() <0x00007fe6eda86ca0> (a java.util.LinkedList){code}