[
https://issues.apache.org/jira/browse/HBASE-1394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707518#action_12707518
]
stack commented on HBASE-1394:
------------------------------
The line we're blocked on is below:
20:21 <St^Ack>     // If queue is full, then wait till we can create enough space
20:21 <St^Ack>     while (!closed && dataQueue.size() + ackQueue.size() > maxPackets) {
20:21 <St^Ack>       try {
20:21 <St^Ack>         dataQueue.wait();
20:21 <St^Ack>       } catch (InterruptedException e) {
20:21 <St^Ack>       }
20:21 <St^Ack>     }
Here is a note on why we're blocked -- it looks like the number of outstanding packets to send is at its maximum:
20:26 <St^Ack>  * The client application writes data that is cached internally by
20:26 <St^Ack>  * this stream. Data is broken up into packets, each packet is
20:26 <St^Ack>  * typically 64K in size. A packet comprises chunks. Each chunk
20:26 <St^Ack>  * is typically 512 bytes and has an associated checksum with it.
20:26 <St^Ack>  *
20:26 <St^Ack>  * When a client application fills up the currentPacket, it is
20:26 <St^Ack>  * enqueued into dataQueue. The DataStreamer thread picks up
20:26 <St^Ack>  * packets from the dataQueue, sends it to the first datanode in
20:26 <St^Ack>  * the pipeline and moves it from the dataQueue to the ackQueue.
20:26 <St^Ack>  * The ResponseProcessor receives acks from the datanodes. When a
20:26 <St^Ack>  * successful ack for a packet is received from all datanodes, the
20:26 <St^Ack>  * ResponseProcessor removes the corresponding packet from the
20:26 <St^Ack>  * ackQueue.
20:26 <St^Ack>  *
20:26 <St^Ack>  * In case of error, all outstanding packets are moved from the
20:26 <St^Ack>  * ackQueue. A new pipeline is set up by eliminating the bad
20:26 <St^Ack>  * datanode from the original pipeline. The DataStreamer now
20:26 <St^Ack>  * starts sending packets from the dataQueue.
Here is the maximum number of outstanding packets:
20:27 <St^Ack> private int maxPackets = 80; // each packet 64K, total 5MB
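To make the blocking concrete, here is a rough, self-contained model of that producer/consumer backpressure. This is a sketch only, not the actual DFSClient code; the names (dataQueue, ackQueue, maxPackets, DataStreamer, ResponseProcessor) just mirror the comment above.
{code}
import java.util.LinkedList;

public class PacketBackpressureSketch {
  // Mirrors the quoted constant: each packet ~64K, so ~5MB can be outstanding.
  private int maxPackets = 80;
  private final LinkedList<byte[]> dataQueue = new LinkedList<byte[]>();
  private final LinkedList<byte[]> ackQueue = new LinkedList<byte[]>();
  private volatile boolean closed = false;

  // Writer side: this is the wait the thread dump shows the IPC handler stuck in.
  public void enqueuePacket(byte[] packet) {
    synchronized (dataQueue) {
      // Block the writer until sent-but-unacked plus queued packets drop below the cap.
      while (!closed && dataQueue.size() + ackQueue.size() > maxPackets) {
        try {
          dataQueue.wait();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
      dataQueue.addLast(packet);
      dataQueue.notifyAll();                 // wake the streamer
    }
  }

  // DataStreamer side (simplified): move a packet from dataQueue to ackQueue.
  public byte[] nextPacketToSend() throws InterruptedException {
    synchronized (dataQueue) {
      while (!closed && dataQueue.isEmpty()) {
        dataQueue.wait();
      }
      byte[] packet = dataQueue.removeFirst();
      ackQueue.addLast(packet);
      return packet;
    }
  }

  // ResponseProcessor side (simplified): a successful ack frees a slot,
  // which is the only thing that unblocks the writer above.
  public void ackReceived() {
    synchronized (dataQueue) {
      if (!ackQueue.isEmpty()) {
        ackQueue.removeFirst();
      }
      dataQueue.notifyAll();                 // writer can make progress again
    }
  }
}
{code}
So if acks stop arriving from the pipeline, the writer sits in that wait indefinitely.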
I looked at the ResponseProcessor and DataStreamer code -- no obvious big
stalls/sleeps.
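The stack in the original report below shows where that wait bites: the handler is parked inside DFSOutputStream.writeChunk while still holding HLog's append lock, so every other handler queues up behind it and the request rate drops to zero. A minimal model of that interaction (not the actual HBase HLog code; the names here are illustrative):
{code}
import java.io.IOException;
import java.io.OutputStream;

public class HLogAppendSketch {
  private final Object updateLock = new Object();   // stands in for HLog's append lock
  private final OutputStream walWriter;             // stands in for the SequenceFile.Writer over DFS

  public HLogAppendSketch(OutputStream walWriter) {
    this.walWriter = walWriter;
  }

  public void append(byte[] edit) throws IOException {
    synchronized (updateLock) {
      // If this write parks inside DFSOutputStream.writeChunk (queues full),
      // the lock stays held and every other handler calling append() blocks here.
      walWriter.write(edit);
    }
  }
}
{code}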
> Uploads sometimes fall to 0 requests/second (Binding up on HLog#append?)
> ------------------------------------------------------------------------
>
> Key: HBASE-1394
> URL: https://issues.apache.org/jira/browse/HBASE-1394
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
>
> Trying to figure out why the rate sometimes goes to zero.
> Studying the regionserver, HLog#append looks like a possible culprit.
> {code}
> "IPC Server handler 7 on 60021" daemon prio=10 tid=0x000000004057dc00
> nid=0x1bc4 in Object.wait() [0x0000000043393000..0x0000000043393b80]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:485)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:2964)
> - locked <0x00007f9e3e449ff0> (a java.util.LinkedList)
> - locked <0x00007f9e3e449e18> (a
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
> at
> org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:150)
> at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:100)
> at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
> - locked <0x00007f9e3e449e18> (a
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
> at
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
> at java.io.DataOutputStream.write(DataOutputStream.java:90)
> - locked <0x00007f9e434e5588> (a
> org.apache.hadoop.fs.FSDataOutputStream)
> at
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1020)
> - locked <0x00007f9e434e55c0> (a
> org.apache.hadoop.io.SequenceFile$Writer)
> at
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:984)
> - locked <0x00007f9e434e55c0> (a
> org.apache.hadoop.io.SequenceFile$Writer)
> at org.apache.hadoop.hbase.regionserver.HLog.doWrite(HLog.java:565)
> at org.apache.hadoop.hbase.regionserver.HLog.append(HLog.java:521)
> - locked <0x00007f9dfa376f70> (a java.lang.Object)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.update(HRegion.java:1777)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.batchUpdate(HRegion.java:1348)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.batchUpdate(HRegion.java:1289)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:1727)
> at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:642)
> at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:911)
> {code}