[
https://issues.apache.org/jira/browse/HADOOP-16403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894196#comment-16894196
]
Wei-Chiu Chuang commented on HADOOP-16403:
------------------------------------------
[~LiJinglun]
thanks for the update. It looks like the patch doesn't compile.
{noformat}
[INFO] -------------------------------------------------------------
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR]
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ipc/TestMetricLinkedBlockingQueue.java:[52,24]
error: queueTotal has private access in MetricLinkedBlockingQueue
[ERROR]
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ipc/TestMetricLinkedBlockingQueue.java:[55,30]
error: queueTotal has private access in MetricLinkedBlockingQueue
[ERROR]
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ipc/TestMetricLinkedBlockingQueue.java:[59,30]
error: queueTotal has private access in MetricLinkedBlockingQueue
[INFO] 3 errors
[INFO] -----------------
{noformat}
and checkstyle warnings
{noformat}
./hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java:683:
}: 'method def rcurly' has incorrect indentation level 3, expected level
should be 2. [Indentation]
./hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/SwapQueueManager.java:194:
while (!queueIsReallyEmpty(oldQ)) {:39: Must have at least one statement.
[EmptyBlock]
./hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/protocolPB/RefreshCallQueueProtocolServerSideTranslatorPB.java:41:
VOID_REFRESH_READER_QUEUE_RESPONE = RefreshReaderQueueResponseProto:
'VOID_REFRESH_READER_QUEUE_RESPONE' has incorrect indentation level 2, expected
level should be 6. [Indentation]
./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ipc/TestCallQueueManager.java:172:
private static final Class<? extends BlockingQueue<FakeCall>> queueClass
=:65: Name 'queueClass' must match pattern '^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$'.
[ConstantName]
./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ipc/TestCallQueueManager.java:351:
exceptionClass = SwapQueueManager:7: Name 'exceptionClass' must match
pattern '^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$'. [ConstantName]
./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ipc/TestMetricLinkedBlockingQueue.java:31:/**:
First sentence should end with a period. [JavadocStyle]
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java:453:
"\t[-refreshReaderQueue]\n" +: '"\t[-refreshReaderQueue]\n"' has incorrect
indentation level 4, expected level should be 6. [Indentation]
./hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/TestRefreshCallQueue.java:112:
/**: First sentence should end with a period. [JavadocStyle]
{noformat}
Additionally, and I forgot to mention, please add the description of the new
configuration keys into core-default.xml
(./hadoop-common-project/hadoop-common/src/main/resources/core-default.xml)
You added an undocumented configuration ipc.<port>.readerqueue.impl. I am not
sure how to document this, but I would want to make easier for other folks to
use it without looking at source code. Like what Erik did in HADOOP-16097. Can
be a seperate jira, doesn't need to be included here. I just want to point out.
h1. MetricLinkedBlockingQueue
{code}
private int[] qps;
{code}
suggest to make it a final variable. Like:
{code}
private final int[] qps = new int[OP.values().length];
{code}
{code}
int updateQps(AtomicInteger ai) {
{code}
can be private method.
h1. SwapQueueManager class.
* Rename SwapQueueManager to SwappableQueueManager.
* Please update the javadoc of the class
*
> Start a new statistical rpc queue and make the Reader's pendingConnection
> queue runtime-replaceable
> ---------------------------------------------------------------------------------------------------
>
> Key: HADOOP-16403
> URL: https://issues.apache.org/jira/browse/HADOOP-16403
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Jinglun
> Assignee: Jinglun
> Priority: Major
> Attachments: HADOOP-16403-How_MetricLinkedBlockingQueue_Works.pdf,
> HADOOP-16403.001.patch, HADOOP-16403.002.patch, HADOOP-16403.003.patch,
> MetricLinkedBlockingQueueTest.pdf
>
>
> I have an HA cluster with 2 NameNodes. The NameNode's meta is quite big so
> after the active dead, it takes the standby more than 40s to become active.
> Many requests(tcp connect request and rpc request) from Datanodes, clients
> and zkfc timed out and start retrying. The suddenly request flood lasts for
> the next 2 minutes and finally all requests are either handled or run out of
> retry times.
> Adjusting the rpc related settings might power the NameNode and solve this
> problem and the key point is finding the bottle neck. The rpc server can be
> described as below:
> {noformat}
> Listener -> Readers' queues -> Readers -> callQueue -> Handlers{noformat}
> By sampling some failed clients, I find many of them got
> ConnectTimeoutException. It's caused by a 20s un-responded tcp connect
> request. I think may be the reader queue is full and block the listener from
> handling new connections. Both slow handlers and slow readers can block the
> whole processing progress, and I need to know who it is. I think *a queue
> that computes the qps, write log when the queue is full and could be replaced
> easily* will help.
> I find the nice work HADOOP-10302 implementing a runtime-swapped queue.
> Using it at Reader's queue makes the reader queue runtime-swapped
> automatically. The qps computing job could be done by implementing a subclass
> of LinkedBlockQueue that does the computing job while put/take/... happens.
> The qps data will show on jmx.
>
>
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]