[jira] [Commented] (HDFS-14660) [SBN Read] ObserverNameNode should throw StandbyException for requests not from ObserverProxyProvider

2019-07-19 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888931#comment-16888931
 ] 

Erik Krogen commented on HDFS-14660:


{quote}
In this case we just need to check whether the stateId is set or not (via 
RpcHeaderProtos#hasStateId which checks a special field).
{quote}
+1 to this [~csun], this seems like the cleanest way to check if it has been 
set. I forgot that protobuf provides this for us.
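For reference, a minimal sketch of that check (illustrative only; the actual 
wiring into the observer's RPC handling is omitted, and the exception message 
is made up):
{code:java}
import org.apache.hadoop.ipc.StandbyException;
import org.apache.hadoop.ipc.protobuf.RpcHeaderProtos.RpcRequestHeaderProto;

/**
 * Illustrative only: reject requests on an observer when the client never set
 * the alignment stateId, i.e. it is not using ObserverReadProxyProvider.
 */
class ObserverStateIdCheckSketch {
  static void checkClientSentStateId(RpcRequestHeaderProto header)
      throws StandbyException {
    // Protobuf tracks field presence, so hasStateId() distinguishes
    // "never set" from "set to 0".
    if (!header.hasStateId()) {
      throw new StandbyException(
          "Request without a stateId reached an observer node; "
              + "please retry against the active NameNode.");
    }
  }
}
{code}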

> [SBN Read] ObserverNameNode should throw StandbyException for requests not 
> from ObserverProxyProvider
> -
>
> Key: HDFS-14660
> URL: https://issues.apache.org/jira/browse/HDFS-14660
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
>
> In an HDFS HA cluster with consistent reads enabled (HDFS-12943), clients 
> could be using {{ObserverReadProxyProvider}}, {{ConfiguredProxyProvider}}, or 
> something else. Since an observer is just a special type of SBN and we allow 
> transitions between them, a client NOT using {{ObserverReadProxyProvider}} 
> will need {{dfs.ha.namenodes.}} to include all NameNodes in the cluster, and 
> therefore it may send requests to an observer node.
> For this case, we should check whether the {{stateId}} in the incoming RPC 
> header is set or not, and throw a {{StandbyException}} when it is not. 






[jira] [Commented] (HDFS-14657) Refine NameSystem lock usage during processing FBR

2019-07-19 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888936#comment-16888936
 ] 

Erik Krogen commented on HDFS-14657:


This seems like a nice middle ground between the current behavior and 
HDFS-11313, which is a larger development effort. In one of our environments, 
we ended up having to make other unpleasant hacks to get around this issue in 
lieu of HDFS-11313 being completed.

I haven't yet thought deeply about the patch, but one thing stood out to me so 
far. Within {{BlockManager#processIncrementalBlockReport}}, previously we just 
confirmed that the lock was held; now we release the lock and re-acquire it 
(twice). IIRC, there is currently behavior within the IBR processing logic to 
batch many IBRs within the same write lock acquisition, to decrease the 
overhead of locking on each IBR. So before, we had something like one lock 
acquisition per 1000 IBRs; now we have 2 lock acquisitions per IBR. I'm 
wondering whether this will introduce undesirable overhead.
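To make the concern concrete, here is a rough sketch (not the actual 
BlockManager code) of the amortization pattern being referred to: many queued 
IBRs are drained under a single write-lock acquisition, so adding a release 
and re-acquire inside the per-IBR path multiplies the number of lock 
round-trips.
{code:java}
import java.util.Queue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative only: batch many IBRs under one write-lock acquisition. */
class IbrBatchingSketch {
  private final ReentrantReadWriteLock nsLock = new ReentrantReadWriteLock(true);

  void processBatch(Queue<Runnable> queuedIbrs, int maxPerAcquisition) {
    nsLock.writeLock().lock();        // one acquisition for the whole batch
    try {
      int processed = 0;
      Runnable ibr;
      while (processed < maxPerAcquisition && (ibr = queuedIbrs.poll()) != null) {
        ibr.run();                    // stand-in for processing one IBR
        processed++;
      }
    } finally {
      nsLock.writeLock().unlock();
    }
  }
}
{code}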

> Refine NameSystem lock usage during processing FBR
> --
>
> Key: HDFS-14657
> URL: https://issues.apache.org/jira/browse/HDFS-14657
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Chen Zhang
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14657-001.patch
>
>
> Disks with 12 TB capacity are very common today, which means the FBR size is 
> much larger than before. The NameNode holds the NameSystem lock while 
> processing the block report for each storage, which can take quite a long 
> time.
> In our production environment, processing a large FBR usually causes longer 
> RPC queue times, which impacts client latency, so we did some simple work on 
> refining the lock usage, which improved the p99 latency significantly.
> In our solution, the BlockManager releases the NameSystem write lock and 
> requests it again every 5000 blocks (by default) while processing an FBR; 
> with the fair lock, all pending RPC requests can be processed before the 
> BlockManager re-acquires the write lock.






[jira] [Commented] (HDFS-14509) DN throws InvalidToken due to inequality of password when upgrade NN 2.x to 3.x

2019-07-19 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888943#comment-16888943
 ] 

Erik Krogen commented on HDFS-14509:


[~John Smith] and [~brahmareddy], thanks for looking into this. Is this 
targeted as a fix for the 2.x line which will make it possible to perform a 
rolling upgrade to 3.x? If so, I would like to mark it as a blocker for a 2.10 
release, as one of the goals of that release is to ensure a proper rolling 
upgrade path from 2.x to 3.x.

> DN throws InvalidToken due to inequality of password when upgrade NN 2.x to 
> 3.x
> ---
>
> Key: HDFS-14509
> URL: https://issues.apache.org/jira/browse/HDFS-14509
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yuxuan Wang
>Priority: Major
> Attachments: HDFS-14509-001.patch
>
>
> According to the docs, if we want to upgrade a cluster from 2.x to 3.x, we 
> need to upgrade the NN first, so there will be an intermediate state where the 
> NN is 3.x and the DN is 2.x. At that moment, if a client reads (or writes) a 
> block, it will get a block token from the NN and then deliver the token to the 
> DN, which verifies the token. But the verification in the code now is:
> {code:title=BlockTokenSecretManager.java|borderStyle=solid}
> public void checkAccess(...)
> {
> ...
> id.readFields(new DataInputStream(new 
> ByteArrayInputStream(token.getIdentifier())));
> ...
> if (!Arrays.equals(retrievePassword(id), token.getPassword())) {
>   throw new InvalidToken("Block token with " + id.toString()
>   + " doesn't have the correct token password");
> }
> }
> {code} 
> And {{retrievePassword(id)}} is:
> {code} 
> public byte[] retrievePassword(BlockTokenIdentifier identifier)
> {
> ...
> return createPassword(identifier.getBytes(), key.getKey());
> }
> {code} 
> So, if the NN's identifier adds new fields, the DN will lose those fields and 
> compute the wrong password.






[jira] [Commented] (HDFS-14655) SBN : Namenode crashes if one of The JN is down

2019-07-22 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16890513#comment-16890513
 ] 

Erik Krogen commented on HDFS-14655:


Great discussion here. [~ayushtkn], particularly good call that cancelling 
alone is not fully sufficient to fix this.

I agree that calling cancel plus limiting the size of the {{parallelExecutor}} 
seems to be a good approach. That executor is scoped to a single JN, so a limit 
will not affect other JNs if one is running slowly. Plus, the 
{{parallelExecutor}} is only used by {{getJournaledEdits}} and 
{{getEditLogManifest}} (the others use the {{singleThreadExecutor}}), so no 
other operations besides edit log tailing should be affected. It seems we'll 
need to use {{new ThreadPoolExecutor()}} directly instead of the {{Executors}} 
convenience method.

You said that many {{InterruptedException}} instances are being logged; is 
there any way we can suppress them? Where are they logged from?
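A minimal sketch of what that bounded executor might look like (thread counts 
and names are illustrative, not the actual IPCLoggerChannel code):
{code:java}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/**
 * Illustrative only: a per-JournalNode parallel executor with a hard cap on
 * threads. Unlike an unbounded cached pool, extra submissions queue up instead
 * of spawning new threads when one JN is responding slowly.
 */
class BoundedJournalExecutorSketch {
  static ThreadPoolExecutor createBoundedParallelExecutor(int maxThreads) {
    ThreadPoolExecutor executor = new ThreadPoolExecutor(
        maxThreads, maxThreads,        // fixed upper bound on threads per JN
        60L, TimeUnit.SECONDS,         // let idle threads die off
        new LinkedBlockingQueue<>());  // queue extra work rather than grow
    executor.allowCoreThreadTimeOut(true);
    return executor;
  }
}
{code}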

> SBN : Namenode crashes if one of The JN is down
> ---
>
> Key: HDFS-14655
> URL: https://issues.apache.org/jira/browse/HDFS-14655
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Harshakiran Reddy
>Priority: Major
> Attachments: HDFS-14655.poc.patch
>
>
> {noformat}
> 2019-07-04 17:35:54,064 | INFO  | Logger channel (from parallel executor) to 
> XXX/XXX | Retrying connect to server: XXX/XXX. Already tried 
> 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
> sleepTime=1000 MILLISECONDS) | Client.java:975
> 2019-07-04 17:35:54,087 | FATAL | Edit log tailer | Unknown error encountered 
> while tailing edits. Shutting down standby NN. | EditLogTailer.java:474
> java.lang.OutOfMemoryError: unable to create new native thread
>   at java.lang.Thread.start0(Native Method)
>   at java.lang.Thread.start(Thread.java:717)
>   at 
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1378)
>   at 
> com.google.common.util.concurrent.MoreExecutors$ListeningDecorator.execute(MoreExecutors.java:440)
>   at 
> com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:56)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.getJournaledEdits(IPCLoggerChannel.java:565)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.getJournaledEdits(AsyncLoggerSet.java:272)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectRpcInputStreams(QuorumJournalManager.java:533)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:508)
>   at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:275)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1681)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1714)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:307)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:410)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:360)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
>   at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:483)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423)
> 2019-07-04 17:35:54,112 | INFO  | Edit log tailer | Exiting with status 1: 
> java.lang.OutOfMemoryError: unable to create new native thread | 
> ExitUtil.java:210
> {noformat}






[jira] [Assigned] (HDFS-14655) SBN : Namenode crashes if one of The JN is down

2019-07-22 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen reassigned HDFS-14655:
--

Assignee: Ayush Saxena

> SBN : Namenode crashes if one of The JN is down
> ---
>
> Key: HDFS-14655
> URL: https://issues.apache.org/jira/browse/HDFS-14655
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Harshakiran Reddy
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14655.poc.patch
>
>
> {noformat}
> 2019-07-04 17:35:54,064 | INFO  | Logger channel (from parallel executor) to 
> XXX/XXX | Retrying connect to server: XXX/XXX. Already tried 
> 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
> sleepTime=1000 MILLISECONDS) | Client.java:975
> 2019-07-04 17:35:54,087 | FATAL | Edit log tailer | Unknown error encountered 
> while tailing edits. Shutting down standby NN. | EditLogTailer.java:474
> java.lang.OutOfMemoryError: unable to create new native thread
>   at java.lang.Thread.start0(Native Method)
>   at java.lang.Thread.start(Thread.java:717)
>   at 
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1378)
>   at 
> com.google.common.util.concurrent.MoreExecutors$ListeningDecorator.execute(MoreExecutors.java:440)
>   at 
> com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:56)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.getJournaledEdits(IPCLoggerChannel.java:565)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.getJournaledEdits(AsyncLoggerSet.java:272)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectRpcInputStreams(QuorumJournalManager.java:533)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:508)
>   at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:275)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1681)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1714)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:307)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:410)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:360)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
>   at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:483)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423)
> 2019-07-04 17:35:54,112 | INFO  | Edit log tailer | Exiting with status 1: 
> java.lang.OutOfMemoryError: unable to create new native thread | 
> ExitUtil.java:210
> {noformat}






[jira] [Comment Edited] (HDFS-14655) SBN : Namenode crashes if one of The JN is down

2019-07-22 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16890513#comment-16890513
 ] 

Erik Krogen edited comment on HDFS-14655 at 7/22/19 10:37 PM:
--

Great discussion here. [~ayushtkn], particularly good call that cancelling 
alone is not fully sufficient to fix this.

I agree that calling cancel plus limiting the size of the {{parallelExecutor}} 
seems to be a good approach. That executor is scoped to a single JN, so a limit 
will not affect other JNs if one is running slowly. Plus, the 
{{parallelExecutor}} is only used by {{getJournaledEdits}} and 
{{getEditLogManifest}} (the others use the {{singleThreadExecutor}}), so no 
other operations besides edit log tailing should be affected. It seems we'll 
need to use {{new ThreadPoolExecutor()}} directly instead of the {{Executors}} 
convenience method.

You said that many {{InterruptedException}} instances are being logged; is 
there any way we can suppress them? Where are they logged from?

edit: [~ayushtkn], I am assigning this to you for now since you seem to be 
driving the effort.


was (Author: xkrogen):
Great discussion here. [~ayushtkn], particularly good call that cancelling 
alone is not fully sufficient to fix this.

I agree that calling cancel plus limiting the size of the {{parallelExecutor}} 
seems to be a good approach. That executor is scoped to a single JN, so a limit 
will not affect other JNs if one is running slowly. Plus, the 
{{parallelExecutor}} is only used by {{getJournaledEdits}} and 
{{getEditLogManifest}} (the others use the {{singleThreadExecutor}}), so no 
other operations besides edit log tailing should be affected. It seems we'll 
need to use {{new ThreadPoolExecutor()}} directly instead of the {{Executors}} 
convenience method.

You said that many {{InterruptedException}} instances are being logged; is 
there any way we can suppress them? Where are they logged from?

> SBN : Namenode crashes if one of The JN is down
> ---
>
> Key: HDFS-14655
> URL: https://issues.apache.org/jira/browse/HDFS-14655
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Harshakiran Reddy
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14655.poc.patch
>
>
> {noformat}
> 2019-07-04 17:35:54,064 | INFO  | Logger channel (from parallel executor) to 
> XXX/XXX | Retrying connect to server: XXX/XXX. Already tried 
> 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
> sleepTime=1000 MILLISECONDS) | Client.java:975
> 2019-07-04 17:35:54,087 | FATAL | Edit log tailer | Unknown error encountered 
> while tailing edits. Shutting down standby NN. | EditLogTailer.java:474
> java.lang.OutOfMemoryError: unable to create new native thread
>   at java.lang.Thread.start0(Native Method)
>   at java.lang.Thread.start(Thread.java:717)
>   at 
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1378)
>   at 
> com.google.common.util.concurrent.MoreExecutors$ListeningDecorator.execute(MoreExecutors.java:440)
>   at 
> com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:56)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.getJournaledEdits(IPCLoggerChannel.java:565)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.getJournaledEdits(AsyncLoggerSet.java:272)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectRpcInputStreams(QuorumJournalManager.java:533)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:508)
>   at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:275)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1681)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1714)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:307)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:410)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:360)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:17

[jira] [Commented] (HDFS-14509) DN throws InvalidToken due to inequality of password when upgrade NN 2.x to 3.x

2019-07-23 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16891108#comment-16891108
 ] 

Erik Krogen commented on HDFS-14509:


Great, I will mark them all as blockers for a 2.10 release. Thanks for 
confirming [~John Smith].

> DN throws InvalidToken due to inequality of password when upgrade NN 2.x to 
> 3.x
> ---
>
> Key: HDFS-14509
> URL: https://issues.apache.org/jira/browse/HDFS-14509
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yuxuan Wang
>Priority: Major
> Attachments: HDFS-14509-001.patch
>
>
> According to the docs, if we want to upgrade a cluster from 2.x to 3.x, we 
> need to upgrade the NN first, so there will be an intermediate state where the 
> NN is 3.x and the DN is 2.x. At that moment, if a client reads (or writes) a 
> block, it will get a block token from the NN and then deliver the token to the 
> DN, which verifies the token. But the verification in the code now is:
> {code:title=BlockTokenSecretManager.java|borderStyle=solid}
> public void checkAccess(...)
> {
> ...
> id.readFields(new DataInputStream(new 
> ByteArrayInputStream(token.getIdentifier())));
> ...
> if (!Arrays.equals(retrievePassword(id), token.getPassword())) {
>   throw new InvalidToken("Block token with " + id.toString()
>   + " doesn't have the correct token password");
> }
> }
> {code} 
> And {{retrievePassword(id)}} is:
> {code} 
> public byte[] retrievePassword(BlockTokenIdentifier identifier)
> {
> ...
> return createPassword(identifier.getBytes(), key.getKey());
> }
> {code} 
> So, if the NN's identifier adds new fields, the DN will lose those fields and 
> compute the wrong password.






[jira] [Updated] (HDFS-14551) NN throws NPE if downgrade it during rolling upgrade from 3.x to 2.x

2019-07-23 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14551:
---
Target Version/s: 2.10.0
Priority: Blocker  (was: Major)

> NN throws NPE if downgrade it during rolling upgrade from 3.x to 2.x
> 
>
> Key: HDFS-14551
> URL: https://issues.apache.org/jira/browse/HDFS-14551
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yuxuan Wang
>Priority: Blocker
>
> We can downgrade the NN during a rolling upgrade (after running "hdfs dfsadmin 
> -rollingUpgrade prepare") with HDFS-8432 in place. But with HDFS-14172, if the 
> image has any unrecognized section, it will throw an IOException at
> {code:title=FSImageFormatProtobuf.java|borderStyle=solid}
> private void loadInternal(..) {
> ..
> String n = s.getName();
> SectionName sectionName = SectionName.fromString(n);
> if (sectionName == null) {
>   throw new IOException("Unrecognized section " + n);
> }
> ..
> }
> {code}
> and will throw an NPE on Hadoop 2.x at
> {code:title=FSImageFormatProtobuf.java|borderStyle=solid}
> private void loadInternal(..) {
> ..
> String n = s.getName();
> SectionName sectionName = SectionName.fromString(n);
> switch (sectionName)
> ..
> }
> {code}
> When we downgrade the NN from 3.x to 2.x, the NN may load an image saved by 
> the 3.x NN. The lack of {{SectionName.ERASURE_CODING}} can then break the 
> 2.x NN.
> We should just skip the unrecognized section instead of throwing an exception.
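> A minimal sketch of the proposed behavior (illustrative only, not the actual 
> FSImageFormatProtobuf patch): log and skip sections whose names the running 
> version does not recognize, instead of failing the whole image load.
> {code:java}
> import java.util.List;
> 
> /** Illustrative only: tolerate unknown fsimage section names during load. */
> class SkipUnknownSectionsSketch {
>   /** Stand-in for FSImageFormatProtobuf.SectionName, with only a few values. */
>   enum SectionName {
>     NS_INFO, INODE, SNAPSHOT;
> 
>     static SectionName fromString(String n) {
>       for (SectionName s : values()) {
>         if (s.name().equals(n)) {
>           return s;
>         }
>       }
>       return null; // unknown section, e.g. ERASURE_CODING written by a 3.x NN
>     }
>   }
> 
>   void loadSections(List<String> sectionNames) {
>     for (String n : sectionNames) {
>       SectionName sectionName = SectionName.fromString(n);
>       if (sectionName == null) {
>         // Proposed behavior: warn and skip instead of throwing, so a 2.x NN
>         // can still load an image containing newer (e.g. erasure coding)
>         // sections.
>         System.err.println("Skipping unrecognized section " + n);
>         continue;
>       }
>       // ... otherwise dispatch on sectionName and load the section as before ...
>     }
>   }
> }
> {code}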






[jira] [Updated] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x

2019-07-23 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-13596:
---
Target Version/s: 2.10.0

> NN restart fails after RollingUpgrade from 2.x to 3.x
> -
>
> Key: HDFS-13596
> URL: https://issues.apache.org/jira/browse/HDFS-13596
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Fei Hui
>Priority: Blocker
> Attachments: HDFS-13596.001.patch, HDFS-13596.002.patch, 
> HDFS-13596.003.patch, HDFS-13596.004.patch, HDFS-13596.005.patch, 
> HDFS-13596.006.patch, HDFS-13596.007.patch, HDFS-13596.008.patch
>
>
> After a rolling upgrade of the NN from 2.x to 3.x, if the NN is restarted, it 
> fails while replaying edit logs.
>  * After NN is started with rollingUpgrade, the layoutVersion written to 
> editLogs (before finalizing the upgrade) is the pre-upgrade layout version 
> (so as to support downgrade).
>  * When writing transactions to log, NN writes as per the current layout 
> version. In 3.x, erasureCoding bits are added to the editLog transactions.
>  * So any edit log written after the upgrade and before finalizing the 
> upgrade will have the old layout version but the new format of transactions.
>  * When NN is restarted and the edit logs are replayed, the NN reads the old 
> layout version from the editLog file. When parsing the transactions, it 
> assumes that the transactions are also from the previous layout and hence 
> skips parsing the erasureCoding bits.
>  * This cascades into reading the wrong set of bits for other fields and 
> leads to NN shutting down.
> Sample error output:
> {code:java}
> java.lang.IllegalArgumentException: Invalid clientId - length is 0 expected 
> length 16
>  at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:74)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:86)
>  at 
> org.apache.hadoop.ipc.RetryCache$CacheEntryWithPayload.<init>(RetryCache.java:163)
>  at 
> org.apache.hadoop.ipc.RetryCache.addCacheEntryWithPayload(RetryCache.java:322)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheEntryWithPayload(FSNamesystem.java:960)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:397)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
> 2018-05-17 19:10:06,522 WARN 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception 
> loading fsimage
> java.io.IOException: java.lang.IllegalStateException: Cannot skip to less 
> than the current value (=16389), where newValue=16388
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.resetLastInodeId(FSDirectory.java:1945)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:298)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:937)
>  a

[jira] [Updated] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x

2019-07-23 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-13596:
---
Priority: Blocker  (was: Critical)

> NN restart fails after RollingUpgrade from 2.x to 3.x
> -
>
> Key: HDFS-13596
> URL: https://issues.apache.org/jira/browse/HDFS-13596
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Fei Hui
>Priority: Blocker
> Attachments: HDFS-13596.001.patch, HDFS-13596.002.patch, 
> HDFS-13596.003.patch, HDFS-13596.004.patch, HDFS-13596.005.patch, 
> HDFS-13596.006.patch, HDFS-13596.007.patch, HDFS-13596.008.patch
>
>
> After a rolling upgrade of the NN from 2.x to 3.x, if the NN is restarted, it 
> fails while replaying edit logs.
>  * After NN is started with rollingUpgrade, the layoutVersion written to 
> editLogs (before finalizing the upgrade) is the pre-upgrade layout version 
> (so as to support downgrade).
>  * When writing transactions to log, NN writes as per the current layout 
> version. In 3.x, erasureCoding bits are added to the editLog transactions.
>  * So any edit log written after the upgrade and before finalizing the 
> upgrade will have the old layout version but the new format of transactions.
>  * When NN is restarted and the edit logs are replayed, the NN reads the old 
> layout version from the editLog file. When parsing the transactions, it 
> assumes that the transactions are also from the previous layout and hence 
> skips parsing the erasureCoding bits.
>  * This cascades into reading the wrong set of bits for other fields and 
> leads to NN shutting down.
> Sample error output:
> {code:java}
> java.lang.IllegalArgumentException: Invalid clientId - length is 0 expected 
> length 16
>  at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:74)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:86)
>  at 
> org.apache.hadoop.ipc.RetryCache$CacheEntryWithPayload.<init>(RetryCache.java:163)
>  at 
> org.apache.hadoop.ipc.RetryCache.addCacheEntryWithPayload(RetryCache.java:322)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheEntryWithPayload(FSNamesystem.java:960)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:397)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
> 2018-05-17 19:10:06,522 WARN 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception 
> loading fsimage
> java.io.IOException: java.lang.IllegalStateException: Cannot skip to less 
> than the current value (=16389), where newValue=16388
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.resetLastInodeId(FSDirectory.java:1945)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:298)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java

[jira] [Updated] (HDFS-14509) DN throws InvalidToken due to inequality of password when upgrade NN 2.x to 3.x

2019-07-23 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14509:
---
Target Version/s: 2.10.0
Priority: Blocker  (was: Major)

> DN throws InvalidToken due to inequality of password when upgrade NN 2.x to 
> 3.x
> ---
>
> Key: HDFS-14509
> URL: https://issues.apache.org/jira/browse/HDFS-14509
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yuxuan Wang
>Priority: Blocker
> Attachments: HDFS-14509-001.patch
>
>
> According to the docs, if we want to upgrade a cluster from 2.x to 3.x, we 
> need to upgrade the NN first, so there will be an intermediate state where the 
> NN is 3.x and the DN is 2.x. At that moment, if a client reads (or writes) a 
> block, it will get a block token from the NN and then deliver the token to the 
> DN, which verifies the token. But the verification in the code now is:
> {code:title=BlockTokenSecretManager.java|borderStyle=solid}
> public void checkAccess(...)
> {
> ...
> id.readFields(new DataInputStream(new 
> ByteArrayInputStream(token.getIdentifier())));
> ...
> if (!Arrays.equals(retrievePassword(id), token.getPassword())) {
>   throw new InvalidToken("Block token with " + id.toString()
>   + " doesn't have the correct token password");
> }
> }
> {code} 
> And {{retrievePassword(id)}} is:
> {code} 
> public byte[] retrievePassword(BlockTokenIdentifier identifier)
> {
> ...
> return createPassword(identifier.getBytes(), key.getKey());
> }
> {code} 
> So, if the NN's identifier adds new fields, the DN will lose those fields and 
> compute the wrong password.






[jira] [Updated] (HDFS-14462) WebHDFS throws "Error writing request body to server" instead of DSQuotaExceededException

2019-07-23 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14462:
---
Summary: WebHDFS throws "Error writing request body to server" instead of 
DSQuotaExceededException  (was: WebHDFS throws "Error writing request body to 
server" instead of NSQuotaExceededException)

> WebHDFS throws "Error writing request body to server" instead of 
> DSQuotaExceededException
> -
>
> Key: HDFS-14462
> URL: https://issues.apache.org/jira/browse/HDFS-14462
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.2.0, 2.9.2, 3.0.3, 2.8.5, 2.7.7, 3.1.2
>Reporter: Erik Krogen
>Priority: Major
>
> We noticed recently in our environment that, when writing data to HDFS via 
> WebHDFS, a quota exception is returned to the client as:
> {code}
> java.io.IOException: Error writing request body to server
> at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3536)
>  ~[?:1.8.0_172]
> at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3519)
>  ~[?:1.8.0_172]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[?:1.8.0_172]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[?:1.8.0_172]
> at java.io.FilterOutputStream.flush(FilterOutputStream.java:140) 
> ~[?:1.8.0_172]
> at java.io.DataOutputStream.flush(DataOutputStream.java:123) 
> ~[?:1.8.0_172]
> {code}
> It is entirely opaque to the user that this exception was caused by exceeding 
> their quota. Yet in the DataNode logs:
> {code}
> 2019-04-24 02:13:09,639 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
> Exception
> org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota 
> of /foo/path/here is exceeded: quota =  B = X TB but diskspace 
> consumed =  B = X TB
> at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyStoragespaceQuota(DirectoryWithQuotaFeature.java:211)
> at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyQuota(DirectoryWithQuotaFeature.java:239)
> {code}
> This was on a 2.7.x cluster, but I verified that the same logic exists on 
> trunk. I believe we need to fix some of the logic within the 
> {{ExceptionHandler}} to add special handling for the quota exception.
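> As a rough sketch of the kind of special handling meant here (illustrative 
> only; the handler wiring and the choice of status code are assumptions on my 
> part, not the actual WebHDFS ExceptionHandler code):
> {code:java}
> import java.net.HttpURLConnection;
> import org.apache.hadoop.hdfs.protocol.QuotaExceededException;
> 
> /**
>  * Illustrative only: recognize the quota exception and report it to the
>  * WebHDFS client instead of the generic "Error writing request body to
>  * server" message.
>  */
> class QuotaAwareExceptionMappingSketch {
>   /** Hypothetical helper: choose the HTTP status line for a caught exception. */
>   static String describe(Throwable t) {
>     // DSQuotaExceededException (seen in the DataNode log above) is a subclass
>     // of QuotaExceededException, so this also covers namespace quota errors.
>     if (t instanceof QuotaExceededException) {
>       return HttpURLConnection.HTTP_FORBIDDEN + " " + t.getMessage();
>     }
>     return HttpURLConnection.HTTP_INTERNAL_ERROR + " " + t.getMessage();
>   }
> }
> {code}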






[jira] [Assigned] (HDFS-14462) WebHDFS throws "Error writing request body to server" instead of DSQuotaExceededException

2019-07-23 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen reassigned HDFS-14462:
--

Assignee: Simbarashe Dzinamarira

> WebHDFS throws "Error writing request body to server" instead of 
> DSQuotaExceededException
> -
>
> Key: HDFS-14462
> URL: https://issues.apache.org/jira/browse/HDFS-14462
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.2.0, 2.9.2, 3.0.3, 2.8.5, 2.7.7, 3.1.2
>Reporter: Erik Krogen
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>
> We noticed recently in our environment that, when writing data to HDFS via 
> WebHDFS, a quota exception is returned to the client as:
> {code}
> java.io.IOException: Error writing request body to server
> at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3536)
>  ~[?:1.8.0_172]
> at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3519)
>  ~[?:1.8.0_172]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[?:1.8.0_172]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[?:1.8.0_172]
> at java.io.FilterOutputStream.flush(FilterOutputStream.java:140) 
> ~[?:1.8.0_172]
> at java.io.DataOutputStream.flush(DataOutputStream.java:123) 
> ~[?:1.8.0_172]
> {code}
> It is entirely opaque to the user that this exception was caused by exceeding 
> their quota. Yet in the DataNode logs:
> {code}
> 2019-04-24 02:13:09,639 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
> Exception
> org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota 
> of /foo/path/here is exceeded: quota =  B = X TB but diskspace 
> consumed =  B = X TB
> at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyStoragespaceQuota(DirectoryWithQuotaFeature.java:211)
> at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyQuota(DirectoryWithQuotaFeature.java:239)
> {code}
> This was on a 2.7.x cluster, but I verified that the same logic exists on 
> trunk. I believe we need to fix some of the logic within the 
> {{ExceptionHandler}} to add special handling for the quota exception.






[jira] [Commented] (HDFS-14034) Support getQuotaUsage API in WebHDFS

2019-07-24 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16891945#comment-16891945
 ] 

Erik Krogen commented on HDFS-14034:


I'm not familiar with HttpFS, so I would prefer that [~jojochuang] review that 
portion; if he doesn't have time, I can dive more into that section. For now 
this review applies only to the WebHDFS portion of the patch:
* In {{JsonUtil#toJsonString(ContentSummary)}}, instead of individually adding 
each element of {{quotaMap}} to {{m}}, would it be cleaner to have a flag 
{{includeFileAndDirectoryCount}} in the {{toJsonMap(QuotaUsage)}} method? Then 
{{toJsonString}} can just use {{m.addAll(toJsonMap(contentsummary, false))}}
* {code}
final Map m = toJsonMap(quotaUsage);
return toJsonString(QuotaUsage.class, m);
{code}
The local variable {{m}} seems a little redundant here, should we just do:
{code}return toJsonString(QuotaUsage.class, toJsonMap(quotaUsage));{code}
* In {{TestWebHDFS#testQuotaUsage}}, rather than using {{writeUTF()}} and then 
having some arbitrary number of bytes result (39), can we do something like 
{{write(byte[])}} with a fixed size to make it more clear where this number 
comes from? It would also be helpful to use local variables to store numbers 
like {{600 * 1024 * 1024}} and {{10}} to make the relationship clear.
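A small sketch of what that test setup could look like with named constants 
(illustrative; the actual TestWebHDFS change may differ):
{code:java}
import java.io.OutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Illustrative only: write a fixed, named number of bytes so the expected
 *  space consumption in the quota assertions is obvious. */
class QuotaUsageTestSketch {
  private static final int FILE_SIZE = 64 * 1024;          // bytes per file
  private static final long SPACE_QUOTA = 600L * 1024 * 1024;
  private static final long NAMESPACE_QUOTA = 10;

  static void writeFixedSizeFile(FileSystem fs, Path file) throws Exception {
    byte[] data = new byte[FILE_SIZE];                      // deterministic length
    try (OutputStream out = fs.create(file)) {
      out.write(data);                                      // exactly FILE_SIZE bytes
    }
    // Assertions can then reference FILE_SIZE, SPACE_QUOTA, and NAMESPACE_QUOTA
    // directly instead of magic numbers.
  }
}
{code}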

Otherwise the WebHDFS portions look great. Thanks a lot for handling this, 
[~csun].

> Support getQuotaUsage API in WebHDFS
> 
>
> Key: HDFS-14034
> URL: https://issues.apache.org/jira/browse/HDFS-14034
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, webhdfs
>Reporter: Erik Krogen
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-14034.000.patch, HDFS-14034.001.patch, 
> HDFS-14034.002.patch
>
>
> HDFS-8898 added support for a new API, {{getQuotaUsage}} which can fetch 
> quota usage on a directory with significantly lower impact than the similar 
> {{getContentSummary}}. This JIRA is to track adding support for this API to 
> WebHDFS. 






[jira] [Commented] (HDFS-13783) Balancer: make balancer to be a long service process for easy to monitor it.

2019-07-24 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16891964#comment-16891964
 ] 

Erik Krogen commented on HDFS-13783:


Hi [~zhangchen], things are looking good!
{quote}The failed test is not related with this patch, all these tests can pass 
in local
{quote}
I don't believe you :) Take a look at the output for {{TestHdfsConfigFields}}:
{code:java}
missing in hdfs-default.xml Entries:   dfs.balancer.service.interval.seconds  
dfs.balancer.service.retry.on.exception expected:<0> but was:<2>
{code}
Please also tackle the outstanding checkstyle issues (besides line length 
warnings in {{DFSConfigKeys}}).

Moving on to my review:
 * The variable {{running}} seems to be intended to be accessed from a 
multi-threaded context; can you please make it {{volatile}}?
 * In the new {{run()}} method, I think we could make the flow of control more 
clear by avoiding all of the other logic when {{runAsService}} is false:
{code:java}
if (!p.getRunAsService()) {
  return doBalance(namenodes, p, conf);
}
{code}
Then, the rest of the method can avoid logic handling the non-service scenario. 
The {{Cli#run()}} method will handle exception printing and such.

 * You have a typo: {{scheduleInteral}} -> {{scheduleInterval}}
 * I think it's misleading that {{tried = 1}} before any attempts have been 
made. Can we make it so that {{tried = 0}} at the beginning? We can change the 
check in the catch block to be {{tried >= retryOnException}} or {{++tried > 
retryOnException}}
 * Please use the slf4j style loggers, for example, instead of:
{code:java}
LOG.warn("Encounter exception while do balance work: " + e
+ " already tried " + tried + " times");
{code}
use:
{code:java}
LOG.warn("Encounter exception while do balance work. Already tried {} 
times", tried, e);
{code}

 * For {{DFS_BALANCER_SERVICE_INTERVAL_SECONDS}}, now that we use 
{{Configuration#getTimeDuration()}}, we don't need to specify units as the 
user/administrator can do so themselves. I think we can rename it to 
{{DFS_BALANCER_SERVICE_INTERVAL}} / {{dfs.balancer.service.interval}}
 * Can we rename {{DFS_BALANCER_SERVICE_RETRY_ON_EXCEPTION}} to 
{{DFS_BALANCER_RETRIES_ON_EXCEPTION}} (and the same change in the config key)? 
"retry on exception", to me, sounds like it would be a boolean flag of whether 
to retry at all. Retries makes it sound more like a number.
 * When you load {{scheduleInterval}}, you get it in seconds, but when you use 
it (for {{Thread#sleep()}}), it needs to be milliseconds. Why not just load it 
as milliseconds from the beginning:
{code:java}
  scheduleInterval = conf.getTimeDuration(
  DFSConfigKeys.DFS_BALANCER_SERVICE_INTERVAL_SECONDS_KEY,
  DFSConfigKeys.DFS_BALANCER_SERVICE_INTERVAL_SECONDS_DEFAULT,
  TimeUnit.SECONDS, TimeUnit.MILLISECONDS);
{code}

 * {{Balancer}} has a convenient {{time2Str(long)}} method, can we use that for 
the "Finished one round, will wait for ..." log statement?
 * It seems that {{tried}} is never reset. I think we should reset it back to 0 
whenever we have a successful balance; it doesn't seem right for the service to 
have a limited number of retries over its entire lifetime.
 * It looks like {{TestBalancerService#removeExtraBlocks()}} is not used 
anywhere, can we remove it?
 * I think the {{setupCluster}} / {{addOneDataNode}} / {{newBalancerService}} 
methods should all be private?
 * Why do we have {{Thread.sleep(500)}} on {{TestBalancerService}} L71? If 
you're waiting for the NN to become active, we should use 
{{cluster.waitActive(0)}} to avoid test flakiness. I'm also worried about the 
{{Thread.sleep(5000)}} call on L139. Is there anything we can replace this 
with that will be less flaky?
 * I think we should be using {{finally}} blocks to close the {{cluster}} you 
create in the tests.
 * {{args}} created on {{TestBalancerService}} L159 is unused.
 * For {{Thread.sleep(10)}} on {{TestBalancerService}} L170, the comment says 
10 seconds, but the code says 10 milliseconds.
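Putting a few of these points together, here is a rough sketch of the service 
loop shape being suggested (the names follow the discussion above and are 
assumptions on my part, not the actual patch):
{code:java}
import java.util.concurrent.TimeUnit;

/**
 * Illustrative only: a long-lived balancer service loop that resets the retry
 * counter after every successful round and gives up only after a bounded
 * number of consecutive failures.
 */
class BalancerServiceLoopSketch {
  private volatile boolean running = true;     // flipped by shutdown handling

  void serviceLoop(long scheduleIntervalMs, int retriesOnException) {
    int tried = 0;                              // consecutive failures so far
    while (running) {
      try {
        doBalanceRound();                       // stand-in for doBalance(namenodes, p, conf)
        tried = 0;                              // success: reset the failure count
      } catch (Exception e) {
        if (++tried > retriesOnException) {
          running = false;                      // give up after N consecutive failures
          break;
        }
        // otherwise log and wait before the next attempt
      }
      try {
        TimeUnit.MILLISECONDS.sleep(scheduleIntervalMs);
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
        running = false;
      }
    }
  }

  private void doBalanceRound() throws Exception { /* one balancing iteration */ }
}
{code}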

> Balancer: make balancer to be a long service process for easy to monitor it.
> 
>
> Key: HDFS-13783
> URL: https://issues.apache.org/jira/browse/HDFS-13783
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer & mover
>Reporter: maobaolong
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-13783-001.patch, HDFS-13783-002.patch, 
> HDFS-13783.003.patch
>
>
> If we have a long service process of balancer, like namenode, datanode, we 
> can get metrics of balancer, the metrics can tell us the status of balancer, 
> the amount of block it has moved, 
> We can get or set the balance plan by the balancer webUI. So many things we 
> can do if we have a 

[jira] [Comment Edited] (HDFS-13783) Balancer: make balancer to be a long service process for easy to monitor it.

2019-07-24 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16891964#comment-16891964
 ] 

Erik Krogen edited comment on HDFS-13783 at 7/24/19 4:06 PM:
-

Hi [~zhangchen], things are looking good!
{quote}The failed test is not related with this patch, all these tests can pass 
in local
{quote}
I don't believe you :) Take a look at the output for {{TestHdfsConfigFields}}:
{code:java}
missing in hdfs-default.xml Entries:   dfs.balancer.service.interval.seconds  
dfs.balancer.service.retry.on.exception expected:<0> but was:<2>
{code}
Please also tackle the outstanding checkstyle issues (besides line length 
warnings in {{DFSConfigKeys}}).

Moving on to my review:
# The variable {{running}} seems to be intended to be accessed from a 
multi-threaded context; can you please make it {{volatile}}?
# In the new {{run()}} method, I think we could make the flow of control more 
clear by avoiding all of the other logic when {{runAsService}} is false:
{code:java}
if (!p.getRunAsService()) {
  return doBalance(namenodes, p, conf);
}
{code}
Then, the rest of the method can avoid logic handling the non-service scenario. 
The {{Cli#run()}} method will handle exception printing and such.
# You have a typo: {{scheduleInteral}} -> {{scheduleInterval}}
# I think it's misleading that {{tried = 1}} before any attempts have been 
made. Can we make it so that {{tried = 0}} at the beginning? We can change the 
check in the catch block to be {{tried >= retryOnException}} or {{++tried > 
retryOnException}}
# Please use the slf4j style loggers, for example, instead of:
{code:java}
LOG.warn("Encounter exception while do balance work: " + e
+ " already tried " + tried + " times");
{code}
use:
{code:java}
LOG.warn("Encounter exception while do balance work. Already tried {} 
times", tried, e);
{code}
# For {{DFS_BALANCER_SERVICE_INTERVAL_SECONDS}}, now that we use 
{{Configuration#getTimeDuration()}}, we don't need to specify units as the 
user/administrator can do so themselves. I think we can rename it to 
{{DFS_BALANCER_SERVICE_INTERVAL}} / {{dfs.balancer.service.interval}}
# Can we rename {{DFS_BALANCER_SERVICE_RETRY_ON_EXCEPTION}} to 
{{DFS_BALANCER_RETRIES_ON_EXCEPTION}} (and the same change in the config key)? 
"retry on exception", to me, sounds like it would be a boolean flag of whether 
to retry at all. Retries makes it sound more like a number.
# When you load {{scheduleInterval}}, you get it in seconds, but when you use 
it (for {{Thread#sleep()}}), it needs to be milliseconds. Why not just load it 
as milliseconds from the beginning:
{code:java}
  scheduleInterval = conf.getTimeDuration(
  DFSConfigKeys.DFS_BALANCER_SERVICE_INTERVAL_SECONDS_KEY,
  DFSConfigKeys.DFS_BALANCER_SERVICE_INTERVAL_SECONDS_DEFAULT,
  TimeUnit.SECONDS, TimeUnit.MILLISECONDS);
{code}
# {{Balancer}} has a convenient {{time2Str(long)}} method, can we use that for 
the "Finished one round, will wait for ..." log statement?
# It seems that {{tried}} is never reset. I think we should reset it back to 0 
whenever we have a successful balance; it doesn't seem right for the service to 
have a limited number of retries over its entire lifetime.
# It looks like {{TestBalancerService#removeExtraBlocks()}} is not used 
anywhere, can we remove it?
# I think the {{setupCluster}} / {{addOneDataNode}} / {{newBalancerService}} 
methods should all be private?
# Why do we have {{Thread.sleep(500)}} on {{TestBalancerService}} L71? If 
you're waiting for the NN to become active, we should use 
{{cluster.waitActive(0)}} to avoid test flakiness. I'm also worried about the 
{{Thread.sleep(5000)}} call on L139. Is there anything we can replace this 
with that will be less flaky?
# I think we should be using {{finally}} blocks to close the {{cluster}} you 
create in the tests.
# {{args}} created on {{TestBalancerService}} L159 is unused.
# For {{Thread.sleep(10)}} on {{TestBalancerService}} L170, the comment says 10 
seconds, but the code says 10 milliseconds.


was (Author: xkrogen):
Hi [~zhangchen], things are looking good!
{quote}The failed test is not related with this patch, all these tests can pass 
in local
{quote}
I don't believe you :) Take a look at the output for {{TestHdfsConfigFields}}:
{code:java}
missing in hdfs-default.xml Entries:   dfs.balancer.service.interval.seconds  
dfs.balancer.service.retry.on.exception expected:<0> but was:<2>
{code}
Please also tackle the outstanding checkstyle issues (besides line length 
warnings in {{DFSConfigKeys}}).

Moving on to my review:
 * The variable {{running}} seems to be intended to be accessed from a 
multi-threaded context; can you please make it {{volatile}}?
 * In the new {{run()}} method, I think we could make the flow of control more 
clear by avoiding all of the other logic when {{runAsService}} is false:
{code:java}

[jira] [Commented] (HDFS-14657) Refine NameSystem lock usage during processing FBR

2019-07-24 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16891969#comment-16891969
 ] 

Erik Krogen commented on HDFS-14657:


Hey [~zhangchen], your idea expressed in the v2 patch seems sound to me. But I 
must admit to not be deeply familiar with the block report process or what 
invariants must be upheld by the locking scheme. Hopefully folks like [~shv], 
[~jojochuang], [~kihwal] or [~daryn] can take a look as well.

> Refine NameSystem lock usage during processing FBR
> --
>
> Key: HDFS-14657
> URL: https://issues.apache.org/jira/browse/HDFS-14657
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Chen Zhang
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14657-001.patch, HDFS-14657.002.patch
>
>
> Disks with 12 TB capacity are very common today, which means the FBR size is 
> much larger than before. The NameNode holds the NameSystem lock while 
> processing the block report for each storage, which can take quite a long 
> time.
> In our production environment, processing a large FBR usually causes longer 
> RPC queue times, which impacts client latency, so we did some simple work on 
> refining the lock usage, which improved the p99 latency significantly.
> In our solution, the BlockManager releases the NameSystem write lock and 
> requests it again every 5000 blocks (by default) while processing an FBR; 
> with the fair lock, all pending RPC requests can be processed before the 
> BlockManager re-acquires the write lock.






[jira] [Assigned] (HDFS-12342) Differentiate webhdfs vs. swebhdfs calls in audit log

2019-07-24 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen reassigned HDFS-12342:
--

Assignee: Chen Liang  (was: Erik Krogen)

> Differentiate webhdfs vs. swebhdfs calls in audit log
> -
>
> Key: HDFS-12342
> URL: https://issues.apache.org/jira/browse/HDFS-12342
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: logging
>Reporter: Erik Krogen
>Assignee: Chen Liang
>Priority: Major
>  Labels: Incompatible
>
> Currently the audit log only logs {{webhdfs}} vs {{rpc}} as the {{proto}}. It 
> is useful to be able to audit whether certain commands were carried out via 
> webhdfs or swebhdfs as this has different security and potentially 
> performance implications. We have been running this internally for a while 
> and have found it useful for looking at usage patterns.
> The proposal is just to continue logging {{webhdfs}} as the proto for {{http}} 
> WebHDFS commands, but to log {{swebhdfs}} for SWebHDFS (over {{https}}). This 
> will be incompatible.






[jira] [Created] (HDFS-14667) Backport [HDFS-14403] "Cost-based FairCallQueue" to branch-2

2019-07-24 Thread Erik Krogen (JIRA)
Erik Krogen created HDFS-14667:
--

 Summary: Backport [HDFS-14403] "Cost-based FairCallQueue" to 
branch-2
 Key: HDFS-14667
 URL: https://issues.apache.org/jira/browse/HDFS-14667
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Erik Krogen
Assignee: Erik Krogen


We would like to target pulling HDFS-14403, an important operability 
enhancement, into branch-2.








[jira] [Updated] (HDFS-14667) Backport [HDFS-14403] "Cost-based FairCallQueue" to branch-2

2019-07-24 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14667:
---
Description: 
We would like to target pulling HDFS-14403, an important operability 
enhancement, into branch-2.


It's only present in trunk now so we also need to backport through the 3.x 
lines.


  was:
We would like to target pulling HDFS-14403, an important operability 
enhancement, into branch-2.




> Backport [HDFS-14403] "Cost-based FairCallQueue" to branch-2
> 
>
> Key: HDFS-14667
> URL: https://issues.apache.org/jira/browse/HDFS-14667
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
>
> We would like to target pulling HDFS-14403, an important operability 
> enhancement, into branch-2.
> It's only present in trunk now so we also need to backport through the 3.x 
> lines.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14276) [SBN read] Reduce tailing overhead

2019-07-24 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892239#comment-16892239
 ] 

Erik Krogen commented on HDFS-14276:


[~ayushtkn] or [~jojochuang] are either of you interested in helping to finish 
this off by fixing the failing test? I am happy to help review.

> [SBN read] Reduce tailing overhead
> --
>
> Key: HDFS-14276
> URL: https://issues.apache.org/jira/browse/HDFS-14276
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha, namenode
>Affects Versions: 3.3.0
> Environment: Hardware: 4-node cluster, each node has 4 core, Xeon 
> 2.5Ghz, 25GB memory.
> Software: CentOS 7.4, CDH 6.0 + Consistent Reads from Standby, Kerberos, SSL, 
> RPC encryption + Data Transfer Encryption.
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HDFS-14276.000.patch, Screen Shot 2019-02-12 at 10.51.41 
> PM.png, Screen Shot 2019-02-14 at 11.50.37 AM.png
>
>
> When the Observer sets {{dfs.ha.tail-edits.period}} = {{0ms}}, it tails the edit 
> log continuously in order to fetch the latest edits, but there is a lot of 
> overhead in doing so.
> Critically, the edit log tailer should _not_ update the NameDirSize metric every 
> time. That metric has nothing to do with fetching edits, and computing it 
> involves a lot of directory space calculation.
> A profiler shows that a non-trivial chunk of time is spent for nothing.
> Other than this, the biggest overhead is in the communication to 
> serialize/deserialize messages to/from the JNs. I am looking for ways to reduce 
> the cost because it is burning 30% of my CPU time even when the cluster is 
> idle.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14276) [SBN read] Reduce tailing overhead

2019-07-24 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892239#comment-16892239
 ] 

Erik Krogen edited comment on HDFS-14276 at 7/24/19 10:24 PM:
--

[~ayushtkn] or [~jojochuang] are either of you interested in helping to finish 
this off by fixing the failing test? I am happy to help review but am going to 
focus my efforts on HDFS-14370 for now.


was (Author: xkrogen):
[~ayushtkn] or [~jojochuang] are either of you interested in helping to finish 
this off by fixing the failing test? I am happy to help review.

> [SBN read] Reduce tailing overhead
> --
>
> Key: HDFS-14276
> URL: https://issues.apache.org/jira/browse/HDFS-14276
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha, namenode
>Affects Versions: 3.3.0
> Environment: Hardware: 4-node cluster, each node has 4 core, Xeon 
> 2.5Ghz, 25GB memory.
> Software: CentOS 7.4, CDH 6.0 + Consistent Reads from Standby, Kerberos, SSL, 
> RPC encryption + Data Transfer Encryption.
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HDFS-14276.000.patch, Screen Shot 2019-02-12 at 10.51.41 
> PM.png, Screen Shot 2019-02-14 at 11.50.37 AM.png
>
>
> When the Observer sets {{dfs.ha.tail-edits.period}} = {{0ms}}, it tails the edit 
> log continuously in order to fetch the latest edits, but there is a lot of 
> overhead in doing so.
> Critically, the edit log tailer should _not_ update the NameDirSize metric every 
> time. That metric has nothing to do with fetching edits, and computing it 
> involves a lot of directory space calculation.
> A profiler shows that a non-trivial chunk of time is spent for nothing.
> Other than this, the biggest overhead is in the communication to 
> serialize/deserialize messages to/from the JNs. I am looking for ways to reduce 
> the cost because it is burning 30% of my CPU time even when the cluster is 
> idle.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-07-24 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14370:
---
Attachment: HDFS-14370.000.patch

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14370.000.patch
>
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-07-24 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14370:
---
Attachment: (was: HDFS-14370.000.patch)

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-07-24 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14370:
---
Status: Patch Available  (was: Open)

I just attached a v000 patch for this. It doesn't have a test yet, but it 
demonstrates the idea:
* Add a new config for the _maximum_ sleep time allowable by the edit log 
tailer. Keep the existing config as the minimum / default sleep time.
* When 0 edits are returned, double the current sleep time each time, creating 
exponential backoff, and cap the current sleep time at the maximum defined above 
(see the sketch after this list).
* By default, set both of the properties to be equal, thus keeping the same 
behavior as today.
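
A minimal sketch of that backoff loop, assuming a {{doTailEdits()}} call that 
returns the number of edits loaded and a {{shouldRun}} flag (names are 
illustrative, not the actual patch):
{code}
// Sketch only: exponential backoff between tailing RPCs when no edits are returned.
private void tailWithBackoff(long minSleepMs, long maxSleepMs) throws InterruptedException {
  long sleepMs = minSleepMs;
  while (shouldRun) {
    long editsLoaded = doTailEdits();
    if (editsLoaded > 0) {
      sleepMs = minSleepMs;                     // made progress: reset to the minimum
    } else {
      // empty response: double the sleep time, capped at the configured maximum
      sleepMs = Math.min(Math.max(sleepMs, 1) * 2, maxSleepMs);
    }
    Thread.sleep(sleepMs);
  }
}
{code}
With both properties equal (the default), the sleep time never grows, so the 
behavior is unchanged from today.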

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-07-24 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14370:
---
Attachment: HDFS-14370.000.patch

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14370.000.patch
>
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14462) WebHDFS throws "Error writing request body to server" instead of DSQuotaExceededException

2019-07-25 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893021#comment-16893021
 ] 

Erik Krogen commented on HDFS-14462:


Looks like I was wrong when I thought I had verified that this exists in trunk. 
Thanks [~simbadzina] for figuring this out.

> WebHDFS throws "Error writing request body to server" instead of 
> DSQuotaExceededException
> -
>
> Key: HDFS-14462
> URL: https://issues.apache.org/jira/browse/HDFS-14462
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.2.0, 2.9.2, 3.0.3, 2.8.5, 2.7.7, 3.1.2
>Reporter: Erik Krogen
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>
> We noticed recently in our environment that, when writing data to HDFS via 
> WebHDFS, a quota exception is returned to the client as:
> {code}
> java.io.IOException: Error writing request body to server
> at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3536)
>  ~[?:1.8.0_172]
> at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3519)
>  ~[?:1.8.0_172]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[?:1.8.0_172]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[?:1.8.0_172]
> at java.io.FilterOutputStream.flush(FilterOutputStream.java:140) 
> ~[?:1.8.0_172]
> at java.io.DataOutputStream.flush(DataOutputStream.java:123) 
> ~[?:1.8.0_172]
> {code}
> It is entirely opaque to the user that this exception was caused because they 
> exceeded their quota. Yet in the DataNode logs:
> {code}
> 2019-04-24 02:13:09,639 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
> Exception
> org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota 
> of /foo/path/here is exceeded: quota =  B = X TB but diskspace 
> consumed =  B = X TB
> at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyStoragespaceQuota(DirectoryWithQuotaFeature.java:211)
> at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyQuota(DirectoryWithQuotaFeature.java:239)
> {code}
> This was on a 2.7.x cluster, but I verified that the same logic exists on 
> trunk. I believe we need to fix some of the logic within the 
> {{ExceptionHandler}} to add special handling for the quota exception.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12703) Exceptions are fatal to decommissioning monitor

2019-07-25 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893110#comment-16893110
 ] 

Erik Krogen commented on HDFS-12703:


Hey folks, this seems like an issue with pretty severe negative consequences in 
some scenarios; should we backport it to older branches (branch-2) as well? 
[~hexiaoqiao] [~xuel1], let me know if you're interested.

> Exceptions are fatal to decommissioning monitor
> ---
>
> Key: HDFS-12703
> URL: https://issues.apache.org/jira/browse/HDFS-12703
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Daryn Sharp
>Assignee: He Xiaoqiao
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12703.001.patch, HDFS-12703.002.patch, 
> HDFS-12703.003.patch, HDFS-12703.004.patch, HDFS-12703.005.patch, 
> HDFS-12703.006.patch, HDFS-12703.007.patch, HDFS-12703.008.patch, 
> HDFS-12703.009.patch, HDFS-12703.010.patch, HDFS-12703.011.patch, 
> HDFS-12703.012.patch, HDFS-12703.013.patch
>
>
> The {{DecommissionManager.Monitor}} runs as an executor scheduled task.  If 
> an exception occurs, all decommissioning ceases until the NN is restarted.  
> Per javadoc for {{executor#scheduleAtFixedRate}}: *If any execution of the 
> task encounters an exception, subsequent executions are suppressed*.  The 
> monitor thread is alive but blocked waiting for an executor task that will 
> never come.  The code currently disposes of the future so the actual 
> exception that aborted the task is gone.
> Failover is insufficient since the task is also likely dead on the standby.  
> Replication queue init after the transition to active will fix the under 
> replication of blocks on currently decommissioning nodes but future nodes 
> never decommission.  The standby must be bounced prior to failover – and 
> hopefully the error condition does not reoccur.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13783) Balancer: make balancer to be a long service process for easy to monitor it.

2019-07-25 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893146#comment-16893146
 ] 

Erik Krogen commented on HDFS-13783:


[~zhangchen], thanks for the changes, it's looking really good. Besides fixing 
the checkstyle warnings reported by Jenkins, I have a few small comments:
* You still have this typo: {{scheduleInteral}} -> {{scheduleInterval}}
* I don't really think we should catch {{InterruptedException}} here:
{code}
Thread.sleep(scheduleInteval);
  } catch (InterruptedException ie) {
if (++tried > retryOnException) {
  throw ie;
}
{code}
If it was interrupted, we should probably respect that and exit (see the sketch 
after this list).
* Within {{hdfs-default.xml}}, the description has a few spelling and grammar 
errors, I think it should say:
{quote}
When the balancer is executed as a long-running service, it will retry upon 
encountering an exception. This configuration determines how many times it will 
retry before considering the exception to be fatal and quitting.
{quote}
* In {{testBalancerServiceOnError}}, maybe we can use 
{{GenericTestUtils.LogCapturer}} to verify that the Balancer service actually 
had to retry? Or, add a {{retryCount}} variable that tracks the number of 
retries that were necessary (this could be useful for HDFS-10648 as well). If 
this ends up being too difficult I'm okay with leaving it as-is.
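
Regarding the {{InterruptedException}} point above, a minimal sketch of what I had 
in mind (illustrative only, not a drop-in replacement for the patch):
{code}
try {
  Thread.sleep(scheduleInterval);
} catch (InterruptedException ie) {
  // Respect the interrupt: restore the flag and let the service loop exit,
  // rather than counting it against the retry budget.
  Thread.currentThread().interrupt();
  break;
}
{code}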

> Balancer: make balancer to be a long service process for easy to monitor it.
> 
>
> Key: HDFS-13783
> URL: https://issues.apache.org/jira/browse/HDFS-13783
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer & mover
>Reporter: maobaolong
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-13783-001.patch, HDFS-13783-002.patch, 
> HDFS-13783.003.patch, HDFS-13783.004.patch
>
>
> If we had a long-running balancer service process, like the namenode and 
> datanode, we could collect balancer metrics that tell us the balancer's status 
> and the number of blocks it has moved.
> We could also get or set the balancing plan through a balancer web UI. Many 
> things become possible once the balancer runs as a long-lived service.
> So, shall we start to plan the new Balancer? Hopefully this feature can make it 
> into the next release of Hadoop.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14135) TestWebHdfsTimeouts Fails intermittently in trunk

2019-07-26 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893906#comment-16893906
 ] 

Erik Krogen commented on HDFS-14135:


[~iwasakims] and [~ayushtkn], this broke the branch-2 build because it uses 
lambdas. I just reverted it.

> TestWebHdfsTimeouts Fails intermittently in trunk
> -
>
> Key: HDFS-14135
> URL: https://issues.apache.org/jira/browse/HDFS-14135
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 2.10.0, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14135-01.patch, HDFS-14135-02.patch, 
> HDFS-14135-03.patch, HDFS-14135-04.patch, HDFS-14135-05.patch, 
> HDFS-14135-06.patch, HDFS-14135-07.patch, HDFS-14135-08.patch, 
> HDFS-14135.009.patch, HDFS-14135.010.patch, HDFS-14135.011.patch, 
> HDFS-14135.012.patch, HDFS-14135.013.patch
>
>
> Reference to failure
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/982/testReport/junit/org.apache.hadoop.hdfs.web/TestWebHdfsTimeouts/



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14135) TestWebHdfsTimeouts Fails intermittently in trunk

2019-07-26 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14135:
---
Fix Version/s: (was: 2.10.0)

> TestWebHdfsTimeouts Fails intermittently in trunk
> -
>
> Key: HDFS-14135
> URL: https://issues.apache.org/jira/browse/HDFS-14135
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14135-01.patch, HDFS-14135-02.patch, 
> HDFS-14135-03.patch, HDFS-14135-04.patch, HDFS-14135-05.patch, 
> HDFS-14135-06.patch, HDFS-14135-07.patch, HDFS-14135-08.patch, 
> HDFS-14135.009.patch, HDFS-14135.010.patch, HDFS-14135.011.patch, 
> HDFS-14135.012.patch, HDFS-14135.013.patch
>
>
> Reference to failure
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/982/testReport/junit/org.apache.hadoop.hdfs.web/TestWebHdfsTimeouts/



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14672) Backport HDFS-12703 to branch-2

2019-07-26 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893911#comment-16893911
 ] 

Erik Krogen commented on HDFS-14672:


HDFS-14135 broke the build; I just reverted it, so it should work now. 

{quote}
Please let me know which branch-2.* do you want to backport?
{quote}
Personally I only have interest in branch-2, but maybe it makes sense to do 2.8 
and 2.9 as well. If it applies cleanly I see no reason not to.

> Backport HDFS-12703 to branch-2
> ---
>
> Key: HDFS-14672
> URL: https://issues.apache.org/jira/browse/HDFS-14672
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-12703.branch-2.001.patch
>
>
> Currently, `decommission monitor exception cause namenode fatal` is only in 
> trunk (branch-3). This JIRA aims to backport this bugfix to branch-2.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13783) Balancer: make balancer to be a long service process for easy to monitor it.

2019-07-26 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893983#comment-16893983
 ] 

Erik Krogen commented on HDFS-13783:


It looks like {{TestDirectoryScanner}} is already tracked in HDFS-14669.

The new methods added for monitoring look great! I think we can actually 
leverage this to make the test more robust and run faster:
{code}
  Thread.sleep(1);
  assertTrue(Balancer.getExceptionsSinceLastBalance() > 0);
{code}
we should be able to replace this with:
{code}
GenericTestUtils.waitFor(() -> Balancer.getExceptionsSinceLastBalance() > 0, 
1000, 2);
{code}
Also, the newly added fields should probably be {{volatile}} since they may be 
accessed from a thread besides the one that is doing the updating.

Other than this, we still need to fix the checkstyle. I usually use the diff 
report produced by Jenkins:
{code}
checkstyle  
https://builds.apache.org/job/PreCommit-HDFS-Build/27297/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
{code}
This shows only the diff and not any existing issues. You can also do this 
yourself using the 
[{{test-patch}}|https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute#HowToContribute-Testingyourpatch]
 script included in the repo.

> Balancer: make balancer to be a long service process for easy to monitor it.
> 
>
> Key: HDFS-13783
> URL: https://issues.apache.org/jira/browse/HDFS-13783
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer & mover
>Reporter: maobaolong
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-13783-001.patch, HDFS-13783-002.patch, 
> HDFS-13783.003.patch, HDFS-13783.004.patch, HDFS-13783.005.patch
>
>
> If we had a long-running balancer service process, like the namenode and 
> datanode, we could collect balancer metrics that tell us the balancer's status 
> and the number of blocks it has moved.
> We could also get or set the balancing plan through a balancer web UI. Many 
> things become possible once the balancer runs as a long-lived service.
> So, shall we start to plan the new Balancer? Hopefully this feature can make it 
> into the next release of Hadoop.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14135) TestWebHdfsTimeouts Fails intermittently in trunk

2019-07-26 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894049#comment-16894049
 ] 

Erik Krogen commented on HDFS-14135:


Actually this broke all branches besides trunk. {{AssumptionViolatedException}} 
was used, but this class is only present in [JUnit 
4.12|https://junit.org/junit4/javadoc/4.12/org/junit/AssumptionViolatedException.html]
 and newer, while the older branches are on 4.11. I reverted it from all branches 
besides trunk (branch-3.2 and branch-3.1).

> TestWebHdfsTimeouts Fails intermittently in trunk
> -
>
> Key: HDFS-14135
> URL: https://issues.apache.org/jira/browse/HDFS-14135
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14135-01.patch, HDFS-14135-02.patch, 
> HDFS-14135-03.patch, HDFS-14135-04.patch, HDFS-14135-05.patch, 
> HDFS-14135-06.patch, HDFS-14135-07.patch, HDFS-14135-08.patch, 
> HDFS-14135.009.patch, HDFS-14135.010.patch, HDFS-14135.011.patch, 
> HDFS-14135.012.patch, HDFS-14135.013.patch
>
>
> Reference to failure
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/982/testReport/junit/org.apache.hadoop.hdfs.web/TestWebHdfsTimeouts/



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14135) TestWebHdfsTimeouts Fails intermittently in trunk

2019-07-26 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14135:
---
Fix Version/s: (was: 3.1.3)
   (was: 3.2.1)

> TestWebHdfsTimeouts Fails intermittently in trunk
> -
>
> Key: HDFS-14135
> URL: https://issues.apache.org/jira/browse/HDFS-14135
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14135-01.patch, HDFS-14135-02.patch, 
> HDFS-14135-03.patch, HDFS-14135-04.patch, HDFS-14135-05.patch, 
> HDFS-14135-06.patch, HDFS-14135-07.patch, HDFS-14135-08.patch, 
> HDFS-14135.009.patch, HDFS-14135.010.patch, HDFS-14135.011.patch, 
> HDFS-14135.012.patch, HDFS-14135.013.patch
>
>
> Reference to failure
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/982/testReport/junit/org.apache.hadoop.hdfs.web/TestWebHdfsTimeouts/



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14672) Backport HDFS-12703 to branch-2

2019-07-26 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894132#comment-16894132
 ] 

Erik Krogen commented on HDFS-14672:


[~elgoiri] I wonder if you're interested in helping to review this branch-2 
backport since you did the trunk review? I'm happy to do the work of getting it 
in the branches. Let me know if you don't have time and I'll try to understand 
the patch next week.

> Backport HDFS-12703 to branch-2
> ---
>
> Key: HDFS-14672
> URL: https://issues.apache.org/jira/browse/HDFS-14672
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-12703.branch-2.001.patch
>
>
> Currently, `decommission monitor exception cause namenode fatal` is only in 
> trunk (branch-3). This JIRA aims to backport this bugfix to branch-2.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14639) [Dynamometer] Unnecessary duplicate bin directory appears in dist layout

2019-07-29 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895574#comment-16895574
 ] 

Erik Krogen edited comment on HDFS-14639 at 7/29/19 8:50 PM:
-

Thanks [~jojochuang]! I just committed this to trunk.


was (Author: xkrogen):
Thanks [~jojochuang]!

> [Dynamometer] Unnecessary duplicate bin directory appears in dist layout
> 
>
> Key: HDFS-14639
> URL: https://issues.apache.org/jira/browse/HDFS-14639
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, test
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14639.000.patch
>
>
> The bin files get put into the 
> {{share/hadoop/tools/dynamometer/dynamometer-*/bin}} locations as expected:
> {code}
> ekrogen at ekrogen-mn6 in 
> ~/dev/hadoop/trunk/hadoop-dist/target/hadoop-3.3.0-SNAPSHOT on 
> ekrogen-HDFS-14410-dyno-docs!
> ± ls share/hadoop/tools/dynamometer/dynamometer-*/bin
> share/hadoop/tools/dynamometer/dynamometer-blockgen/bin:
> generate-block-lists.sh
> share/hadoop/tools/dynamometer/dynamometer-infra/bin:
> create-slim-hadoop-tar.shparse-metrics.sh 
> start-dynamometer-cluster.sh upload-fsimage.sh
> share/hadoop/tools/dynamometer/dynamometer-workload/bin:
> parse-start-timestamp.sh start-workload.sh
> {code}
> But for blockgen specifically, it also ends up in another folder:
> {code}
> ekrogen at ekrogen-mn6 in 
> ~/dev/hadoop/trunk/hadoop-dist/target/hadoop-3.3.0-SNAPSHOT on 
> ekrogen-HDFS-14410-dyno-docs!
> ± ls share/hadoop/tools/dynamometer-blockgen/bin
> generate-block-lists.sh
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14639) [Dynamometer] Unnecessary duplicate bin directory appears in dist layout

2019-07-29 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895574#comment-16895574
 ] 

Erik Krogen commented on HDFS-14639:


Thanks [~jojochuang]!

> [Dynamometer] Unnecessary duplicate bin directory appears in dist layout
> 
>
> Key: HDFS-14639
> URL: https://issues.apache.org/jira/browse/HDFS-14639
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, test
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14639.000.patch
>
>
> The bin files get put into the 
> {{share/hadoop/tools/dynamometer/dynamometer-*/bin}} locations as expected:
> {code}
> ekrogen at ekrogen-mn6 in 
> ~/dev/hadoop/trunk/hadoop-dist/target/hadoop-3.3.0-SNAPSHOT on 
> ekrogen-HDFS-14410-dyno-docs!
> ± ls share/hadoop/tools/dynamometer/dynamometer-*/bin
> share/hadoop/tools/dynamometer/dynamometer-blockgen/bin:
> generate-block-lists.sh
> share/hadoop/tools/dynamometer/dynamometer-infra/bin:
> create-slim-hadoop-tar.shparse-metrics.sh 
> start-dynamometer-cluster.sh upload-fsimage.sh
> share/hadoop/tools/dynamometer/dynamometer-workload/bin:
> parse-start-timestamp.sh start-workload.sh
> {code}
> But for blockgen specifically, it also ends up in another folder:
> {code}
> ekrogen at ekrogen-mn6 in 
> ~/dev/hadoop/trunk/hadoop-dist/target/hadoop-3.3.0-SNAPSHOT on 
> ekrogen-HDFS-14410-dyno-docs!
> ± ls share/hadoop/tools/dynamometer-blockgen/bin
> generate-block-lists.sh
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14639) [Dynamometer] Unnecessary duplicate bin directory appears in dist layout

2019-07-29 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14639:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> [Dynamometer] Unnecessary duplicate bin directory appears in dist layout
> 
>
> Key: HDFS-14639
> URL: https://issues.apache.org/jira/browse/HDFS-14639
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, test
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14639.000.patch
>
>
> The bin files get put into the 
> {{share/hadoop/tools/dynamometer/dynamometer-*/bin}} locations as expected:
> {code}
> ekrogen at ekrogen-mn6 in 
> ~/dev/hadoop/trunk/hadoop-dist/target/hadoop-3.3.0-SNAPSHOT on 
> ekrogen-HDFS-14410-dyno-docs!
> ± ls share/hadoop/tools/dynamometer/dynamometer-*/bin
> share/hadoop/tools/dynamometer/dynamometer-blockgen/bin:
> generate-block-lists.sh
> share/hadoop/tools/dynamometer/dynamometer-infra/bin:
> create-slim-hadoop-tar.shparse-metrics.sh 
> start-dynamometer-cluster.sh upload-fsimage.sh
> share/hadoop/tools/dynamometer/dynamometer-workload/bin:
> parse-start-timestamp.sh start-workload.sh
> {code}
> But for blockgen specifically, it also ends up in another folder:
> {code}
> ekrogen at ekrogen-mn6 in 
> ~/dev/hadoop/trunk/hadoop-dist/target/hadoop-3.3.0-SNAPSHOT on 
> ekrogen-HDFS-14410-dyno-docs!
> ± ls share/hadoop/tools/dynamometer-blockgen/bin
> generate-block-lists.sh
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14639) [Dynamometer] Unnecessary duplicate bin directory appears in dist layout

2019-07-29 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14639:
---
Fix Version/s: 3.3.0

> [Dynamometer] Unnecessary duplicate bin directory appears in dist layout
> 
>
> Key: HDFS-14639
> URL: https://issues.apache.org/jira/browse/HDFS-14639
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, test
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14639.000.patch
>
>
> The bin files get put into the 
> {{share/hadoop/tools/dynamometer/dynamometer-*/bin}} locations as expected:
> {code}
> ekrogen at ekrogen-mn6 in 
> ~/dev/hadoop/trunk/hadoop-dist/target/hadoop-3.3.0-SNAPSHOT on 
> ekrogen-HDFS-14410-dyno-docs!
> ± ls share/hadoop/tools/dynamometer/dynamometer-*/bin
> share/hadoop/tools/dynamometer/dynamometer-blockgen/bin:
> generate-block-lists.sh
> share/hadoop/tools/dynamometer/dynamometer-infra/bin:
> create-slim-hadoop-tar.shparse-metrics.sh 
> start-dynamometer-cluster.sh upload-fsimage.sh
> share/hadoop/tools/dynamometer/dynamometer-workload/bin:
> parse-start-timestamp.sh start-workload.sh
> {code}
> But for blockgen specifically, it also ends up in another folder:
> {code}
> ekrogen at ekrogen-mn6 in 
> ~/dev/hadoop/trunk/hadoop-dist/target/hadoop-3.3.0-SNAPSHOT on 
> ekrogen-HDFS-14410-dyno-docs!
> ± ls share/hadoop/tools/dynamometer-blockgen/bin
> generate-block-lists.sh
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13783) Balancer: make balancer to be a long service process for easy to monitor it.

2019-07-29 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895609#comment-16895609
 ] 

Erik Krogen commented on HDFS-13783:


LGTM! Last ping for any of the watchers to take a look; otherwise I will commit 
this tomorrow.

> Balancer: make balancer to be a long service process for easy to monitor it.
> 
>
> Key: HDFS-13783
> URL: https://issues.apache.org/jira/browse/HDFS-13783
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer & mover
>Reporter: maobaolong
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-13783-001.patch, HDFS-13783-002.patch, 
> HDFS-13783.003.patch, HDFS-13783.004.patch, HDFS-13783.005.patch, 
> HDFS-13783.006.patch
>
>
> If we had a long-running balancer service process, like the namenode and 
> datanode, we could collect balancer metrics that tell us the balancer's status 
> and the number of blocks it has moved.
> We could also get or set the balancing plan through a balancer web UI. Many 
> things become possible once the balancer runs as a long-lived service.
> So, shall we start to plan the new Balancer? Hopefully this feature can make it 
> into the next release of Hadoop.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12703) Exceptions are fatal to decommissioning monitor

2019-07-29 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-12703:
---
Fix Version/s: 3.0.4
   2.10.0

> Exceptions are fatal to decommissioning monitor
> ---
>
> Key: HDFS-12703
> URL: https://issues.apache.org/jira/browse/HDFS-12703
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Daryn Sharp
>Assignee: He Xiaoqiao
>Priority: Critical
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12703.001.patch, HDFS-12703.002.patch, 
> HDFS-12703.003.patch, HDFS-12703.004.patch, HDFS-12703.005.patch, 
> HDFS-12703.006.patch, HDFS-12703.007.patch, HDFS-12703.008.patch, 
> HDFS-12703.009.patch, HDFS-12703.010.patch, HDFS-12703.011.patch, 
> HDFS-12703.012.patch, HDFS-12703.013.patch
>
>
> The {{DecommissionManager.Monitor}} runs as an executor scheduled task.  If 
> an exception occurs, all decommissioning ceases until the NN is restarted.  
> Per javadoc for {{executor#scheduleAtFixedRate}}: *If any execution of the 
> task encounters an exception, subsequent executions are suppressed*.  The 
> monitor thread is alive but blocked waiting for an executor task that will 
> never come.  The code currently disposes of the future so the actual 
> exception that aborted the task is gone.
> Failover is insufficient since the task is also likely dead on the standby.  
> Replication queue init after the transition to active will fix the under 
> replication of blocks on currently decommissioning nodes but future nodes 
> never decommission.  The standby must be bounced prior to failover – and 
> hopefully the error condition does not reoccur.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14672) Backport HDFS-12703 to branch-2

2019-07-29 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895610#comment-16895610
 ] 

Erik Krogen commented on HDFS-14672:


Thanks [~hexiaoqiao] and [~elgoiri]! I just committed this to branch-2. I also 
cherry-picked the branch-3.1 commit into branch-3.0 for the sake of consistency.

> Backport HDFS-12703 to branch-2
> ---
>
> Key: HDFS-14672
> URL: https://issues.apache.org/jira/browse/HDFS-14672
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-12703.branch-2.001.patch, 
> HDFS-12703.branch-2.002.patch, HDFS-12703.branch-2.003.patch
>
>
> Currently, `decommission monitor exception cause namenode fatal` is only in 
> trunk (branch-3). This JIRA aims to backport this bugfix to branch-2.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14672) Backport HDFS-12703 to branch-2

2019-07-29 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14672:
---
Fix Version/s: 2.10.0

> Backport HDFS-12703 to branch-2
> ---
>
> Key: HDFS-14672
> URL: https://issues.apache.org/jira/browse/HDFS-14672
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Fix For: 2.10.0
>
> Attachments: HDFS-12703.branch-2.001.patch, 
> HDFS-12703.branch-2.002.patch, HDFS-12703.branch-2.003.patch
>
>
> Currently, `decommission monitor exception cause namenode fatal` is only in 
> trunk (branch-3). This JIRA aims to backport this bugfix to branch-2.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14672) Backport HDFS-12703 to branch-2

2019-07-29 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14672:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Backport HDFS-12703 to branch-2
> ---
>
> Key: HDFS-14672
> URL: https://issues.apache.org/jira/browse/HDFS-14672
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-12703.branch-2.001.patch, 
> HDFS-12703.branch-2.002.patch, HDFS-12703.branch-2.003.patch
>
>
> Currently, `decommission monitor exception cause namenode fatal` is only in 
> trunk (branch-3). This JIRA aims to backport this bugfix to branch-2.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12703) Exceptions are fatal to decommissioning monitor

2019-07-29 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895611#comment-16895611
 ] 

Erik Krogen commented on HDFS-12703:


Marked the fix version of 2.10 here as this was committed into branch-2 as part 
of HDFS-14672. I also landed this in branch-3.0 along the way.

> Exceptions are fatal to decommissioning monitor
> ---
>
> Key: HDFS-12703
> URL: https://issues.apache.org/jira/browse/HDFS-12703
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Daryn Sharp
>Assignee: He Xiaoqiao
>Priority: Critical
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-12703.001.patch, HDFS-12703.002.patch, 
> HDFS-12703.003.patch, HDFS-12703.004.patch, HDFS-12703.005.patch, 
> HDFS-12703.006.patch, HDFS-12703.007.patch, HDFS-12703.008.patch, 
> HDFS-12703.009.patch, HDFS-12703.010.patch, HDFS-12703.011.patch, 
> HDFS-12703.012.patch, HDFS-12703.013.patch
>
>
> The {{DecommissionManager.Monitor}} runs as an executor scheduled task.  If 
> an exception occurs, all decommissioning ceases until the NN is restarted.  
> Per javadoc for {{executor#scheduleAtFixedRate}}: *If any execution of the 
> task encounters an exception, subsequent executions are suppressed*.  The 
> monitor thread is alive but blocked waiting for an executor task that will 
> never come.  The code currently disposes of the future so the actual 
> exception that aborted the task is gone.
> Failover is insufficient since the task is also likely dead on the standby.  
> Replication queue init after the transition to active will fix the under 
> replication of blocks on currently decommissioning nodes but future nodes 
> never decommission.  The standby must be bounced prior to failover – and 
> hopefully the error condition does not reoccur.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-07-29 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895614#comment-16895614
 ] 

Erik Krogen commented on HDFS-14370:


Hey [~ayushtkn], thanks for taking a look. You raise great points. I am 
thinking we have two options:

# Set the default value of the maximum time to be -1. If this value is 
encountered, disable the backoff by setting the maximum to be equal to the 
minimum time. I think the drawback here is that -1 could also be interpreted as 
"no maximum," so this behavior may be misleading to some users.
# Change the backoff config to be a multiplier of the minimum sleep time. Set 
the default to be 1, effectively disabling backoff. This has the advantage of 
more consistent behavior, but determining a reasonable value may be a bit more 
difficult (more math involved).

Let me know what you think.
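
For concreteness, a rough sketch of how the two options would resolve the 
effective maximum sleep time (the config key names below are hypothetical, and 
only one of the two options would actually exist):
{code}
long minSleepMs = conf.getLong(TAIL_EDITS_PERIOD_KEY, TAIL_EDITS_PERIOD_DEFAULT);
long maxSleepMs;

// Option 1: absolute maximum, default -1 meaning "backoff disabled".
long configuredMaxMs = conf.getLong(TAIL_EDITS_PERIOD_BACKOFF_MAX_KEY, -1L);
maxSleepMs = configuredMaxMs < 0 ? minSleepMs : configuredMaxMs;

// Option 2: multiplier of the minimum, default 1 meaning "backoff disabled".
long multiplier = conf.getLong(TAIL_EDITS_BACKOFF_MULTIPLIER_KEY, 1L);
maxSleepMs = minSleepMs * multiplier;
{code}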

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14370.000.patch
>
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14462) WebHDFS throws "Error writing request body to server" instead of DSQuotaExceededException

2019-07-30 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896462#comment-16896462
 ] 

Erik Krogen edited comment on HDFS-14462 at 7/30/19 7:58 PM:
-

Good find, thanks [~simbadzina]. I took a look at the v1 patch:
# It's possible for {{validateResponse}} not to throw anything, so I think we 
need to do:
{code}
  } catch (IOException e) {
validateResponse(op, conn, true);
throw e;
  }
{code}
to ensure that we don't swallow and permanently lose a failure. Maybe we should 
also log {{e}} since we are masking it? I'm not sure if it will ever contain 
useful information.
# You have an unused {{DFSAdmin}} in your test
# You can just use {{assertTrue()}} instead of {{Assert.assertTrue()}}
# In your {{setQuota}} command you're also setting the {{namespaceQuota}} equal 
to the {{spaceQuota}}, you probably want {{HdfsConstants#QUOTA_DONT_SET}} for 
name quota
# Can we use smaller quota and file sizes? 500MB seems pretty large
# Right now if the {{DSQuotaExceededException}} is thrown from the {{close()}} 
call, the test still succeeds. Can we make the test enforce that it should be 
the {{write()}} method which throws?
# Make sure to fix the checkstyle, and also typically we don't use star-imports 
({{import static package.Class.*}})


was (Author: xkrogen):
Good find, thanks [~simbadzina]. I took a look at the v1 patch:
# It's possible for {{validateResponse}} not to throw anything, so I think we 
need to do:
{code}
  } catch (IOException e) {
validateResponse(op, conn, true);
throw e;
  }
{code}
to ensure that we don't swallow and permanently lose a failure. Maybe we should 
also log {{e}} since we are masking it? I'm not sure if it will ever contain 
useful information.
# You have an unused {{DFSAdmin}} in your test
# You can just use {{assertTrue()}} instead of {{Assert.assertTrue()}}
# In your {{setQuota}} command you're also setting the {{namespaceQuota}} equal 
to the {{spaceQuota}}, you probably want {{HdfsConstants#QUOTA_DONT_SET}} for 
name quota
# Can we use smaller quota and file sizes? 500MB seems pretty large
# Right now if the {{DSQuotaExceededException}} is thrown from the {{close()}} 
call, the test still succeeds. Can we make the test enforce that it should be 
the {{write()}} method which throws?

> WebHDFS throws "Error writing request body to server" instead of 
> DSQuotaExceededException
> -
>
> Key: HDFS-14462
> URL: https://issues.apache.org/jira/browse/HDFS-14462
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.2.0, 2.9.2, 3.0.3, 2.8.5, 2.7.7, 3.1.2
>Reporter: Erik Krogen
>Assignee: Simbarashe Dzinamarira
>Priority: Major
> Attachments: HDFS-14462.001.patch
>
>
> We noticed recently in our environment that, when writing data to HDFS via 
> WebHDFS, a quota exception is returned to the client as:
> {code}
> java.io.IOException: Error writing request body to server
> at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3536)
>  ~[?:1.8.0_172]
> at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3519)
>  ~[?:1.8.0_172]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[?:1.8.0_172]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[?:1.8.0_172]
> at java.io.FilterOutputStream.flush(FilterOutputStream.java:140) 
> ~[?:1.8.0_172]
> at java.io.DataOutputStream.flush(DataOutputStream.java:123) 
> ~[?:1.8.0_172]
> {code}
> It is entirely opaque to the user that this exception was caused because they 
> exceeded their quota. Yet in the DataNode logs:
> {code}
> 2019-04-24 02:13:09,639 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
> Exception
> org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota 
> of /foo/path/here is exceeded: quota =  B = X TB but diskspace 
> consumed =  B = X TB
> at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyStoragespaceQuota(DirectoryWithQuotaFeature.java:211)
> at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyQuota(DirectoryWithQuotaFeature.java:239)
> {code}
> This was on a 2.7.x cluster, but I verified that the same logic exists on 
> trunk. I believe we need to fix some of the logic within the 
> {{ExceptionHandler}} to add special handling for the quota exception.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (HDFS-14462) WebHDFS throws "Error writing request body to server" instead of DSQuotaExceededException

2019-07-30 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896462#comment-16896462
 ] 

Erik Krogen commented on HDFS-14462:


Good find, thanks [~simbadzina]. I took a look at the v1 patch:
# It's possible for {{validateResponse}} not to throw anything, so I think we 
need to do:
{code}
  } catch (IOException e) {
validateResponse(op, conn, true);
throw e;
  }
{code}
to ensure that we don't swallow and permanently lose a failure. Maybe we should 
also log {{e}} since we are masking it? I'm not sure if it will ever contain 
useful information.
# You have an unused {{DFSAdmin}} in your test
# You can just use {{assertTrue()}} instead of {{Assert.assertTrue()}}
# In your {{setQuota}} command you're also setting the {{namespaceQuota}} equal 
to the {{spaceQuota}}, you probably want {{HdfsConstants#QUOTA_DONT_SET}} for 
name quota
# Can we use smaller quota and file sizes? 500MB seems pretty large
# Right now if the {{DSQuotaExceededException}} is thrown from the {{close()}} 
call, the test still succeeds. Can we make the test enforce that it should be 
the {{write()}} method which throws?
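
On point 6, something along these lines would force the failure to come from 
{{write()}} (sketch only; the variable names are illustrative):
{code}
FSDataOutputStream out = webHdfs.create(file);
try {
  out.write(new byte[fileSize]);
  fail("Expected DSQuotaExceededException from write()");
} catch (DSQuotaExceededException expected) {
  // Quota violation surfaced from write(), as desired.
} finally {
  IOUtils.closeStream(out);  // a failure during close() must not let the test pass
}
{code}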

> WebHDFS throws "Error writing request body to server" instead of 
> DSQuotaExceededException
> -
>
> Key: HDFS-14462
> URL: https://issues.apache.org/jira/browse/HDFS-14462
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.2.0, 2.9.2, 3.0.3, 2.8.5, 2.7.7, 3.1.2
>Reporter: Erik Krogen
>Assignee: Simbarashe Dzinamarira
>Priority: Major
> Attachments: HDFS-14462.001.patch
>
>
> We noticed recently in our environment that, when writing data to HDFS via 
> WebHDFS, a quota exception is returned to the client as:
> {code}
> java.io.IOException: Error writing request body to server
> at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3536)
>  ~[?:1.8.0_172]
> at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3519)
>  ~[?:1.8.0_172]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[?:1.8.0_172]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[?:1.8.0_172]
> at java.io.FilterOutputStream.flush(FilterOutputStream.java:140) 
> ~[?:1.8.0_172]
> at java.io.DataOutputStream.flush(DataOutputStream.java:123) 
> ~[?:1.8.0_172]
> {code}
> It is entirely opaque to the user that this exception was caused because they 
> exceeded their quota. Yet in the DataNode logs:
> {code}
> 2019-04-24 02:13:09,639 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
> Exception
> org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota 
> of /foo/path/here is exceeded: quota =  B = X TB but diskspace 
> consumed =  B = X TB
> at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyStoragespaceQuota(DirectoryWithQuotaFeature.java:211)
> at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyQuota(DirectoryWithQuotaFeature.java:239)
> {code}
> This was on a 2.7.x cluster, but I verified that the same logic exists on 
> trunk. I believe we need to fix some of the logic within the 
> {{ExceptionHandler}} to add special handling for the quota exception.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13783) Balancer: make balancer to be a long service process for easy to monitor it.

2019-07-30 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-13783:
---
   Resolution: Fixed
Fix Version/s: 3.3.0
   Status: Resolved  (was: Patch Available)

I just committed this to trunk. Thanks for the contribution [~zhangchen]!

> Balancer: make balancer to be a long service process for easy to monitor it.
> 
>
> Key: HDFS-13783
> URL: https://issues.apache.org/jira/browse/HDFS-13783
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer & mover
>Reporter: maobaolong
>Assignee: Chen Zhang
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-13783-001.patch, HDFS-13783-002.patch, 
> HDFS-13783.003.patch, HDFS-13783.004.patch, HDFS-13783.005.patch, 
> HDFS-13783.006.patch
>
>
> If we have a long service process of balancer, like namenode, datanode, we 
> can get metrics of balancer, the metrics can tell us the status of balancer, 
> the amount of block it has moved, 
> We can get or set the balance plan by the balancer webUI. So many things we 
> can do if we have a long balancer service process.
> So, shall we start to plan the new Balancer? Hope this feature can enter the 
> next release of hadoop.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13783) Balancer: make balancer to be a long service process for easy to monitor it.

2019-07-30 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-13783:
---
Release Note: Adds a new parameter to the Balancer CLI, "-asService", to 
enable the process to be long-running.

> Balancer: make balancer to be a long service process for easy to monitor it.
> 
>
> Key: HDFS-13783
> URL: https://issues.apache.org/jira/browse/HDFS-13783
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer & mover
>Reporter: maobaolong
>Assignee: Chen Zhang
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-13783-001.patch, HDFS-13783-002.patch, 
> HDFS-13783.003.patch, HDFS-13783.004.patch, HDFS-13783.005.patch, 
> HDFS-13783.006.patch
>
>
> If we have a long service process of balancer, like namenode, datanode, we 
> can get metrics of balancer, the metrics can tell us the status of balancer, 
> the amount of block it has moved, 
> We can get or set the balance plan by the balancer webUI. So many things we 
> can do if we have a long balancer service process.
> So, shall we start to plan the new Balancer? Hope this feature can enter the 
> next release of hadoop.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14662) Document the usage of the new Balancer "asService" parameter

2019-07-31 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897277#comment-16897277
 ] 

Erik Krogen commented on HDFS-14662:


[~zhangchen] thanks for tackling this! Can we also add some information on the 
new configuration parameters that were added to adjust the balancer service's 
behavior?

> Document the usage of the new Balancer "asService" parameter
> 
>
> Key: HDFS-14662
> URL: https://issues.apache.org/jira/browse/HDFS-14662
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Chen Zhang
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14662.001.patch
>
>
> see HDFS-13783, this jira add document for how to run balancer as a long 
> service



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-07-31 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14370:
---
Attachment: HDFS-14370.001.patch

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14370.000.patch, HDFS-14370.001.patch
>
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-07-31 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897319#comment-16897319
 ] 

Erik Krogen commented on HDFS-14370:


I don't really like treating empty values specially, as it makes it impossible 
to override back to the "empty" behavior when specifying options on the command 
line (using {{-Dconfig.key=value}} flags). I've updated the patch to simply 
consider any negative values (default of -1) as disabling backoff; I also made 
it more clear in the code that backoff is explicitly disabled in this case.

v001 patch also includes a test case.

{quote}
For the 0 case may be we should consider, not allowing back-off, Doesn't as 
such make sense, since fractions of no load and we are in back-off state, that 
too the time to start with we have decided..but still can be discussed.
{quote}
I don't understand this part; can you explain further? Are you referring to 
when the config key is set to 0?
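
To make the intended semantics concrete, here is a rough sketch of the behavior 
described above; the config key, class, and method names are illustrative only 
and not the ones in the actual patch:
{code:java}
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.conf.Configuration;

/** Sketch only: a non-positive maximum (default -1) disables backoff. */
class TailerBackoffSketch {
  private final long tailPeriodMs;  // regular fast-path tailing period
  private final long maxBackoffMs;  // <= 0 disables backoff entirely
  private long currentSleepMs;

  TailerBackoffSketch(Configuration conf, long tailPeriodMs) {
    this.tailPeriodMs = tailPeriodMs;
    this.maxBackoffMs = conf.getTimeDuration(
        "dfs.ha.tail-edits.backoff-max.example", -1, TimeUnit.MILLISECONDS);
    this.currentSleepMs = tailPeriodMs;
  }

  /** Called after each tailing RPC; returns how long to wait before the next. */
  long nextSleepMs(boolean receivedEdits) {
    if (maxBackoffMs <= 0 || receivedEdits) {
      // Backoff disabled, or the last RPC returned edits: use the normal period.
      currentSleepMs = tailPeriodMs;
    } else {
      // Empty response: double the wait (starting from at least 1ms when the
      // regular period is 0), capped at the configured maximum.
      currentSleepMs = Math.min(maxBackoffMs, Math.max(1, currentSleepMs * 2));
    }
    return currentSleepMs;
  }
}
{code}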

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14370.000.patch, HDFS-14370.001.patch
>
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14034) Support getQuotaUsage API in WebHDFS

2019-07-31 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897450#comment-16897450
 ] 

Erik Krogen commented on HDFS-14034:


[~csun] I looked quickly at the branch-2 backport. Were there any significant 
conflicts from the 3.x line?

> Support getQuotaUsage API in WebHDFS
> 
>
> Key: HDFS-14034
> URL: https://issues.apache.org/jira/browse/HDFS-14034
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, webhdfs
>Reporter: Erik Krogen
>Assignee: Chao Sun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14034.000.patch, HDFS-14034.001.patch, 
> HDFS-14034.002.patch, HDFS-14034.004.patch
>
>
> HDFS-8898 added support for a new API, {{getQuotaUsage}} which can fetch 
> quota usage on a directory with significantly lower impact than the similar 
> {{getContentSummary}}. This JIRA is to track adding support for this API to 
> WebHDFS. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-07-31 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14370:
---
Attachment: HDFS-14370.002.patch

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14370.000.patch, HDFS-14370.001.patch, 
> HDFS-14370.002.patch
>
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-07-31 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897456#comment-16897456
 ] 

Erik Krogen commented on HDFS-14370:


Uploaded v002 patch to address the checkstyle issue.

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14370.000.patch, HDFS-14370.001.patch, 
> HDFS-14370.002.patch
>
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14462) WebHDFS throws "Error writing request body to server" instead of DSQuotaExceededException

2019-08-01 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898145#comment-16898145
 ] 

Erik Krogen commented on HDFS-14462:


{quote}With regards to item #6, try with resources throws the error in the try 
portion and suppresses the one in the close. When I don't throw a 
DSQuotaExceededException in the try, then the generic HTTPURLConnection error 
is the one which the test throws.
{quote}
We should be able to achieve this without the try-with-resources, something 
like:
{code:java}
try {
  // do a write which triggers the quota exception
  fail("should have thrown exception");
} catch (DSQuotaExceededException e) {
  // expected
} finally {
  out.close();
}
{code}
For the new log statement, I don't think it should be at "error" level, 
probably just "warn" (maybe even "info"?), since we can expect this to happen 
in normal circumstances like a quota exception. See this [helpful 
guide|https://en.wikipedia.org/wiki/Log4j#Log4j_log_levels] on log level 
semantics. It would also be nice to include some additional information, like:
{code:java}
LOG.warn("Write to output stream for file {} failed. Attempting to fetch the 
cause from the stream", fspath, e);
{code}
It's good to remember that many people who look at these logs won't bother to 
look at the place in the code where the log statement originates, so some 
context is helpful.
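
For context, combining this log statement with the {{validateResponse}} re-check 
from my earlier comment, the catch block could look roughly like the following; 
the surrounding variable names ({{out}}, {{buf}}, {{op}}, {{conn}}, {{fspath}}) 
are just placeholders for wherever this lands in the patch:
{code:java}
try {
  out.write(buf, off, len);
} catch (IOException e) {
  LOG.warn("Write to output stream for file {} failed. "
      + "Attempting to fetch the cause from the stream", fspath, e);
  // May surface the real cause from the server, e.g. DSQuotaExceededException.
  validateResponse(op, conn, true);
  // If validateResponse had nothing to report, don't swallow the original failure.
  throw e;
}
{code}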

> WebHDFS throws "Error writing request body to server" instead of 
> DSQuotaExceededException
> -
>
> Key: HDFS-14462
> URL: https://issues.apache.org/jira/browse/HDFS-14462
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.2.0, 2.9.2, 3.0.3, 2.8.5, 2.7.7, 3.1.2
>Reporter: Erik Krogen
>Assignee: Simbarashe Dzinamarira
>Priority: Major
> Attachments: HDFS-14462.001.patch, HDFS-14462.002.patch
>
>
> We noticed recently in our environment that, when writing data to HDFS via 
> WebHDFS, a quota exception is returned to the client as:
> {code}
> java.io.IOException: Error writing request body to server
> at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3536)
>  ~[?:1.8.0_172]
> at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3519)
>  ~[?:1.8.0_172]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[?:1.8.0_172]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[?:1.8.0_172]
> at java.io.FilterOutputStream.flush(FilterOutputStream.java:140) 
> ~[?:1.8.0_172]
> at java.io.DataOutputStream.flush(DataOutputStream.java:123) 
> ~[?:1.8.0_172]
> {code}
> It is entirely opaque to the user that this exception was caused because they 
> exceeded their quota. Yet in the DataNode logs:
> {code}
> 2019-04-24 02:13:09,639 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
> Exception
> org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota 
> of /foo/path/here is exceeded: quota =  B = X TB but diskspace 
> consumed =  B = X TB
> at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyStoragespaceQuota(DirectoryWithQuotaFeature.java:211)
> at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyQuota(DirectoryWithQuotaFeature.java:239)
> {code}
> This was on a 2.7.x cluster, but I verified that the same logic exists on 
> trunk. I believe we need to fix some of the logic within the 
> {{ExceptionHandler}} to add special handling for the quota exception.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14638) [Dynamometer] Fix scripts to refer to current build structure

2019-08-01 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14638:
---
Description: 
The scripts within the Dynamometer build dirs all refer to the old distribution 
structure with a single {{bin}} directory and a single {{lib}} directory. We 
need to update them to refer to the Hadoop-standard layout.

Also as pointed out by [~pingsutw]:
{quote}
Due to dynamometer rename to hadoop-dynamometer in hadoop-tools

but we still use old name of jar inside the scripts
{code}
"$hadoop_cmd" jar "${script_pwd}"/lib/dynamometer-infra-*.jar 
org.apache.hadoop.tools.dynamometer.Client "$@"
{code}
We should rename these jar inside the scripts
{quote}

  was:The scripts within the Dynamometer build dirs all refer to the old 
distribution structure with a single {{bin}} directory and a single {{lib}} 
directory. We need to update them to refer to the Hadoop-standard layout.


> [Dynamometer] Fix scripts to refer to current build structure
> -
>
> Key: HDFS-14638
> URL: https://issues.apache.org/jira/browse/HDFS-14638
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, test
>Reporter: Erik Krogen
>Priority: Major
>
> The scripts within the Dynamometer build dirs all refer to the old 
> distribution structure with a single {{bin}} directory and a single {{lib}} 
> directory. We need to update them to refer to the Hadoop-standard layout.
> Also as pointed out by [~pingsutw]:
> {quote}
> Due to dynamometer rename to hadoop-dynamometer in hadoop-tools
> but we still use old name of jar inside the scripts
> {code}
> "$hadoop_cmd" jar "${script_pwd}"/lib/dynamometer-infra-*.jar 
> org.apache.hadoop.tools.dynamometer.Client "$@"
> {code}
> We should rename these jar inside the scripts
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14281) Dynamometer Phase 2

2019-08-01 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898385#comment-16898385
 ] 

Erik Krogen commented on HDFS-14281:


Thanks [~pingsutw]! This is a great point. I'll make a note in HDFS-14638 which 
proposes to solve a similar issue.

> Dynamometer Phase 2
> ---
>
> Key: HDFS-14281
> URL: https://issues.apache.org/jira/browse/HDFS-14281
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: test, tools
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
>
> Phase 1: HDFS-12345
> This is the Phase 2 umbrella jira.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-08-02 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14370:
---
Attachment: HDFS-14370.003.patch

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14370.000.patch, HDFS-14370.001.patch, 
> HDFS-14370.002.patch, HDFS-14370.003.patch
>
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-08-02 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899090#comment-16899090
 ] 

Erik Krogen commented on HDFS-14370:


Thanks [~vagarychen]! The description did mention that negative values disable 
the backoff behavior, but I made it more clear that this is the default.

[~ayushtkn], I will commit this early next week unless I hear back from you, 
please let me know if you have any concerns with the v003 patch.

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14370.000.patch, HDFS-14370.001.patch, 
> HDFS-14370.002.patch, HDFS-14370.003.patch
>
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-08-02 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899090#comment-16899090
 ] 

Erik Krogen edited comment on HDFS-14370 at 8/2/19 5:58 PM:


Thanks [~vagarychen]! The description did mention that negative values disable 
the backoff behavior, but I made it more clear that this is the default in the 
v003 patch.

[~ayushtkn], I will commit this early next week unless I hear back from you, 
please let me know if you have any concerns with the v003 patch.


was (Author: xkrogen):
Thanks [~vagarychen]! The description did mention that negative values disable 
the backoff behavior, but I made it more clear that this is the default.

[~ayushtkn], I will commit this early next week unless I hear back from you, 
please let me know if you have any concerns with the v003 patch.

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14370.000.patch, HDFS-14370.001.patch, 
> HDFS-14370.002.patch, HDFS-14370.003.patch
>
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14462) WebHDFS throws "Error writing request body to server" instead of DSQuotaExceededException

2019-08-02 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14462:
---
   Resolution: Fixed
Fix Version/s: 3.1.3
   3.2.1
   3.3.0
   3.0.4
   2.10.0
   Status: Resolved  (was: Patch Available)

I just committed this to trunk, including backports down to branch-2. Thanks 
for the contribution [~simbadzina]!

> WebHDFS throws "Error writing request body to server" instead of 
> DSQuotaExceededException
> -
>
> Key: HDFS-14462
> URL: https://issues.apache.org/jira/browse/HDFS-14462
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.2.0, 2.9.2, 3.0.3, 2.8.5, 2.7.7, 3.1.2
>Reporter: Erik Krogen
>Assignee: Simbarashe Dzinamarira
>Priority: Major
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14462.001.patch, HDFS-14462.002.patch, 
> HDFS-14462.003.patch, HDFS-14462.004.patch
>
>
> We noticed recently in our environment that, when writing data to HDFS via 
> WebHDFS, a quota exception is returned to the client as:
> {code}
> java.io.IOException: Error writing request body to server
> at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3536)
>  ~[?:1.8.0_172]
> at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3519)
>  ~[?:1.8.0_172]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[?:1.8.0_172]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[?:1.8.0_172]
> at java.io.FilterOutputStream.flush(FilterOutputStream.java:140) 
> ~[?:1.8.0_172]
> at java.io.DataOutputStream.flush(DataOutputStream.java:123) 
> ~[?:1.8.0_172]
> {code}
> It is entirely opaque to the user that this exception was caused because they 
> exceeded their quota. Yet in the DataNode logs:
> {code}
> 2019-04-24 02:13:09,639 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
> Exception
> org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota 
> of /foo/path/here is exceeded: quota =  B = X TB but diskspace 
> consumed =  B = X TB
> at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyStoragespaceQuota(DirectoryWithQuotaFeature.java:211)
> at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyQuota(DirectoryWithQuotaFeature.java:239)
> {code}
> This was on a 2.7.x cluster, but I verified that the same logic exists on 
> trunk. I believe we need to fix some of the logic within the 
> {{ExceptionHandler}} to add special handling for the quota exception.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14462) WebHDFS throws "Error writing request body to server" instead of DSQuotaExceededException

2019-08-02 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899147#comment-16899147
 ] 

Erik Krogen edited comment on HDFS-14462 at 8/2/19 7:05 PM:


TestLargeBlockReport is failing consistently on trunk and 
TestUnderReplicatedBlocks is notoriously flaky.

I just committed this to trunk, including backports down to branch-2. Thanks 
for the contribution [~simbadzina]!


was (Author: xkrogen):
I just committed this to trunk, including backports down to branch-2. Thanks 
for the contribution [~simbadzina]!

> WebHDFS throws "Error writing request body to server" instead of 
> DSQuotaExceededException
> -
>
> Key: HDFS-14462
> URL: https://issues.apache.org/jira/browse/HDFS-14462
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.2.0, 2.9.2, 3.0.3, 2.8.5, 2.7.7, 3.1.2
>Reporter: Erik Krogen
>Assignee: Simbarashe Dzinamarira
>Priority: Major
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14462.001.patch, HDFS-14462.002.patch, 
> HDFS-14462.003.patch, HDFS-14462.004.patch
>
>
> We noticed recently in our environment that, when writing data to HDFS via 
> WebHDFS, a quota exception is returned to the client as:
> {code}
> java.io.IOException: Error writing request body to server
> at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3536)
>  ~[?:1.8.0_172]
> at 
> sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3519)
>  ~[?:1.8.0_172]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[?:1.8.0_172]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[?:1.8.0_172]
> at java.io.FilterOutputStream.flush(FilterOutputStream.java:140) 
> ~[?:1.8.0_172]
> at java.io.DataOutputStream.flush(DataOutputStream.java:123) 
> ~[?:1.8.0_172]
> {code}
> It is entirely opaque to the user that this exception was caused because they 
> exceeded their quota. Yet in the DataNode logs:
> {code}
> 2019-04-24 02:13:09,639 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
> Exception
> org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota 
> of /foo/path/here is exceeded: quota =  B = X TB but diskspace 
> consumed =  B = X TB
> at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyStoragespaceQuota(DirectoryWithQuotaFeature.java:211)
> at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyQuota(DirectoryWithQuotaFeature.java:239)
> {code}
> This was on a 2.7.x cluster, but I verified that the same logic exists on 
> trunk. I believe we need to fix some of the logic within the 
> {{ExceptionHandler}} to add special handling for the quota exception.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14513) FSImage which is saving should be clean while NameNode shutdown

2019-08-02 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899206#comment-16899206
 ] 

Erik Krogen commented on HDFS-14513:


[~elgoiri] [~hexiaoqiao] do you think this is a good candidate for older 
release lines? It seems it's a fairly harmless change?

> FSImage which is saving should be clean while NameNode shutdown
> ---
>
> Key: HDFS-14513
> URL: https://issues.apache.org/jira/browse/HDFS-14513
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14513.001.patch, HDFS-14513.002.patch, 
> HDFS-14513.003.patch, HDFS-14513.004.patch, HDFS-14513.005.patch, 
> HDFS-14513.006.patch, HDFS-14513.007.patch
>
>
> Checkpointer/FSImageSaver is regular tasks and dump NameNode meta to disk, at 
> most per hour by default. If it receive some command (e.g. transition to 
> active in HA mode) it will cancel checkpoint and delete tmp files using 
> {{FSImage#deleteCancelledCheckpoint}}. However if NameNode shutdown when 
> checkpoint, the tmp files will not be cleaned anymore. 
> Consider there are 500m inodes+blocks, it could cost 5~10min to finish once 
> checkpoint, if we shutdown NameNode during checkpointing, fsimage checkpoint 
> file will never be cleaned, after long time, there could be many useless 
> checkpoint files. So I propose that we should add hook to clean that when 
> shutdown.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14694) Call recoverLease on DFSOutputStream close exception

2019-08-06 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901181#comment-16901181
 ] 

Erik Krogen commented on HDFS-14694:


I don't have cycles for a detailed review, but the overall idea seems solid to 
me. Thanks for opening this up [~zhangchen]!

> Call recoverLease on DFSOutputStream close exception
> 
>
> Key: HDFS-14694
> URL: https://issues.apache.org/jira/browse/HDFS-14694
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Chen Zhang
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14694.001.patch, HDFS-14694.002.patch
>
>
> HDFS uses file-lease to manage opened files, when a file is not closed 
> normally, NN will recover lease automatically after hard limit exceeded. But 
> for a long running service(e.g. HBase), the hdfs-client will never die and NN 
> don't have any chances to recover the file.
> Usually client program needs to handle exceptions by themself to avoid this 
> condition(e.g. HBase automatically call recover lease for files that not 
> closed normally), but in our experience, most services (in our company) don't 
> process this condition properly, which will cause lots of files in abnormal 
> status or even data loss.
> This Jira propose to add a feature that call recoverLease operation 
> automatically when DFSOutputSteam close encounters exception. It should be 
> disabled by default, but when somebody builds a long-running service based on 
> HDFS, they can enable this option.
> We've add this feature to our internal Hadoop distribution for more than 3 
> years, it's quite useful according our experience.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14697) Backport HDFS-14513 to branch-2

2019-08-06 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901229#comment-16901229
 ] 

Erik Krogen commented on HDFS-14697:


It looks like your build ran into some resource issues causing the failed 
tests. I ran them locally and they all succeeded. The ASF warnings are just 
from the error files produced.

I just committed this to branch-2. Thanks [~hexiaoqiao]!

> Backport HDFS-14513 to branch-2
> ---
>
> Key: HDFS-14697
> URL: https://issues.apache.org/jira/browse/HDFS-14697
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Minor
> Attachments: HDFS-14697.branch-2.001.patch
>
>
> Backport HDFS-14513 "FSImage which is saving should be clean while NameNode 
> shutdown." to branch-2.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14697) Backport HDFS-14513 to branch-2

2019-08-06 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14697:
---
   Resolution: Fixed
Fix Version/s: 2.10.0
   Status: Resolved  (was: Patch Available)

> Backport HDFS-14513 to branch-2
> ---
>
> Key: HDFS-14697
> URL: https://issues.apache.org/jira/browse/HDFS-14697
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Minor
> Fix For: 2.10.0
>
> Attachments: HDFS-14697.branch-2.001.patch
>
>
> Backport HDFS-14513 "FSImage which is saving should be clean while NameNode 
> shutdown." to branch-2.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14513) FSImage which is saving should be clean while NameNode shutdown

2019-08-06 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14513:
---
Fix Version/s: 3.1.3
   3.2.1
   3.0.4
   2.10.0

> FSImage which is saving should be clean while NameNode shutdown
> ---
>
> Key: HDFS-14513
> URL: https://issues.apache.org/jira/browse/HDFS-14513
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14513.001.patch, HDFS-14513.002.patch, 
> HDFS-14513.003.patch, HDFS-14513.004.patch, HDFS-14513.005.patch, 
> HDFS-14513.006.patch, HDFS-14513.007.patch
>
>
> Checkpointer/FSImageSaver is regular tasks and dump NameNode meta to disk, at 
> most per hour by default. If it receive some command (e.g. transition to 
> active in HA mode) it will cancel checkpoint and delete tmp files using 
> {{FSImage#deleteCancelledCheckpoint}}. However if NameNode shutdown when 
> checkpoint, the tmp files will not be cleaned anymore. 
> Consider there are 500m inodes+blocks, it could cost 5~10min to finish once 
> checkpoint, if we shutdown NameNode during checkpointing, fsimage checkpoint 
> file will never be cleaned, after long time, there could be many useless 
> checkpoint files. So I propose that we should add hook to clean that when 
> shutdown.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14697) Backport HDFS-14513 to branch-2

2019-08-06 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14697:
---
Fix Version/s: 3.1.3
   3.0.4

> Backport HDFS-14513 to branch-2
> ---
>
> Key: HDFS-14697
> URL: https://issues.apache.org/jira/browse/HDFS-14697
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Minor
> Fix For: 2.10.0, 3.0.4, 3.1.3
>
> Attachments: HDFS-14697.branch-2.001.patch
>
>
> Backport HDFS-14513 "FSImage which is saving should be clean while NameNode 
> shutdown." to branch-2.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14697) Backport HDFS-14513 to branch-2

2019-08-06 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901229#comment-16901229
 ] 

Erik Krogen edited comment on HDFS-14697 at 8/6/19 4:51 PM:


It looks like your build ran into some resource issues causing the failed 
tests. I ran them locally and they all succeeded. The ASF warnings are just 
from the error files produced.

I just committed this to branch-2 (and used this for branch-3.1 and branch-3.0 
backports). Thanks [~hexiaoqiao]!


was (Author: xkrogen):
It looks like your build ran into some resource issues causing the failed 
tests. I ran them locally and they all succeeded. The ASF warnings are just 
from the error files produced.

I just committed this to branch-2. Thanks [~hexiaoqiao]!

> Backport HDFS-14513 to branch-2
> ---
>
> Key: HDFS-14697
> URL: https://issues.apache.org/jira/browse/HDFS-14697
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Minor
> Fix For: 2.10.0, 3.0.4, 3.1.3
>
> Attachments: HDFS-14697.branch-2.001.patch
>
>
> Backport HDFS-14513 "FSImage which is saving should be clean while NameNode 
> shutdown." to branch-2.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14513) FSImage which is saving should be clean while NameNode shutdown

2019-08-06 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901267#comment-16901267
 ] 

Erik Krogen commented on HDFS-14513:


I backported this to branch-3.2, and used the branch-2 patch from HDFS-14697 
for branch-3.1 and branch-3.0 backports. So this should now be everywhere from 
trunk down to branch-2.

> FSImage which is saving should be clean while NameNode shutdown
> ---
>
> Key: HDFS-14513
> URL: https://issues.apache.org/jira/browse/HDFS-14513
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14513.001.patch, HDFS-14513.002.patch, 
> HDFS-14513.003.patch, HDFS-14513.004.patch, HDFS-14513.005.patch, 
> HDFS-14513.006.patch, HDFS-14513.007.patch
>
>
> Checkpointer/FSImageSaver is regular tasks and dump NameNode meta to disk, at 
> most per hour by default. If it receive some command (e.g. transition to 
> active in HA mode) it will cancel checkpoint and delete tmp files using 
> {{FSImage#deleteCancelledCheckpoint}}. However if NameNode shutdown when 
> checkpoint, the tmp files will not be cleaned anymore. 
> Consider there are 500m inodes+blocks, it could cost 5~10min to finish once 
> checkpoint, if we shutdown NameNode during checkpointing, fsimage checkpoint 
> file will never be cleaned, after long time, there could be many useless 
> checkpoint files. So I propose that we should add hook to clean that when 
> shutdown.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-08-06 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14370:
---
Attachment: HDFS-14370.004.patch

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14370.000.patch, HDFS-14370.001.patch, 
> HDFS-14370.002.patch, HDFS-14370.003.patch, HDFS-14370.004.patch
>
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-08-06 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901273#comment-16901273
 ] 

Erik Krogen commented on HDFS-14370:


Thanks for looking [~ayushtkn]. I've updated the documentation and changed the 
default to 0 in v004.

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14370.000.patch, HDFS-14370.001.patch, 
> HDFS-14370.002.patch, HDFS-14370.003.patch, HDFS-14370.004.patch
>
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14631) The DirectoryScanner doesn't fix the wrongly placed replica.

2019-08-06 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901463#comment-16901463
 ] 

Erik Krogen commented on HDFS-14631:


[~LiJinglun] Do you know if this is also relevant to the 2.x line? If so, 
should we put together a branch-2 backport also? 

> The DirectoryScanner doesn't fix the wrongly placed replica.
> 
>
> Key: HDFS-14631
> URL: https://issues.apache.org/jira/browse/HDFS-14631
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14631.001.patch, HDFS-14631.002.patch, 
> HDFS-14631.003.patch, HDFS-14631.004.patch
>
>
> When DirectoryScanner scans block files, if the block refers to the block 
> file does not exist the DirectoryScanner will update the block based on the 
> replica file found on the disk. See FsDatasetImpl#checkAndUpdate.
>  
> {code:java}
> /*
> * Block exists in volumeMap and the block file exists on the disk
> */
> // Compare block files
> if (memBlockInfo.blockDataExists()) {
>   ...
> } else {
>   // Block refers to a block file that does not exist.
>   // Update the block with the file found on the disk. Since the block
>   // file and metadata file are found as a pair on the disk, update
>   // the block based on the metadata file found on the disk
>   LOG.warn("Block file in replica "
>   + memBlockInfo.getBlockURI()
>   + " does not exist. Updating it to the file found during scan "
>   + diskFile.getAbsolutePath());
>   memBlockInfo.updateWithReplica(
>   StorageLocation.parse(diskFile.toString()));
>   LOG.warn("Updating generation stamp for block " + blockId
>   + " from " + memBlockInfo.getGenerationStamp() + " to " + diskGS);
>   memBlockInfo.setGenerationStamp(diskGS);
> }
> {code}
> But the DirectoryScanner doesn't really fix it because in 
> LocalReplica#parseBaseDir() the 'subdir' are ignored.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14662) Document the usage of the new Balancer "asService" parameter

2019-08-07 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902203#comment-16902203
 ] 

Erik Krogen commented on HDFS-14662:


+1 from me, sorry for the delay! Thanks [~zhangchen]

> Document the usage of the new Balancer "asService" parameter
> 
>
> Key: HDFS-14662
> URL: https://issues.apache.org/jira/browse/HDFS-14662
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Chen Zhang
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14662.001.patch, HDFS-14662.002.patch, 
> HDFS-14662.003.patch
>
>
> see HDFS-13783, this jira add document for how to run balancer as a long 
> service



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14631) The DirectoryScanner doesn't fix the wrongly placed replica.

2019-08-07 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14631:
---
Fix Version/s: (was: 2.9.3)

> The DirectoryScanner doesn't fix the wrongly placed replica.
> 
>
> Key: HDFS-14631
> URL: https://issues.apache.org/jira/browse/HDFS-14631
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14631-branch-2.9.001.patch, HDFS-14631.001.patch, 
> HDFS-14631.002.patch, HDFS-14631.003.patch, HDFS-14631.004.patch
>
>
> When DirectoryScanner scans block files, if the block refers to the block 
> file does not exist the DirectoryScanner will update the block based on the 
> replica file found on the disk. See FsDatasetImpl#checkAndUpdate.
>  
> {code:java}
> /*
> * Block exists in volumeMap and the block file exists on the disk
> */
> // Compare block files
> if (memBlockInfo.blockDataExists()) {
>   ...
> } else {
>   // Block refers to a block file that does not exist.
>   // Update the block with the file found on the disk. Since the block
>   // file and metadata file are found as a pair on the disk, update
>   // the block based on the metadata file found on the disk
>   LOG.warn("Block file in replica "
>   + memBlockInfo.getBlockURI()
>   + " does not exist. Updating it to the file found during scan "
>   + diskFile.getAbsolutePath());
>   memBlockInfo.updateWithReplica(
>   StorageLocation.parse(diskFile.toString()));
>   LOG.warn("Updating generation stamp for block " + blockId
>   + " from " + memBlockInfo.getGenerationStamp() + " to " + diskGS);
>   memBlockInfo.setGenerationStamp(diskGS);
> }
> {code}
> But the DirectoryScanner doesn't really fix it because in 
> LocalReplica#parseBaseDir() the 'subdir' are ignored.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-08-07 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14370:
---
Attachment: HDFS-14370.005.patch

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14370.000.patch, HDFS-14370.001.patch, 
> HDFS-14370.002.patch, HDFS-14370.003.patch, HDFS-14370.004.patch, 
> HDFS-14370.005.patch
>
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.
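
A minimal sketch of the proposed behavior (constants and method names are
assumptions, not the committed EditLogTailer change): back off when a tailing
RPC returns no transactions, and reset to the fast period once edits arrive
again.

{code:java}
// Sketch only: fetchEditsFromJournals() and the constants are placeholders.
public class TailBackoffSketch {
  public static void main(String[] args) throws InterruptedException {
    final long initialSleepMs = 10;    // assumed base tailing period
    final long maxSleepMs = 5_000;     // assumed cap on the backoff
    long sleepMs = initialSleepMs;
    for (int i = 0; i < 10; i++) {
      long editsLoaded = fetchEditsFromJournals();   // stand-in for the tailing RPC
      if (editsLoaded > 0) {
        sleepMs = initialSleepMs;                    // real work: back to fast tailing
      } else {
        sleepMs = Math.min(sleepMs * 2, maxSleepMs); // empty response: wait longer
      }
      System.out.println("loaded " + editsLoaded + " edits, sleeping " + sleepMs + " ms");
      Thread.sleep(sleepMs);
    }
  }

  private static long fetchEditsFromJournals() {
    return 0; // placeholder: a real tailer would ask the JournalNodes for new edits
  }
}
{code}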



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-08-07 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902228#comment-16902228
 ] 

Erik Krogen commented on HDFS-14370:


Thanks [~ayushtkn]. Addressed this last comment in v005 and went ahead with the 
commit based on the three +1s in this thread. I committed to trunk and 
backported to the 3.x line. I think this needs to land in branch-2 as well once 
HDFS-14204 is completed, so I will leave this open for now.

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14370.000.patch, HDFS-14370.001.patch, 
> HDFS-14370.002.patch, HDFS-14370.003.patch, HDFS-14370.004.patch, 
> HDFS-14370.005.patch
>
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-08-07 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14370:
---
Fix Version/s: 3.1.3
   3.2.1
   3.3.0
   3.0.4

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14370.000.patch, HDFS-14370.001.patch, 
> HDFS-14370.002.patch, HDFS-14370.003.patch, HDFS-14370.004.patch, 
> HDFS-14370.005.patch
>
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDFS-14370) Edit log tailing fast-path should allow for backoff

2019-08-07 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14370:
---
Comment: was deleted

(was: | (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} HDFS-14370 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-14370 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976955/HDFS-14370.005.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27441/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.

)

> Edit log tailing fast-path should allow for backoff
> ---
>
> Key: HDFS-14370
> URL: https://issues.apache.org/jira/browse/HDFS-14370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, qjm
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14370.000.patch, HDFS-14370.001.patch, 
> HDFS-14370.002.patch, HDFS-14370.003.patch, HDFS-14370.004.patch, 
> HDFS-14370.005.patch
>
>
> As part of HDFS-13150, in-progress edit log tailing was changed to use an 
> RPC-based mechanism, thus allowing the edit log tailing frequency to be 
> turned way down, and allowing standby/observer NameNodes to be only a few 
> milliseconds stale as compared to the Active NameNode.
> When there is a high volume of transactions on the system, each RPC fetches 
> transactions and takes some time to process them, self-rate-limiting how 
> frequently an RPC is submitted. In a lightly loaded cluster, however, most of 
> these RPCs return an empty set of transactions, consuming a high 
> (de)serialization overhead for very little benefit. This was reported by 
> [~jojochuang] in HDFS-14276 and I have also seen it on a test cluster where 
> the SbNN was submitting 8000 RPCs per second that returned empty.
> I propose we add some sort of backoff to the tailing, so that if an empty 
> response is received, it will wait a longer period of time before submitting 
> a new RPC.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14631) The DirectoryScanner doesn't fix the wrongly placed replica.

2019-08-07 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902464#comment-16902464
 ] 

Erik Krogen commented on HDFS-14631:


{{TestDirectoryStructure.testScanDirectoryStructureWarn}} and 
{{TestSafeMode.testInitializeReplQueuesEarly}} are both failing for me with or 
without this patch. All of the other tests pass locally.

I just committed v001 to branch-2 and branch-2.9. Thanks [~LiJinglun]!

> The DirectoryScanner doesn't fix the wrongly placed replica.
> 
>
> Key: HDFS-14631
> URL: https://issues.apache.org/jira/browse/HDFS-14631
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14631-branch-2.9.001.patch, HDFS-14631.001.patch, 
> HDFS-14631.002.patch, HDFS-14631.003.patch, HDFS-14631.004.patch
>
>
> When DirectoryScanner scans block files, if the block refers to a block 
> file that does not exist, the DirectoryScanner will update the block based on 
> the replica file found on the disk. See FsDatasetImpl#checkAndUpdate.
>  
> {code:java}
> /*
> * Block exists in volumeMap and the block file exists on the disk
> */
> // Compare block files
> if (memBlockInfo.blockDataExists()) {
>   ...
> } else {
>   // Block refers to a block file that does not exist.
>   // Update the block with the file found on the disk. Since the block
>   // file and metadata file are found as a pair on the disk, update
>   // the block based on the metadata file found on the disk
>   LOG.warn("Block file in replica "
>   + memBlockInfo.getBlockURI()
>   + " does not exist. Updating it to the file found during scan "
>   + diskFile.getAbsolutePath());
>   memBlockInfo.updateWithReplica(
>   StorageLocation.parse(diskFile.toString()));
>   LOG.warn("Updating generation stamp for block " + blockId
>   + " from " + memBlockInfo.getGenerationStamp() + " to " + diskGS);
>   memBlockInfo.setGenerationStamp(diskGS);
> }
> {code}
> But the DirectoryScanner doesn't really fix it because in 
> LocalReplica#parseBaseDir() the 'subdir' is ignored.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14631) The DirectoryScanner doesn't fix the wrongly placed replica.

2019-08-07 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14631:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> The DirectoryScanner doesn't fix the wrongly placed replica.
> 
>
> Key: HDFS-14631
> URL: https://issues.apache.org/jira/browse/HDFS-14631
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 2.10.0, 3.3.0, 3.2.1, 2.9.3, 3.1.3
>
> Attachments: HDFS-14631-branch-2.9.001.patch, HDFS-14631.001.patch, 
> HDFS-14631.002.patch, HDFS-14631.003.patch, HDFS-14631.004.patch
>
>
> When DirectoryScanner scans block files, if the block refers to a block 
> file that does not exist, the DirectoryScanner will update the block based on 
> the replica file found on the disk. See FsDatasetImpl#checkAndUpdate.
>  
> {code:java}
> /*
> * Block exists in volumeMap and the block file exists on the disk
> */
> // Compare block files
> if (memBlockInfo.blockDataExists()) {
>   ...
> } else {
>   // Block refers to a block file that does not exist.
>   // Update the block with the file found on the disk. Since the block
>   // file and metadata file are found as a pair on the disk, update
>   // the block based on the metadata file found on the disk
>   LOG.warn("Block file in replica "
>   + memBlockInfo.getBlockURI()
>   + " does not exist. Updating it to the file found during scan "
>   + diskFile.getAbsolutePath());
>   memBlockInfo.updateWithReplica(
>   StorageLocation.parse(diskFile.toString()));
>   LOG.warn("Updating generation stamp for block " + blockId
>   + " from " + memBlockInfo.getGenerationStamp() + " to " + diskGS);
>   memBlockInfo.setGenerationStamp(diskGS);
> }
> {code}
> But the DirectoryScanner doesn't really fix it because in 
> LocalReplica#parseBaseDir() the 'subdir' is ignored.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14631) The DirectoryScanner doesn't fix the wrongly placed replica.

2019-08-07 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14631:
---
Fix Version/s: 2.9.3
   2.10.0

> The DirectoryScanner doesn't fix the wrongly placed replica.
> 
>
> Key: HDFS-14631
> URL: https://issues.apache.org/jira/browse/HDFS-14631
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 2.10.0, 3.3.0, 3.2.1, 2.9.3, 3.1.3
>
> Attachments: HDFS-14631-branch-2.9.001.patch, HDFS-14631.001.patch, 
> HDFS-14631.002.patch, HDFS-14631.003.patch, HDFS-14631.004.patch
>
>
> When DirectoryScanner scans block files, if the block refers to a block 
> file that does not exist, the DirectoryScanner will update the block based on 
> the replica file found on the disk. See FsDatasetImpl#checkAndUpdate.
>  
> {code:java}
> /*
> * Block exists in volumeMap and the block file exists on the disk
> */
> // Compare block files
> if (memBlockInfo.blockDataExists()) {
>   ...
> } else {
>   // Block refers to a block file that does not exist.
>   // Update the block with the file found on the disk. Since the block
>   // file and metadata file are found as a pair on the disk, update
>   // the block based on the metadata file found on the disk
>   LOG.warn("Block file in replica "
>   + memBlockInfo.getBlockURI()
>   + " does not exist. Updating it to the file found during scan "
>   + diskFile.getAbsolutePath());
>   memBlockInfo.updateWithReplica(
>   StorageLocation.parse(diskFile.toString()));
>   LOG.warn("Updating generation stamp for block " + blockId
>   + " from " + memBlockInfo.getGenerationStamp() + " to " + diskGS);
>   memBlockInfo.setGenerationStamp(diskGS);
> }
> {code}
> But the DirectoryScanner doesn't really fix it because in 
> LocalReplica#parseBaseDir() the 'subdir' is ignored.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12914) Block report leases cause missing blocks until next report

2019-08-07 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902478#comment-16902478
 ] 

Erik Krogen commented on HDFS-12914:


[~jojochuang] I think this is breaking 
{{TestSafeMode.testInitializeReplQueuesEarly}} in branch-2 and branch-2.9. It 
seems to be failing consistently for me after this patch and succeeding before.

> Block report leases cause missing blocks until next report
> --
>
> Key: HDFS-12914
> URL: https://issues.apache.org/jira/browse/HDFS-12914
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.2
>Reporter: Daryn Sharp
>Assignee: Santosh Marella
>Priority: Critical
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 2.9.3, 3.1.3
>
> Attachments: HDFS-12914-branch-2.001.patch, 
> HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, 
> HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, 
> HDFS-12914.009.patch, HDFS-12914.branch-2.000.patch, 
> HDFS-12914.branch-2.001.patch, HDFS-12914.branch-2.002.patch, 
> HDFS-12914.branch-2.8.001.patch, HDFS-12914.branch-2.8.002.patch, 
> HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, 
> HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, 
> HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} will reject FBRs from DNs for 
> conditions such as "unknown datanode", "not in pending set", "lease has 
> expired", wrong lease id, etc. Lease rejection does not throw an exception. 
> It returns false, which bubbles up to {{NameNodeRpcServer#blockReport}} and is 
> interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected due to an invalid lease becomes 
> active with _no blocks_. A replication storm ensues, possibly causing DNs to 
> temporarily go dead (HDFS-12645), leading to more FBR lease rejections on 
> re-registration. The cluster will have many "missing blocks" until the DN's 
> next FBR is sent and/or forced.
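
A condensed sketch of the failure mode described above (the method shape below
is an assumption, not the real BlockManager/NameNodeRpcServer code): because
rejection is signaled only by a boolean, the RPC layer cannot distinguish
"report ignored" from "report processed".

{code:java}
// Sketch only: an illustrative stand-in for the block report handling path.
public class LeaseRejectionSketch {
  /** Returns "noStaleStorages"; a rejected lease also surfaces as false. */
  static boolean processReport(boolean leaseValid) {
    if (!leaseValid) {
      return false;   // lease rejected: no exception, report silently dropped
    }
    // ... apply the full block report here ...
    return false;     // report applied and no stale storages: also false
  }

  public static void main(String[] args) {
    // The caller sees the same value in both cases, so a re-registering DN whose
    // FBR was rejected looks exactly like one whose report was fully processed.
    System.out.println("rejected : " + processReport(false));
    System.out.println("applied  : " + processReport(true));
  }
}
{code}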



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14034) Support getQuotaUsage API in WebHDFS

2019-08-08 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14034:
---
Fix Version/s: 3.2.1

> Support getQuotaUsage API in WebHDFS
> 
>
> Key: HDFS-14034
> URL: https://issues.apache.org/jira/browse/HDFS-14034
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, webhdfs
>Reporter: Erik Krogen
>Assignee: Chao Sun
>Priority: Major
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-14034-branch-2.000.patch, 
> HDFS-14034-branch-2.001.patch, HDFS-14034.000.patch, HDFS-14034.001.patch, 
> HDFS-14034.002.patch, HDFS-14034.004.patch
>
>
> HDFS-8898 added support for a new API, {{getQuotaUsage}} which can fetch 
> quota usage on a directory with significantly lower impact than the similar 
> {{getContentSummary}}. This JIRA is to track adding support for this API to 
> WebHDFS. 
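
A short usage sketch of the API this JIRA exposes over WebHDFS (the host, port,
and path below are placeholders; it assumes the standard
{{FileSystem#getQuotaUsage}} call introduced by HDFS-8898):

{code:java}
// Sketch only: namenode.example.com:9870 and /user/project are placeholders.
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.QuotaUsage;

public class WebHdfsQuotaUsageSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // With this JIRA in place, the webhdfs:// FileSystem can serve the call
    // directly instead of falling back to the heavier getContentSummary.
    FileSystem fs = FileSystem.get(
        URI.create("webhdfs://namenode.example.com:9870"), conf);
    QuotaUsage usage = fs.getQuotaUsage(new Path("/user/project"));
    System.out.println("namespace quota : " + usage.getQuota());
    System.out.println("names used      : " + usage.getFileAndDirectoryCount());
    System.out.println("space quota     : " + usage.getSpaceQuota());
    System.out.println("space consumed  : " + usage.getSpaceConsumed());
  }
}
{code}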



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14034) Support getQuotaUsage API in WebHDFS

2019-08-08 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14034:
---
Fix Version/s: 2.10.0

> Support getQuotaUsage API in WebHDFS
> 
>
> Key: HDFS-14034
> URL: https://issues.apache.org/jira/browse/HDFS-14034
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, webhdfs
>Reporter: Erik Krogen
>Assignee: Chao Sun
>Priority: Major
> Fix For: 2.10.0, 3.3.0, 3.2.1
>
> Attachments: HDFS-14034-branch-2.000.patch, 
> HDFS-14034-branch-2.001.patch, HDFS-14034.000.patch, HDFS-14034.001.patch, 
> HDFS-14034.002.patch, HDFS-14034.004.patch
>
>
> HDFS-8898 added support for a new API, {{getQuotaUsage}} which can fetch 
> quota usage on a directory with significantly lower impact than the similar 
> {{getContentSummary}}. This JIRA is to track adding support for this API to 
> WebHDFS. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14034) Support getQuotaUsage API in WebHDFS

2019-08-08 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14034:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Support getQuotaUsage API in WebHDFS
> 
>
> Key: HDFS-14034
> URL: https://issues.apache.org/jira/browse/HDFS-14034
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, webhdfs
>Reporter: Erik Krogen
>Assignee: Chao Sun
>Priority: Major
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14034-branch-2.000.patch, 
> HDFS-14034-branch-2.001.patch, HDFS-14034.000.patch, HDFS-14034.001.patch, 
> HDFS-14034.002.patch, HDFS-14034.004.patch
>
>
> HDFS-8898 added support for a new API, {{getQuotaUsage}} which can fetch 
> quota usage on a directory with significantly lower impact than the similar 
> {{getContentSummary}}. This JIRA is to track adding support for this API to 
> WebHDFS. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14034) Support getQuotaUsage API in WebHDFS

2019-08-08 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903207#comment-16903207
 ] 

Erik Krogen commented on HDFS-14034:


I cherry-picked this to branch-3.2, branch-3.1, and branch-3.0. There were 
minor conflicts going from 3.2 to 3.1 due to logging changes (commons-logging 
to SLF4J). I also committed your branch-2 patch.

Thanks for the contribution [~csun]!

> Support getQuotaUsage API in WebHDFS
> 
>
> Key: HDFS-14034
> URL: https://issues.apache.org/jira/browse/HDFS-14034
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, webhdfs
>Reporter: Erik Krogen
>Assignee: Chao Sun
>Priority: Major
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14034-branch-2.000.patch, 
> HDFS-14034-branch-2.001.patch, HDFS-14034.000.patch, HDFS-14034.001.patch, 
> HDFS-14034.002.patch, HDFS-14034.004.patch
>
>
> HDFS-8898 added support for a new API, {{getQuotaUsage}} which can fetch 
> quota usage on a directory with significantly lower impact than the similar 
> {{getContentSummary}}. This JIRA is to track adding support for this API to 
> WebHDFS. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14034) Support getQuotaUsage API in WebHDFS

2019-08-08 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14034:
---
Fix Version/s: 3.1.3
   3.0.4

> Support getQuotaUsage API in WebHDFS
> 
>
> Key: HDFS-14034
> URL: https://issues.apache.org/jira/browse/HDFS-14034
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, webhdfs
>Reporter: Erik Krogen
>Assignee: Chao Sun
>Priority: Major
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14034-branch-2.000.patch, 
> HDFS-14034-branch-2.001.patch, HDFS-14034.000.patch, HDFS-14034.001.patch, 
> HDFS-14034.002.patch, HDFS-14034.004.patch
>
>
> HDFS-8898 added support for a new API, {{getQuotaUsage}} which can fetch 
> quota usage on a directory with significantly lower impact than the similar 
> {{getContentSummary}}. This JIRA is to track adding support for this API to 
> WebHDFS. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14272) [SBN read] ObserverReadProxyProvider should sync with active txnID on startup

2019-03-01 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782064#comment-16782064
 ] 

Erik Krogen commented on HDFS-14272:


Based on the +1 from [~shv] and the earlier review from [~csun], I just 
committed this to trunk. Thanks all!

> [SBN read] ObserverReadProxyProvider should sync with active txnID on startup
> -
>
> Key: HDFS-14272
> URL: https://issues.apache.org/jira/browse/HDFS-14272
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
> Environment: CDH6.1 (Hadoop 3.0.x) + Consistency Reads from Standby + 
> SSL + Kerberos + RPC encryption
>Reporter: Wei-Chiu Chuang
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14272.000.patch, HDFS-14272.001.patch, 
> HDFS-14272.002.patch
>
>
> It is typical for integration tests to create some files and then check their 
> existence. For example, like the following simple bash script:
> {code:java}
> # hdfs dfs -touchz /tmp/abc
> # hdfs dfs -ls /tmp/abc
> {code}
> The test executes the HDFS bash commands sequentially, but it may fail with 
> Consistent Standby Read because the -ls does not find the file.
> Analysis: the second bash command, while launched sequentially after the 
> first one, is not aware of the state id returned from the first bash command. 
> So the ObserverNode wouldn't wait for the edits to get propagated, and thus 
> the -ls fails.
> I've got a cluster where the Observer has tens of seconds of RPC latency, and 
> this becomes very annoying. (I am still trying to figure out why this 
> Observer has such a long RPC latency. But that's another story.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14272) [SBN read] ObserverReadProxyProvider should sync with active txnID on startup

2019-03-01 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14272:
---
   Resolution: Fixed
Fix Version/s: 3.3.0
   Status: Resolved  (was: Patch Available)

> [SBN read] ObserverReadProxyProvider should sync with active txnID on startup
> -
>
> Key: HDFS-14272
> URL: https://issues.apache.org/jira/browse/HDFS-14272
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
> Environment: CDH6.1 (Hadoop 3.0.x) + Consistency Reads from Standby + 
> SSL + Kerberos + RPC encryption
>Reporter: Wei-Chiu Chuang
>Assignee: Erik Krogen
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14272.000.patch, HDFS-14272.001.patch, 
> HDFS-14272.002.patch
>
>
> It is typical for integration tests to create some files and then check their 
> existence. For example, like the following simple bash script:
> {code:java}
> # hdfs dfs -touchz /tmp/abc
> # hdfs dfs -ls /tmp/abc
> {code}
> The test executes the HDFS bash commands sequentially, but it may fail with 
> Consistent Standby Read because the -ls does not find the file.
> Analysis: the second bash command, while launched sequentially after the 
> first one, is not aware of the state id returned from the first bash command. 
> So the ObserverNode wouldn't wait for the edits to get propagated, and thus 
> the -ls fails.
> I've got a cluster where the Observer has tens of seconds of RPC latency, and 
> this becomes very annoying. (I am still trying to figure out why this 
> Observer has such a long RPC latency. But that's another story.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-12345) Scale testing HDFS NameNode with real metadata and workloads (Dynamometer)

2019-03-05 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen reassigned HDFS-12345:
--

Assignee: Siyao Meng  (was: Erik Krogen)

> Scale testing HDFS NameNode with real metadata and workloads (Dynamometer)
> --
>
> Key: HDFS-12345
> URL: https://issues.apache.org/jira/browse/HDFS-12345
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode, test
>Reporter: Zhe Zhang
>Assignee: Siyao Meng
>Priority: Major
> Attachments: HDFS-12345.000.patch, HDFS-12345.001.patch, 
> HDFS-12345.002.patch
>
>
> Dynamometer has now been open sourced on our [GitHub 
> page|https://github.com/linkedin/dynamometer]. Read more at our [recent blog 
> post|https://engineering.linkedin.com/blog/2018/02/dynamometer--scale-testing-hdfs-on-minimal-hardware-with-maximum].
> To encourage getting the tool into the open for others to use as quickly as 
> possible, we went through our standard open sourcing process of releasing on 
> GitHub. However we are interested in the possibility of donating this to 
> Apache as part of Hadoop itself and would appreciate feedback on whether or 
> not this is something that would be supported by the community.
> Also of note, previous [discussions on the dev mail 
> lists|http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201707.mbox/%3c98fceffa-faff-4cf1-a14d-4faab6567...@gmail.com%3e]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12345) Scale testing HDFS NameNode with real metadata and workloads (Dynamometer)

2019-03-05 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784597#comment-16784597
 ] 

Erik Krogen commented on HDFS-12345:


Hey [~smeng], as you're pursuing this more actively than I am at the moment, 
I've assigned the JIRA to you so that you can have more control over it. I am 
happy to help as much as I can but don't have the bandwidth to push this 
forward on my own at the moment. Feel free to assign it back to me at any time.

By the way, the {{copy-dependencies}} seems fine to me!

> Scale testing HDFS NameNode with real metadata and workloads (Dynamometer)
> --
>
> Key: HDFS-12345
> URL: https://issues.apache.org/jira/browse/HDFS-12345
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode, test
>Reporter: Zhe Zhang
>Assignee: Siyao Meng
>Priority: Major
> Attachments: HDFS-12345.000.patch, HDFS-12345.001.patch, 
> HDFS-12345.002.patch
>
>
> Dynamometer has now been open sourced on our [GitHub 
> page|https://github.com/linkedin/dynamometer]. Read more at our [recent blog 
> post|https://engineering.linkedin.com/blog/2018/02/dynamometer--scale-testing-hdfs-on-minimal-hardware-with-maximum].
> To encourage getting the tool into the open for others to use as quickly as 
> possible, we went through our standard open sourcing process of releasing on 
> GitHub. However we are interested in the possibility of donating this to 
> Apache as part of Hadoop itself and would appreciate feedback on whether or 
> not this is something that would be supported by the community.
> Also of note, previous [discussions on the dev mail 
> lists|http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201707.mbox/%3c98fceffa-faff-4cf1-a14d-4faab6567...@gmail.com%3e]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14317) Standby does not trigger edit log rolling when in-progress edit log tailing is enabled

2019-03-06 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16785876#comment-16785876
 ] 

Erik Krogen commented on HDFS-14317:


Hey [~ekanth], can you explain why the new test changes to 
{{TestFailureToReadEdits}} and {{TestEditLogTailer.createMiniDFSCluster}} 
introduced in v004 are necessary?

> Standby does not trigger edit log rolling when in-progress edit log tailing 
> is enabled
> --
>
> Key: HDFS-14317
> URL: https://issues.apache.org/jira/browse/HDFS-14317
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Ekanth Sethuramalingam
>Assignee: Ekanth Sethuramalingam
>Priority: Critical
> Attachments: HDFS-14317.001.patch, HDFS-14317.002.patch, 
> HDFS-14317.003.patch, HDFS-14317.004.patch
>
>
> The standby uses the following method to check if it is time to trigger edit 
> log rolling on active.
> {code}
>   /**
>* @return true if the configured log roll period has elapsed.
>*/
>   private boolean tooLongSinceLastLoad() {
> return logRollPeriodMs >= 0 && 
>   (monotonicNow() - lastLoadTimeMs) > logRollPeriodMs ;
>   }
> {code}
> In doTailEdits(), lastLoadTimeMs is updated when standby is able to 
> successfully tail any edits
> {code}
>   if (editsLoaded > 0) {
> lastLoadTimeMs = monotonicNow();
>   }
> {code}
> The default configuration for {{dfs.ha.log-roll.period}} is 120 seconds and 
> {{dfs.ha.tail-edits.period}} is 60 seconds. With in-progress edit log tailing 
> enabled, tooLongSinceLastLoad() will almost never return true, resulting in 
> edit logs not being rolled for a long time until the 
> {{dfs.namenode.edit.log.autoroll.multiplier.threshold}} configuration takes effect.
> [In our deployment, this resulted in in-progress edit logs getting deleted. 
> The sequence of events is that the standby was able to checkpoint twice while 
> the in-progress edit log was growing on the active. When the 
> NNStorageRetentionManager decided to clean up old checkpoints and edit logs, 
> it cleaned up the in-progress edit log from the active and the QJM (as the 
> txnid on the in-progress edit log was older than the 2 most recent 
> checkpoints), resulting in irrecoverably losing a few minutes' worth of 
> metadata].
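
A minimal sketch of one way to address this (field and method names here are
assumptions, not the committed patch): base the roll trigger on the time of
the last roll request rather than on the time edits were last tailed, so fast
in-progress tailing no longer suppresses the periodic roll.

{code:java}
// Sketch only: names are illustrative, not the actual EditLogTailer fields.
public class RollTriggerSketch {
  private final long logRollPeriodMs = 120_000;      // dfs.ha.log-roll.period default
  private long lastRollTriggerMs = monotonicNow();   // tracked separately from lastLoadTimeMs

  private boolean tooLongSinceLastRoll() {
    // Independent of lastLoadTimeMs, so in-progress tailing (which loads edits
    // every few milliseconds) does not keep pushing the roll out indefinitely.
    return logRollPeriodMs >= 0
        && (monotonicNow() - lastRollTriggerMs) > logRollPeriodMs;
  }

  void maybeTriggerActiveLogRoll() {
    if (tooLongSinceLastRoll()) {
      // ... ask the active NameNode to roll its edit log ...
      lastRollTriggerMs = monotonicNow();
    }
  }

  private static long monotonicNow() {
    return System.nanoTime() / 1_000_000;
  }
}
{code}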



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14317) Standby does not trigger edit log rolling when in-progress edit log tailing is enabled

2019-03-06 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16785876#comment-16785876
 ] 

Erik Krogen edited comment on HDFS-14317 at 3/6/19 5:10 PM:


Hey [~ekanth], can you explain why the new test changes to 
{{TestFailureToReadEdits}} and {{TestEditLogTailer.createMiniDFSCluster}} 
introduced in v004 are necessary? Why does this patch remove the checkpoint at 
txn ID 3?


was (Author: xkrogen):
Hey [~ekanth], can you explain why the new test changes to 
{{TestFailureToReadEdits}} and {{TestEditLogTailer.createMiniDFSCluster}} 
introduced in v004 are necessary?

> Standby does not trigger edit log rolling when in-progress edit log tailing 
> is enabled
> --
>
> Key: HDFS-14317
> URL: https://issues.apache.org/jira/browse/HDFS-14317
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Ekanth Sethuramalingam
>Assignee: Ekanth Sethuramalingam
>Priority: Critical
> Attachments: HDFS-14317.001.patch, HDFS-14317.002.patch, 
> HDFS-14317.003.patch, HDFS-14317.004.patch
>
>
> The standby uses the following method to check if it is time to trigger edit 
> log rolling on active.
> {code}
>   /**
>* @return true if the configured log roll period has elapsed.
>*/
>   private boolean tooLongSinceLastLoad() {
> return logRollPeriodMs >= 0 && 
>   (monotonicNow() - lastLoadTimeMs) > logRollPeriodMs ;
>   }
> {code}
> In doTailEdits(), lastLoadTimeMs is updated when standby is able to 
> successfully tail any edits
> {code}
>   if (editsLoaded > 0) {
> lastLoadTimeMs = monotonicNow();
>   }
> {code}
> The default configuration for {{dfs.ha.log-roll.period}} is 120 seconds and 
> {{dfs.ha.tail-edits.period}} is 60 seconds. With in-progress edit log tailing 
> enabled, tooLongSinceLastLoad() will almost never return true, resulting in 
> edit logs not being rolled for a long time until the 
> {{dfs.namenode.edit.log.autoroll.multiplier.threshold}} configuration takes effect.
> [In our deployment, this resulted in in-progress edit logs getting deleted. 
> The sequence of events is that the standby was able to checkpoint twice while 
> the in-progress edit log was growing on the active. When the 
> NNStorageRetentionManager decided to clean up old checkpoints and edit logs, 
> it cleaned up the in-progress edit log from the active and the QJM (as the 
> txnid on the in-progress edit log was older than the 2 most recent 
> checkpoints), resulting in irrecoverably losing a few minutes' worth of 
> metadata].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14317) Standby does not trigger edit log rolling when in-progress edit log tailing is enabled

2019-03-07 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14317:
---
Fix Version/s: 3.1.3
   3.2.1
   3.3.0
   3.0.4

> Standby does not trigger edit log rolling when in-progress edit log tailing 
> is enabled
> --
>
> Key: HDFS-14317
> URL: https://issues.apache.org/jira/browse/HDFS-14317
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Ekanth Sethuramalingam
>Assignee: Ekanth Sethuramalingam
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14317.001.patch, HDFS-14317.002.patch, 
> HDFS-14317.003.patch, HDFS-14317.004.patch
>
>
> The standby uses the following method to check if it is time to trigger edit 
> log rolling on active.
> {code}
>   /**
>* @return true if the configured log roll period has elapsed.
>*/
>   private boolean tooLongSinceLastLoad() {
> return logRollPeriodMs >= 0 && 
>   (monotonicNow() - lastLoadTimeMs) > logRollPeriodMs ;
>   }
> {code}
> In doTailEdits(), lastLoadTimeMs is updated when standby is able to 
> successfully tail any edits
> {code}
>   if (editsLoaded > 0) {
> lastLoadTimeMs = monotonicNow();
>   }
> {code}
> The default configuration for {{dfs.ha.log-roll.period}} is 120 seconds and 
> {{dfs.ha.tail-edits.period}} is 60 seconds. With in-progress edit log tailing 
> enabled, tooLongSinceLastLoad() will almost never return true, resulting in 
> edit logs not being rolled for a long time until the 
> {{dfs.namenode.edit.log.autoroll.multiplier.threshold}} configuration takes effect.
> [In our deployment, this resulted in in-progress edit logs getting deleted. 
> The sequence of events is that the standby was able to checkpoint twice while 
> the in-progress edit log was growing on the active. When the 
> NNStorageRetentionManager decided to clean up old checkpoints and edit logs, 
> it cleaned up the in-progress edit log from the active and the QJM (as the 
> txnid on the in-progress edit log was older than the 2 most recent 
> checkpoints), resulting in irrecoverably losing a few minutes' worth of 
> metadata].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14317) Standby does not trigger edit log rolling when in-progress edit log tailing is enabled

2019-03-07 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16786971#comment-16786971
 ] 

Erik Krogen commented on HDFS-14317:


Thanks for the explanation, [~ekanth]. +1 from me. I just committed this to 
trunk, branch-3.2, branch-3.1, and branch-3.0. I think it should go into 
branch-2 as well but it does not apply cleanly there. Can you help provide a 
branch-2 patch [~ekanth]?

> Standby does not trigger edit log rolling when in-progress edit log tailing 
> is enabled
> --
>
> Key: HDFS-14317
> URL: https://issues.apache.org/jira/browse/HDFS-14317
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Ekanth Sethuramalingam
>Assignee: Ekanth Sethuramalingam
>Priority: Critical
> Attachments: HDFS-14317.001.patch, HDFS-14317.002.patch, 
> HDFS-14317.003.patch, HDFS-14317.004.patch
>
>
> The standby uses the following method to check if it is time to trigger edit 
> log rolling on active.
> {code}
>   /**
>* @return true if the configured log roll period has elapsed.
>*/
>   private boolean tooLongSinceLastLoad() {
> return logRollPeriodMs >= 0 && 
>   (monotonicNow() - lastLoadTimeMs) > logRollPeriodMs ;
>   }
> {code}
> In doTailEdits(), lastLoadTimeMs is updated when standby is able to 
> successfully tail any edits
> {code}
>   if (editsLoaded > 0) {
> lastLoadTimeMs = monotonicNow();
>   }
> {code}
> The default configuration for {{dfs.ha.log-roll.period}} is 120 seconds and 
> {{dfs.ha.tail-edits.period}} is 60 seconds. With in-progress edit log tailing 
> enabled, tooLongSinceLastLoad() will almost never return true, resulting in 
> edit logs not being rolled for a long time until the 
> {{dfs.namenode.edit.log.autoroll.multiplier.threshold}} configuration takes effect.
> [In our deployment, this resulted in in-progress edit logs getting deleted. 
> The sequence of events is that the standby was able to checkpoint twice while 
> the in-progress edit log was growing on the active. When the 
> NNStorageRetentionManager decided to clean up old checkpoints and edit logs, 
> it cleaned up the in-progress edit log from the active and the QJM (as the 
> txnid on the in-progress edit log was older than the 2 most recent 
> checkpoints), resulting in irrecoverably losing a few minutes' worth of 
> metadata].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14317) Standby does not trigger edit log rolling when in-progress edit log tailing is enabled

2019-03-07 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14317:
---
Hadoop Flags: Reviewed

> Standby does not trigger edit log rolling when in-progress edit log tailing 
> is enabled
> --
>
> Key: HDFS-14317
> URL: https://issues.apache.org/jira/browse/HDFS-14317
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Ekanth Sethuramalingam
>Assignee: Ekanth Sethuramalingam
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14317.001.patch, HDFS-14317.002.patch, 
> HDFS-14317.003.patch, HDFS-14317.004.patch
>
>
> The standby uses the following method to check if it is time to trigger edit 
> log rolling on active.
> {code}
>   /**
>* @return true if the configured log roll period has elapsed.
>*/
>   private boolean tooLongSinceLastLoad() {
> return logRollPeriodMs >= 0 && 
>   (monotonicNow() - lastLoadTimeMs) > logRollPeriodMs ;
>   }
> {code}
> In doTailEdits(), lastLoadTimeMs is updated when standby is able to 
> successfully tail any edits
> {code}
>   if (editsLoaded > 0) {
> lastLoadTimeMs = monotonicNow();
>   }
> {code}
> The default configuration for {{dfs.ha.log-roll.period}} is 120 seconds and 
> {{dfs.ha.tail-edits.period}} is 60 seconds. With in-progress edit log tailing 
> enabled, tooLongSinceLastLoad() will almost never return true, resulting in 
> edit logs not being rolled for a long time until the 
> {{dfs.namenode.edit.log.autoroll.multiplier.threshold}} configuration takes effect.
> [In our deployment, this resulted in in-progress edit logs getting deleted. 
> The sequence of events is that the standby was able to checkpoint twice while 
> the in-progress edit log was growing on the active. When the 
> NNStorageRetentionManager decided to clean up old checkpoints and edit logs, 
> it cleaned up the in-progress edit log from the active and the QJM (as the 
> txnid on the in-progress edit log was older than the 2 most recent 
> checkpoints), resulting in irrecoverably losing a few minutes' worth of 
> metadata].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


