[
https://issues.apache.org/jira/browse/HDFS-13898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622334#comment-16622334
]
Chao Sun commented on HDFS-13898:
---------------------------------
Thanks [~xkrogen] for the explanation. I like the idea of having a
particular exception trigger ORPP to go directly to the active - this could be
useful for cases like HDFS-13924. For this particular scenario (observer in
safemode), though, I think it's fine, since I assume safemode normally only
happens when the observer is starting up, which should not be very common.
Also, safemode can last for quite a while, and in my experience the chance
of RPCs hitting this error is quite high, so it might be better to have all
clients redirect to a different observer anyway.
Regarding the v002 patch, how about changing the test name to
{{testObserverNodeSafeModeWithBlockLocations}}?
{{testObserverNodeSafeModeWithoutBlockLocations}} seems a little confusing to
me, since we are testing the safe mode case with {{getBlockLocations}} calls.
About the {{HAState}} change, there was no particular reason except that I
wanted to make the lines shorter :) I'm perfectly fine with changing it back.
Will fix the style issues too.
> Throw retriable exception for getBlockLocations when ObserverNameNode is in
> safemode
> ------------------------------------------------------------------------------------
>
> Key: HDFS-13898
> URL: https://issues.apache.org/jira/browse/HDFS-13898
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Chao Sun
> Assignee: Chao Sun
> Priority: Major
> Attachments: HDFS-13898-HDFS-12943.000.patch,
> HDFS-13898-HDFS-12943.001.patch, HDFS-13898-HDFS-12943.002.patch
>
>
> When ObserverNameNode is in safe mode, {{getBlockLocations}} may throw a safe
> mode exception if a block of the given file has no reported locations yet.
> {code}
> try {
>   checkOperation(OperationCategory.READ);
>   res = FSDirStatAndListingOp.getBlockLocations(
>       dir, pc, srcArg, offset, length, true);
>   if (isInSafeMode()) {
>     for (LocatedBlock b : res.blocks.getLocatedBlocks()) {
>       // if safemode & no block locations yet then throw safemodeException
>       if ((b.getLocations() == null) || (b.getLocations().length == 0)) {
>         SafeModeException se = newSafemodeException(
>             "Zero blocklocations for " + srcArg);
>         if (haEnabled && haContext != null &&
>             haContext.getState().getServiceState() ==
>                 HAServiceState.ACTIVE) {
>           throw new RetriableException(se);
>         } else {
>           throw se;
>         }
>       }
>     }
>   }
> {code}
> It only throws {{RetriableException}} on the active NN, so requests served
> by an observer may simply fail.
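
To make the decision in the quoted snippet concrete, here is a minimal, self-contained sketch of one way the check could be widened so that an observer in safe mode also produces a retriable error. This is not the attached patch: {{SafeModeRetrySketch}}, {{ServiceState}}, {{wrapForState}}, and the two nested exception classes are hypothetical stand-ins for the Hadoop types, and the assumption that observers should be treated like the active NN here comes only from the issue title.

```java
import java.io.IOException;

public class SafeModeRetrySketch {

    /** Hypothetical stand-in for Hadoop's HAServiceState. */
    enum ServiceState { ACTIVE, STANDBY, OBSERVER }

    /** Hypothetical stand-in for Hadoop's SafeModeException. */
    static class SafeModeException extends IOException {
        SafeModeException(String msg) { super(msg); }
    }

    /** Hypothetical stand-in for Hadoop's RetriableException. */
    static class RetriableException extends IOException {
        RetriableException(Exception cause) { super(cause); }
    }

    /**
     * Decides which exception to surface when a block has zero locations
     * while the NameNode is in safe mode. Assumption: with HA enabled, both
     * ACTIVE and OBSERVER wrap the error so the client can retry; any other
     * state surfaces the plain safe mode exception.
     */
    static IOException wrapForState(
            boolean haEnabled, ServiceState state, SafeModeException se) {
        boolean retriable = haEnabled
            && (state == ServiceState.ACTIVE || state == ServiceState.OBSERVER);
        return retriable ? new RetriableException(se) : se;
    }

    public static void main(String[] args) {
        SafeModeException se =
            new SafeModeException("Zero blocklocations for /some/file");
        // Print the exception type chosen for each HA state.
        for (ServiceState s : ServiceState.values()) {
            IOException e = wrapForState(true, s, se);
            System.out.println(s + " -> " + e.getClass().getSimpleName());
        }
    }
}
```

With a retriable error, a client-side proxy provider has the option of retrying the call elsewhere (for example on a different observer) instead of failing the read outright.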