[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-16 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17017409#comment-17017409
 ] 

Hudson commented on HDFS-15112:
---

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17873 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17873/])
HDFS-15112. RBF: Do not return FileNotFoundException when a subcluster 
(inigoiri: rev 263413e83840c7795a988e3939cd292d020c8d5f)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcClient.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterFaultTolerant.java


> RBF: Do not return FileNotFoundException when a subcluster is unavailable 
> --
>
> Key: HDFS-15112
> URL: https://issues.apache.org/jira/browse/HDFS-15112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-15112.000.patch, HDFS-15112.001.patch, 
> HDFS-15112.002.patch, HDFS-15112.004.patch, HDFS-15112.005.patch, 
> HDFS-15112.006.patch, HDFS-15112.007.patch, HDFS-15112.008.patch, 
> HDFS-15112.009.patch, HDFS-15112.patch
>
>
> If we have a mount point using HASH_ALL across two subclusters and one of 
> them is down, we may return FileNotFoundException while the file is just in 
> the unavailable subcluster.
> We should not return FileNotFoundException but something that shows that the 
> subcluster is unavailable.
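
To make the reported behaviour concrete, here is a minimal, self-contained sketch; it is not the Router code. `SubclusterLookup`, `Subcluster` and `resolve` are hypothetical names, and `ConnectException` is only assumed to be what a down subcluster produces. The lookup tries each location in order and, when no subcluster has the file, reports the unreachable subcluster in preference to FileNotFoundException.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.ConnectException;
import java.util.List;

/** Hypothetical sketch of a sequential lookup across subclusters. */
class SubclusterLookup {

  /** Hypothetical stand-in for one subcluster behind a mount point. */
  interface Subcluster {
    String getFile(String path) throws IOException;
  }

  /**
   * Try each subcluster in order and return the first hit. If nobody has the
   * file but some subcluster could not be reached, surface that failure
   * instead of FileNotFoundException, since the file may live there.
   */
  static String resolve(List<Subcluster> subclusters, String path)
      throws IOException {
    IOException unavailable = null;
    for (Subcluster sc : subclusters) {
      try {
        return sc.getFile(path);
      } catch (FileNotFoundException e) {
        // Not in this subcluster; keep looking.
      } catch (ConnectException e) {
        // Assumption: ConnectException means the subcluster is down.
        if (unavailable == null) {
          unavailable = e;
        }
      }
    }
    if (unavailable != null) {
      throw unavailable;
    }
    throw new FileNotFoundException(path);
  }
}
```

With one empty and one unreachable subcluster, the caller sees the connection failure; FileNotFoundException is only returned when every subcluster answered.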



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-16 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17017405#comment-17017405
 ] 

Íñigo Goiri commented on HDFS-15112:


Thanks [~ayushtkn] for the review.
Committed to trunk.
Opened HDFS-15127 to fix the writes.




[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-16 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17017401#comment-17017401
 ] 

Íñigo Goiri commented on HDFS-15112:


Correct, the change in RouterRpcServer is only there to make the current 
testWriteWithFailedSubcluster() work.
Let's open a new JIRA to change testWriteWithFailedSubcluster() and make 
writes always fail if one subcluster is down.




[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-15 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016499#comment-17016499
 ] 

Ayush Saxena commented on HDFS-15112:
-

Thanks [~elgoiri] for the update. The overall patch LGTM.

{code:java}
catch (IOException ioe) {
  if (RouterRpcClient.isUnavailableException(ioe)) {
    LOG.debug("Ignore unavailable exception: {}", ioe);
  } else {
    throw ioe;
  }
}
{code}
But I don't think we should do this here; maybe we can discuss it in the 
follow-up where we handle {{invokeConcurrent}}. I have reservations about 
adding this. That test failure was a bug that surfaced a genuine issue. I 
guess there is no test checking that a write fails if there is a non 
{{PathAll}} entry and a subcluster is down. When the mount entry is not fault 
tolerant, {{testWriteWithFailedSubcluster}} only checks that some writes 
fail, whereas ideally all of them should fail; that is why this test is 
passing. A write should succeed with an unavailable subcluster only if the 
entry is fault tolerant.

As we decided, we can handle this part in another JIRA, but there we will 
have to remove this catch block.

Other than that, v009 LGTM, +1.
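
The expectation described above can be written down as a small truth table. This is only a sketch of the intended policy, not code from the patch; `WritePolicy` and `allowCreate` are hypothetical names.

```java
/** Hypothetical sketch of the write policy discussed in the review. */
class WritePolicy {

  /**
   * @param subclusterDown whether some subcluster of the mount entry is
   *                       unavailable
   * @param faultTolerant  whether the mount entry is marked fault tolerant
   * @return true if a create should be allowed to proceed
   */
  static boolean allowCreate(boolean subclusterDown, boolean faultTolerant) {
    // Only a fault tolerant entry may write while a subcluster is down.
    return !subclusterDown || faultTolerant;
  }
}
```

A test along these lines would assert that `allowCreate(true, false)` is false for every write, not just for some of them.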




[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-15 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016462#comment-17016462
 ] 

Hadoop QA commented on HDFS-15112:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 41s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 45s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 37s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 1s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 64m 58s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15112 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12991057/HDFS-15112.009.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 4339eb2e7962 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5d18046 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28681/testReport/ |
| Max. process+thread count | 2779 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28681/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-15 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016435#comment-17016435
 ] 

Íñigo Goiri commented on HDFS-15112:


I added [^HDFS-15112.009.patch] to handle {{StandbyException}} and 
{{NoNamenodesAvailableException}}.
Right now, we are compatible with what was already committed.

I agree that we should revisit writes too and make sure we don't allow writing 
in some cases without first checking whether the file is already there.
I would do that as part of the invokeConcurrent() change too.
For now, this covers the existing behavior.
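
As an illustration of what "handling" these exceptions could look like, here is a self-contained sketch of an unavailability check. It is not the body of {{RouterRpcClient.isUnavailableException}}: the nested exception classes are local stand-ins so the example compiles without Hadoop on the classpath, and the set of exceptions treated as "unavailable" is an assumption taken from this thread.

```java
import java.io.IOException;
import java.net.ConnectException;

/** Hypothetical sketch; the nested exception classes are stand-ins. */
class UnavailableCheck {

  /** Stand-in for Hadoop's StandbyException (not the real class). */
  static class StandbyException extends IOException {
    StandbyException(String msg) {
      super(msg);
    }
  }

  /** Stand-in for the Router's NoNamenodesAvailableException. */
  static class NoNamenodesAvailableException extends IOException {
    NoNamenodesAvailableException(String msg) {
      super(msg);
    }
  }

  /** True if the failure suggests the subcluster could not serve the call. */
  static boolean isUnavailable(IOException ioe) {
    return ioe instanceof ConnectException
        || ioe instanceof StandbyException
        || ioe instanceof NoNamenodesAvailableException;
  }
}
```

Anything not on the list, such as FileNotFoundException, is still treated as a real answer from the subcluster.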




[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-15 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016251#comment-17016251
 ] 

Ayush Saxena commented on HDFS-15112:
-

bq. Any thoughts on RetriableException and StandbyException?
I think we should handle these two as well.

bq. That code in getCreateLocation() is only to check if the file actually 
exists, I think for that particular case we should be fine.
What if a cluster is unavailable and the file exists there? Then, when the 
cluster comes back up, there would be two files, the old one and the one that 
just got created, without the entry being fault tolerant.
{{getExistingLocation}} calls {{invokeConcurrent}}, but we changed 
{{invokeSequential}}. Does the change take effect here? 

I think the problem we tried to fix for {{invokeSequential}} is also present 
in {{invokeConcurrent}}, which is why the test failed randomly here:


{code:java}
// Throw the exception for the first location if there are no results
if (ret.isEmpty()) {
  final RemoteResult result = results.get(0);
  if (result.hasException()) {
    throw result.getException();
  }
}
{code}

If the file is not there, there won't be any result, and if one NS isn't 
available, one of the results will carry an {{UnavailableException}}. If that 
happens to be the first one, the {{UnavailableException}} is thrown and the 
write fails; if it isn't the first one, the {{FileNotFound}} from the other NS 
is thrown and the write succeeds. 

I don't think the test failed because of the changes here. It may be a random 
failure, because with v07 the test passes locally for me. Please check whether 
I am reading this correctly; you can also handle the {{invokeConcurrent}} part 
in a different JIRA.
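
The order dependence can be reproduced in isolation. The sketch below is not {{invokeConcurrent}}; `ConcurrentResults` and its two methods are hypothetical, the failures are plain exceptions rather than {{RemoteResult}} objects, and `ConnectException` is assumed to mark an unavailable subcluster. Both methods expect a non-empty list.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.List;

/** Hypothetical sketch of choosing which exception to surface. */
class ConcurrentResults {

  /** Mirrors the quoted snippet: whatever the first location returned. */
  static IOException firstException(List<IOException> failures) {
    return failures.get(0);
  }

  /**
   * Order-independent alternative: prefer a failure that means
   * "subcluster unavailable" over any other failure.
   * Assumption: ConnectException marks an unavailable subcluster.
   */
  static IOException preferUnavailable(List<IOException> failures) {
    for (IOException e : failures) {
      if (e instanceof ConnectException) {
        return e;
      }
    }
    return failures.get(0);
  }
}
```

With the same two failures in either order, `firstException` returns a different exception each time, while `preferUnavailable` always returns the connection failure, which would make the test outcome deterministic.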




[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-15 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016234#comment-17016234
 ] 

Íñigo Goiri commented on HDFS-15112:


That code in {{getCreateLocation()}} is only to check if the file actually 
exists, I think for that particular case we should be fine.

The write itself respects the fault tolerant flag as it shows in 
testWriteWithFailedSubcluster().

Any thoughts on RetriableException and StandbyException?




[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-14 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015661#comment-17015661
 ] 

Ayush Saxena commented on HDFS-15112:
-

bq. So this change breaks the assumption that we can create files when a 
subcluster is down.

The assumption that we won't fail if a cluster is unavailable holds only when 
the mount entry is fault tolerant; otherwise it should fail, shouldn't it?

If we catch and ignore here:

{code:java}
+  } catch (IOException ioe) {
+    if (RouterRpcClient.isUnavailableException(ioe)) {
+      LOG.debug("Ignore unavailable exception: {}", ioe);
+    } else {
+      throw ioe;
+    }
{code}

We will be ignoring this exception even when the mount entry is not fault 
tolerant.




[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-14 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015598#comment-17015598
 ] 

Hadoop QA commented on HDFS-15112:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 57s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 48s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 55s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 48s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 25s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15112 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12990930/HDFS-15112.008.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux ea424aaf4ddf 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c36f09d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28674/testReport/ |
| Max. process+thread count | 2845 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28674/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-14 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015568#comment-17015568
 ] 

Íñigo Goiri commented on HDFS-15112:


I found the issue.
The problem is that when we try to create a file, we now also get a 
ConnectException.
The creation in the Router triggers getBlockLocations() to see if the file 
exists.
So this change breaks the assumption that we can create files when a subcluster 
is down.
The issue is in getCreateLocation(), which now cannot handle this exception.
I added a new check to ignore those in [^HDFS-15112.008.patch].
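
Roughly, the new check amounts to treating "could not reach the subcluster" as "file not found there" during the pre-create existence probe. The sketch below only illustrates that idea; `CreateCheck`, `ExistenceProbe` and `existsIgnoringUnavailable` are hypothetical names, not the code in getCreateLocation().

```java
import java.io.IOException;
import java.net.ConnectException;

/** Hypothetical sketch of an existence probe that tolerates a down cluster. */
class CreateCheck {

  /** Hypothetical stand-in for the getBlockLocations() based probe. */
  interface ExistenceProbe {
    boolean exists(String path) throws IOException;
  }

  /**
   * Returns whether the file is known to exist. A connection failure is
   * logged and treated as "not found" so that the create can proceed.
   */
  static boolean existsIgnoringUnavailable(ExistenceProbe probe, String path)
      throws IOException {
    try {
      return probe.exists(path);
    } catch (ConnectException e) {
      // Assumption: ConnectException means the subcluster is down.
      System.err.println("Ignore unavailable exception: " + e);
      return false;
    }
  }
}
```

The trade-off is that a file living in the unreachable subcluster is treated as absent, so a create may go ahead even though the file exists there.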




[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-14 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015490#comment-17015490
 ] 

Íñigo Goiri commented on HDFS-15112:


It looks like this also broke testWriteWithFailedSubcluster().
Let's see what the issue is here.




[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-14 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015356#comment-17015356
 ] 

Hadoop QA commented on HDFS-15112:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 32s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 55s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 40s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 10s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 57s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.federation.router.TestRouterFaultTolerant |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15112 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12990902/HDFS-15112.007.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 684b3c9e61d4 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1c51f36 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28670/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28670/testReport/ |
| Max. process+thread count | 3121 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28670/console |
| Powered by | Apache Yetus 0.8.0   

[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-14 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015295#comment-17015295
 ] 

Hadoop QA commented on HDFS-15112:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 43s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 27s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 46s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 39s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 69m 37s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.federation.router.TestRouterFaultTolerant |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15112 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12990897/HDFS-15112.006.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 18a229608630 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1c51f36 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28668/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28668/testReport/ |
| Max. process+thread count | 3038 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28668/console |
| Powered by | Apache Yetus 0.8.0   

[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-14 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015294#comment-17015294
 ] 

Hadoop QA commented on HDFS-15112:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 38s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 21s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 23s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 42s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 81m 52s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.fs.contract.router.TestRouterHDFSContractRenameSecure |
|   | hadoop.fs.contract.router.TestRouterHDFSContractGetFileStatus |
|   | hadoop.fs.contract.router.TestRouterHDFSContractSetTimes |
|   | hadoop.fs.contract.router.web.TestRouterWebHDFSContractAppend |
|   | hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate |
|   | hadoop.fs.contract.router.web.TestRouterWebHDFSContractOpen |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15112 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12990897/HDFS-15112.006.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 96d8fb9e9865 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1c51f36 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28667/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt |
|  Test Results | 

[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-14 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015257#comment-17015257
 ] 

Íñigo Goiri commented on HDFS-15112:


[^HDFS-15112.007.patch] improves some of the comments.

{quote}
Just to confirm, do we need to handle StandbyException, 
NoNamenodesAvailableException or RetriableException, precisely the exceptions 
in invokeMethod? Do these also denote that the cluster is not available?
{quote}
Good points.
* StandbyException: this is probably one to surface.
* NoNamenodesAvailableException: I don't think this can surface, right? We 
would wrap it with a RetriableException.
* RetriableException: this one might be tricky. This was part of HDFS-14230. 
[~ferhui], thoughts here?
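
The question of which exceptions should count as "subcluster unavailable" can be illustrated with a small sketch. It is not the code from the patch: the nested exception classes are hypothetical stand-ins for the Hadoop types named above, {{looksUnavailable}} is an invented name, and treating a RetriableException according to its cause is only one possible policy, assumed here for illustration.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.ConnectException;

// Illustrative only: the nested exception types are stand-ins, not Hadoop's classes.
final class UnavailableCheckSketch {

  static class StandbyException extends IOException {
    StandbyException(String msg) { super(msg); }
  }

  static class NoNamenodesAvailableException extends IOException {
    NoNamenodesAvailableException(String nsId) {
      super("No namenodes available for " + nsId);
    }
  }

  static class RetriableException extends IOException {
    RetriableException(String msg, Throwable cause) { super(msg, cause); }
  }

  /** Returns true if the exception suggests the subcluster could not serve the call. */
  static boolean looksUnavailable(IOException ioe) {
    if (ioe instanceof ConnectException
        || ioe instanceof StandbyException
        || ioe instanceof NoNamenodesAvailableException) {
      return true;
    }
    // Assumed policy: a RetriableException is judged by what it wraps,
    // following the chain of causes recursively.
    if (ioe instanceof RetriableException && ioe.getCause() instanceof IOException) {
      return looksUnavailable((IOException) ioe.getCause());
    }
    return false;
  }

  public static void main(String[] args) {
    IOException wrapped =
        new RetriableException("retry later", new NoNamenodesAvailableException("ns0"));
    System.out.println("wrapped unavailable: " + looksUnavailable(wrapped));
    System.out.println("missing file unavailable: "
        + looksUnavailable(new FileNotFoundException("/a")));
  }
}
```

Under this policy a bare RetriableException with no cause is not treated as unavailability, which is one reason that case is the tricky one.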


> RBF: Do not return FileNotFoundException when a subcluster is unavailable 
> --
>
> Key: HDFS-15112
> URL: https://issues.apache.org/jira/browse/HDFS-15112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-15112.000.patch, HDFS-15112.001.patch, 
> HDFS-15112.002.patch, HDFS-15112.004.patch, HDFS-15112.005.patch, 
> HDFS-15112.006.patch, HDFS-15112.007.patch, HDFS-15112.patch
>
>
> If we have a mount point using HASH_ALL across two subclusters and one of 
> them is down, we may return FileNotFoundException while the file is just in 
> the unavailable subcluster.
> We should not return FileNotFoundException but something that shows that the 
> subcluster is unavailable.






[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-14 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015247#comment-17015247
 ] 

Ayush Saxena commented on HDFS-15112:
-

Thanks [~elgoiri] for the update. Just to confirm, do we need to handle 
{{StandbyException}}, {{NoNamenodesAvailableException}} or 
{{RetriableException}}, precisely the exceptions in {{invokeMethod}}? Do these 
also denote that the cluster is not available?

> RBF: Do not return FileNotFoundException when a subcluster is unavailable 
> --
>
> Key: HDFS-15112
> URL: https://issues.apache.org/jira/browse/HDFS-15112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-15112.000.patch, HDFS-15112.001.patch, 
> HDFS-15112.002.patch, HDFS-15112.004.patch, HDFS-15112.005.patch, 
> HDFS-15112.006.patch, HDFS-15112.patch
>
>
> If we have a mount point using HASH_ALL across two subclusters and one of 
> them is down, we may return FileNotFoundException while the file is just in 
> the unavailable subcluster.
> We should not return FileNotFoundException but something that shows that the 
> subcluster is unavailable.






[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-14 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015229#comment-17015229
 ] 

Íñigo Goiri commented on HDFS-15112:


Thanks [~ayushtkn] for the comments.

{quote}
The test expects RouterRpcClient.isUnavailableException(ioe) to hold, but 
the exception thrown is NoNamenodesAvailableException
{quote}
Yes, I messed up copying from the internal to the external branch.
The good news is that this shows that the unit test is doing its job.

{quote}
Secondly, Jenkins didn't complain about this, but the test failed locally for 
me because of it. I think we should call refreshRoutersCaches(routers); 
after creating the mount entry, since we use random routers first for the mount 
entry and then for the filesystem.
{quote}
I changed {{createMountTableEntry()}} to call all routers.

Fixes in [^HDFS-15112.006.patch].
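
The stale-cache problem described in the second quote can be sketched as follows. This is a toy model, not the test code from the patch: {{StateStore}}, {{Router}}, and {{refreshAll}} are hypothetical names, and the assumption is simply that each router keeps a local copy of a shared mount table that is only updated when that router refreshes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (all names hypothetical): routers share a store but resolve from a local cache.
final class MountCacheSketch {

  /** Shared mount table: mount point to destination nameservices. */
  static final class StateStore {
    final Map<String, String> entries = new HashMap<>();
  }

  static final class Router {
    private final StateStore store;
    private Map<String, String> cache = new HashMap<>();

    Router(StateStore store) { this.store = store; }

    /** Writes to the shared store and refreshes only this router's cache. */
    void addMountEntry(String src, String dest) {
      store.entries.put(src, dest);
      refreshCache();
    }

    void refreshCache() { cache = new HashMap<>(store.entries); }

    boolean resolves(String src) { return cache.containsKey(src); }
  }

  /** Refreshes every router so that any of them can serve the new entry. */
  static void refreshAll(List<Router> routers) {
    for (Router r : routers) {
      r.refreshCache();
    }
  }

  public static void main(String[] args) {
    StateStore store = new StateStore();
    Router r0 = new Router(store);
    Router r1 = new Router(store);
    r0.addMountEntry("/data", "ns0,ns1");
    System.out.println("r1 before refresh: " + r1.resolves("/data"));
    refreshAll(Arrays.asList(r0, r1));
    System.out.println("r1 after refresh: " + r1.resolves("/data"));
  }
}
```

If the mount entry is added through one randomly chosen router and the filesystem call goes through another, the second router may not know the mount point until its cache is refreshed, which would explain a test that passes on Jenkins but fails on some local runs.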

> RBF: Do not return FileNotFoundException when a subcluster is unavailable 
> --
>
> Key: HDFS-15112
> URL: https://issues.apache.org/jira/browse/HDFS-15112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-15112.000.patch, HDFS-15112.001.patch, 
> HDFS-15112.002.patch, HDFS-15112.004.patch, HDFS-15112.005.patch, 
> HDFS-15112.patch
>
>
> If we have a mount point using HASH_ALL across two subclusters and one of 
> them is down, we may return FileNotFoundException while the file is just in 
> the unavailable subcluster.
> We should not return FileNotFoundException but something that shows that the 
> subcluster is unavailable.






[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-14 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015083#comment-17015083
 ] 

Hadoop QA commented on HDFS-15112:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 2s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 49s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 17s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 63m 44s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.federation.router.TestRouterClientRejectOverload |
|   | hadoop.hdfs.server.federation.router.TestRouterFaultTolerant |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15112 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12990815/HDFS-15112.004.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 1737b982168d 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1c51f36 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28660/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28660/testReport/ |
| Max. process+thread count | 2496 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 

[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-13 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014866#comment-17014866
 ] 

Ayush Saxena commented on HDFS-15112:
-

Thanks [~elgoiri] for the patch. The UT in v004 seems to fail, for two 
reasons.

First, the test expects {{RouterRpcClient.isUnavailableException(ioe)}} to 
hold, but the exception thrown is {{NoNamenodesAvailableException}}.

Changing this:
{code:java}
if (isUnavailableException(ioe)) {
  // We cannot conclude if this is a proper exception
  RemoteLocationContext loc = locations.get(i);
  String nsId = loc.getNameserviceId();
  throw new NoNamenodesAvailableException(nsId, ioe);
}
{code}
to:
{code:java}
if (isUnavailableException(ioe)) {
  // We cannot conclude if this is a proper exception
  throw ioe;
}
{code}
works for me.

Second, Jenkins didn't complain about this, but the test failed locally for me 
because of it. I think we should call {{refreshRoutersCaches(routers);}} after 
creating the mount entry, since we use random routers first for the mount 
entry and then for the filesystem.

After these changes the UT seems to work fine; please give it a check.
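
The behavior the issue asks for can be sketched in isolation: when a lookup is tried against several subclusters, a "not found" answer is only trustworthy if every subcluster actually answered. The code below is a minimal sketch under that assumption, not the RouterRpcClient implementation; {{Subcluster}}, {{readFromAny}}, and {{isUnavailable}} are hypothetical names, and ConnectException stands in for whatever the real unavailability check accepts.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.ConnectException;
import java.util.Arrays;
import java.util.List;

// Sketch only: names and the unavailability test are assumptions, not Hadoop APIs.
final class SequentialInvokeSketch {

  /** A remote location that either returns the file contents or throws. */
  interface Subcluster {
    String read(String path) throws IOException;
  }

  static boolean isUnavailable(IOException ioe) {
    return ioe instanceof ConnectException;
  }

  /**
   * Tries each location in order. If no location has the file but at least one
   * was unreachable, the unavailability error is surfaced instead of
   * FileNotFoundException, because the file may live in the unreachable one.
   */
  static String readFromAny(List<Subcluster> locations, String path) throws IOException {
    IOException unavailable = null;
    for (Subcluster loc : locations) {
      try {
        return loc.read(path);
      } catch (FileNotFoundException e) {
        // This subcluster answered and does not have the file; keep looking.
      } catch (IOException e) {
        if (!isUnavailable(e)) {
          throw e;
        }
        if (unavailable == null) {
          unavailable = e;
        }
      }
    }
    if (unavailable != null) {
      throw unavailable;
    }
    throw new FileNotFoundException(path);
  }

  public static void main(String[] args) {
    List<Subcluster> locations = Arrays.<Subcluster>asList(
        p -> { throw new ConnectException("ns0 is down"); },
        p -> { throw new FileNotFoundException(p); });
    try {
      readFromAny(locations, "/dir/file");
    } catch (IOException e) {
      System.out.println(e.getClass().getSimpleName() + ": " + e.getMessage());
    }
  }
}
```

Rethrowing the original exception, as suggested above, keeps its type intact so that a caller's unavailability check still recognizes it, whereas wrapping it in a different exception type can hide it from that check.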

> RBF: Do not return FileNotFoundException when a subcluster is unavailable 
> --
>
> Key: HDFS-15112
> URL: https://issues.apache.org/jira/browse/HDFS-15112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-15112.000.patch, HDFS-15112.001.patch, 
> HDFS-15112.002.patch, HDFS-15112.004.patch, HDFS-15112.patch
>
>
> If we have a mount point using HASH_ALL across two subclusters and one of 
> them is down, we may return FileNotFoundException while the file is just in 
> the unavailable subcluster.
> We should not return FileNotFoundException but something that shows that the 
> subcluster is unavailable.






[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-13 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014650#comment-17014650
 ] 

Hadoop QA commented on HDFS-15112:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 47s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 57s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 49s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 33s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 63m 38s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.federation.router.TestRouterFaultTolerant |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15112 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12990773/HDFS-15112.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux bb9b09c1e0e9 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6b86a51 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28655/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28655/testReport/ |
| Max. process+thread count | 2846 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28655/console |
| Powered by | Apache Yetus 0.8.0   

[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-13 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014587#comment-17014587
 ] 

Hadoop QA commented on HDFS-15112:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 59s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 30s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 44s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 19s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m 17s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.federation.router.TestRouterFaultTolerant |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15112 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12990754/HDFS-15112.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux cf5738fd652f 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 621c5ea |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28654/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28654/testReport/ |
| Max. process+thread count | 2839 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28654/console |
| Powered by | Apache Yetus 0.8.0   

[jira] [Commented] (HDFS-15112) RBF: Do not return FileNotFoundException when a subcluster is unavailable

2020-01-13 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014532#comment-17014532
 ] 

Íñigo Goiri commented on HDFS-15112:


Thanks [~ayushtkn], I think that surfacing the unavailable exception is the way 
to go.
Take a look at [^HDFS-15112.001.patch].

> RBF: Do not return FileNotFoundException when a subcluster is unavailable 
> --
>
> Key: HDFS-15112
> URL: https://issues.apache.org/jira/browse/HDFS-15112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-15112.000.patch, HDFS-15112.001.patch, 
> HDFS-15112.patch
>
>
> If we have a mount point using HASH_ALL across two subclusters and one of 
> them is down, we may return FileNotFoundException while the file is just in 
> the unavailable subcluster.
> We should not return FileNotFoundException but something that shows that the 
> subcluster is unavailable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15112) RBF: do not return FileNotFoundException when a subcluster is unavailable

2020-01-12 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013824#comment-17013824
 ] 

Ayush Saxena commented on HDFS-15112:
-

In {{invokeConcurrent}} there is logic that requires a response from all 
nameservices if {{requireResponse}} is true.

{code:java}
for (final RemoteResult result : results) {
  // Response from all servers required, use this error.
  if (requireResponse && result.hasException()) {
    throw result.getException();
  }
  // ...
}
{code}

It rethrows the same exception it got from the nameservice. If a nameservice 
is down and {{invokeConcurrent}} is called with {{requireResponse}} set to 
true, the router returns the same exception it received from the namenode.

Maybe we can do the same here too: if one of the exceptions satisfies 
{{isUnavailableException()}}, we give that exception priority over the first 
one received. That way, if the client would have retried or failed over on 
hitting the same exception while connecting to the namenode directly, it can 
do the same here, and we also avoid wrongly concluding that the file doesn't 
exist. With a retry, we may end up with a response too, if the problem was 
temporary or limited to one router.
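
As a rough illustration of the prioritization idea above (a sketch only, not the actual patch): the class name, the {{select}} helper, and the use of {{ConnectException}} as a stand-in for whatever {{isUnavailableException()}} really checks are all assumptions made for this example.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.ConnectException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: choose which per-subcluster exception to surface. */
class ExceptionPriority {

  /** Stand-in for the router's unavailability check (assumption). */
  static boolean isUnavailable(IOException e) {
    return e instanceof ConnectException;
  }

  /**
   * Returns the first "unavailable" exception if there is one, otherwise
   * the first exception received, or null when the list is empty.
   */
  static IOException select(List<IOException> errors) {
    for (IOException e : errors) {
      if (isUnavailable(e)) {
        return e;
      }
    }
    return errors.isEmpty() ? null : errors.get(0);
  }

  public static void main(String[] args) {
    List<IOException> errors = Arrays.asList(
        new FileNotFoundException("/data/file not found in ns0"),
        new ConnectException("ns1 is unreachable"));
    // The unavailable error wins even though it was received second.
    System.out.println(select(errors).getClass().getSimpleName());
  }
}
```

With this ordering, a "not found" answer from the healthy subcluster no longer hides the fact that another subcluster could not be asked.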

Another solution could be a new exception for the purpose, or maybe the 
existing NoNamenodeException, but these won't be unwrapped on the client side; 
they would all surface as RemoteException only.
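
To make the unwrapping concern concrete, here is a simplified model of a wrapper exception that a client unwraps only for types it explicitly asks for. {{WrappedRemoteError}} and the {{org.example.NoNamenodeException}} name are hypothetical stand-ins for this sketch and do not reproduce the real RPC classes.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

/** Simplified, hypothetical stand-in for an RPC-level wrapper exception. */
class WrappedRemoteError extends IOException {
  private final String className;

  WrappedRemoteError(String className, String message) {
    super(message);
    this.className = className;
  }

  /** Rebuilds the original exception only if its type is in lookupTypes. */
  IOException unwrap(Class<?>... lookupTypes) {
    for (Class<?> type : lookupTypes) {
      if (type.getName().equals(className)) {
        try {
          return (IOException) type.getConstructor(String.class)
              .newInstance(getMessage());
        } catch (ReflectiveOperationException | ClassCastException e) {
          return this; // cannot rebuild it, keep the wrapper
        }
      }
    }
    return this; // type not requested by the client, stays wrapped
  }

  public static void main(String[] args) {
    IOException known = new WrappedRemoteError(
        "java.io.FileNotFoundException", "/data/a")
        .unwrap(FileNotFoundException.class);
    IOException custom = new WrappedRemoteError(
        "org.example.NoNamenodeException", "ns1 down")
        .unwrap(FileNotFoundException.class);
    System.out.println(known.getClass().getSimpleName());
    System.out.println(custom.getClass().getSimpleName());
  }
}
```

In this model a client that only looks for well-known types sees a new custom exception as the generic wrapper, so any retry logic keyed on the specific type would not trigger.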

Whatever fits your use case is fine with me. If none does, let me know and I 
will try to come up with some other idea. :)

> RBF: do not return FileNotFoundException when a subcluster is unavailable 
> --
>
> Key: HDFS-15112
> URL: https://issues.apache.org/jira/browse/HDFS-15112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-15112.000.patch, HDFS-15112.patch
>
>
> If we have a mount point using HASH_ALL across two subclusters and one of 
> them is down, we may return FileNotFoundException while the file is just in 
> the unavailable subcluster.
> We should not return FileNotFoundException but something that shows that the 
> subcluster is unavailable.






[jira] [Commented] (HDFS-15112) RBF: do not return FileNotFoundException when a subcluster is unavailable

2020-01-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013787#comment-17013787
 ] 

Íñigo Goiri commented on HDFS-15112:


[~ayushtkn], yes, that's the idea, to not give a false file not found.
That's the most important thing. 
After that, I'm not sure what the best approach is.
For now, I just started returning the easiest exception. 
Retrying after some time might be an option.
Any proposal for what to return in this case? 







[jira] [Commented] (HDFS-15112) RBF: do not return FileNotFoundException when a subcluster is unavailable

2020-01-12 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013762#comment-17013762
 ] 

Ayush Saxena commented on HDFS-15112:
-

Thanks [~elgoiri] for the report.
To confirm the intent: is the goal just to tell the client that the file 
isn't lost but that we are facing some cluster issue?
NoNamenodeException is fine for that, but I think it won't be retried. If the 
cluster issue is temporary or affects only one router, we may want the call to 
be retried on the same router or maybe on another router.
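
The retry idea could look roughly like the sketch below. The {{Router}} interface, the {{getWithFailover}} helper, and treating {{ConnectException}} as the retriable case are assumptions for illustration, not existing client code.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of retrying on the same router, then failing over. */
class RouterRetrySketch {

  /** Hypothetical view of one router: resolves a path or throws. */
  interface Router {
    String getFileInfo(String path) throws IOException;
  }

  /**
   * Tries each router up to attemptsPerRouter times. Only ConnectException
   * is treated as retriable here; any other IOException is propagated.
   */
  static String getWithFailover(List<Router> routers, String path,
      int attemptsPerRouter) throws IOException {
    IOException last = null;
    for (Router router : routers) {
      for (int i = 0; i < attemptsPerRouter; i++) {
        try {
          return router.getFileInfo(path);
        } catch (ConnectException e) {
          last = e; // temporary problem: retry, then move to next router
        }
      }
    }
    throw last != null ? last : new IOException("No routers configured");
  }

  public static void main(String[] args) throws IOException {
    Router broken = path -> {
      throw new ConnectException("subcluster unavailable");
    };
    Router healthy = path -> "status of " + path;
    System.out.println(
        getWithFailover(Arrays.asList(broken, healthy), "/data/a", 2));
  }
}
```

This only helps if the exception returned by the router is one the client recognizes as retriable, which is the concern raised above.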







[jira] [Commented] (HDFS-15112) RBF: do not return FileNotFoundException when a subcluster is unavailable

2020-01-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013352#comment-17013352
 ] 

Hadoop QA commented on HDFS-15112:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} | {color:red} HDFS-15112 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-15112 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12990623/HDFS-15112.000.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28645/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.









[jira] [Commented] (HDFS-15112) RBF: do not return FileNotFoundException when a subcluster is unavailable

2020-01-10 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013350#comment-17013350
 ] 

Íñigo Goiri commented on HDFS-15112:


[^HDFS-15112.000.patch] adds a work-in-progress solution.







[jira] [Commented] (HDFS-15112) RBF: do not return FileNotFoundException when a subcluster is unavailable

2020-01-10 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013319#comment-17013319
 ] 

Íñigo Goiri commented on HDFS-15112:


[^HDFS-15112.patch] shows a unit test with the issue.
The problem is that users may be told the file does not exist while it is 
just unavailable.
Any suggestion on what exception we should return in this case?
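
For readers without the patch at hand, the failure mode can be pictured with a toy model (this is not the attached unit test). The map-based "subclusters", both lookup helpers, and the plain {{IOException}} used for the unavailable case are assumptions made for illustration.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Toy model of a mount point spread over several subclusters. */
class HashAllLookupSketch {

  /** Problematic behavior: unreachable subclusters are silently skipped. */
  static String naiveLookup(Map<String, Set<String>> files,
      Set<String> down, String path) throws IOException {
    for (Map.Entry<String, Set<String>> ns : files.entrySet()) {
      if (!down.contains(ns.getKey()) && ns.getValue().contains(path)) {
        return ns.getKey();
      }
    }
    throw new FileNotFoundException(path);
  }

  /** Desired behavior: report "not found" only if every subcluster answered. */
  static String strictLookup(Map<String, Set<String>> files,
      Set<String> down, String path) throws IOException {
    boolean sawUnavailable = false;
    for (Map.Entry<String, Set<String>> ns : files.entrySet()) {
      if (down.contains(ns.getKey())) {
        sawUnavailable = true;
      } else if (ns.getValue().contains(path)) {
        return ns.getKey();
      }
    }
    if (sawUnavailable) {
      throw new IOException(
          "Cannot resolve " + path + ": a subcluster is unavailable");
    }
    throw new FileNotFoundException(path);
  }

  public static void main(String[] args) {
    Map<String, Set<String>> files = new LinkedHashMap<>();
    files.put("ns0", new HashSet<>(Arrays.asList("/data/a")));
    files.put("ns1", new HashSet<>(Arrays.asList("/data/b")));
    Set<String> down = Collections.singleton("ns1");
    try {
      naiveLookup(files, down, "/data/b");
    } catch (IOException e) {
      System.out.println("naive: " + e.getClass().getSimpleName());
    }
    try {
      strictLookup(files, down, "/data/b");
    } catch (IOException e) {
      System.out.println("strict: " + e.getMessage());
    }
  }
}
```

In the naive variant, a file that lives in the unreachable subcluster is reported as missing, which is exactly the false negative this issue is about.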

> RBF: do not return FileNotFoundException when a subcluster is unavailable 
> --
>
> Key: HDFS-15112
> URL: https://issues.apache.org/jira/browse/HDFS-15112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-15112.patch
>
>
> If we have a mount point using HASH_ALL across two subclusters and one of 
> them is down, we may return FileNotFoundException while the file is just in 
> the unavailable subcluster.
> We should not return FileNotFoundException but something that shows that the 
> subcluster is unavailable.


