[jira] [Resolved] (HDFS-16076) Avoid using slow DataNodes for reading by sorting locations

2021-06-23 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma resolved HDFS-16076.
-
Fix Version/s: 3.3.2
   3.4.0
   Resolution: Fixed

> Avoid using slow DataNodes for reading by sorting locations
> ---
>
> Key: HDFS-16076
> URL: https://issues.apache.org/jira/browse/HDFS-16076
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> After sorting, the expected location list will be: live -> slow -> stale -> 
> staleAndSlow -> entering_maintenance -> decommissioned. This reduces the 
> probability that slow nodes will be used for reading.
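For context, a minimal sketch of the ordering the description implies (the helper predicates are hypothetical placeholders; the actual change lives in the NameNode's sorting of block locations):

{code:java}
import java.util.Comparator;
import java.util.function.Predicate;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class LocationRankSketch {
  // Lower rank sorts first: live < slow < stale < stale-and-slow
  // < entering maintenance < decommissioned.
  static int rank(DatanodeInfo dn, boolean slow, boolean stale) {
    if (dn.isDecommissioned()) {
      return 5;
    } else if (dn.isEnteringMaintenance()) {
      return 4;
    } else if (stale && slow) {
      return 3;
    } else if (stale) {
      return 2;
    } else if (slow) {
      return 1;
    }
    return 0; // live
  }

  // isSlow/isStale stand in for the NameNode's slow-peer reports and
  // stale-heartbeat checks; they are placeholders here.
  static Comparator<DatanodeInfo> byLocationRank(
      Predicate<DatanodeInfo> isSlow, Predicate<DatanodeInfo> isStale) {
    return Comparator.comparingInt(
        dn -> rank(dn, isSlow.test(dn), isStale.test(dn)));
  }
}
{code}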






[jira] [Work logged] (HDFS-16076) Avoid using slow DataNodes for reading by sorting locations

2021-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16076?focusedWorklogId=614312&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614312
 ]

ASF GitHub Bot logged work on HDFS-16076:
-

Author: ASF GitHub Bot
Created on: 24/Jun/21 02:30
Start Date: 24/Jun/21 02:30
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3117:
URL: https://github.com/apache/hadoop/pull/3117#issuecomment-867285100


   > Merged it. Thanks for your contribution, @tomscut.
   
   Thanks @tasanuma .




Issue Time Tracking
---

Worklog Id: (was: 614312)
Time Spent: 2h 20m  (was: 2h 10m)

> Avoid using slow DataNodes for reading by sorting locations
> ---
>
> Key: HDFS-16076
> URL: https://issues.apache.org/jira/browse/HDFS-16076
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> After sorting, the expected location list will be: live -> slow -> stale -> 
> staleAndSlow -> entering_maintenance -> decommissioned. This reduces the 
> probability that slow nodes will be used for reading.






[jira] [Work logged] (HDFS-16076) Avoid using slow DataNodes for reading by sorting locations

2021-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16076?focusedWorklogId=614311&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614311
 ]

ASF GitHub Bot logged work on HDFS-16076:
-

Author: ASF GitHub Bot
Created on: 24/Jun/21 02:28
Start Date: 24/Jun/21 02:28
Worklog Time Spent: 10m 
  Work Description: tasanuma commented on pull request #3117:
URL: https://github.com/apache/hadoop/pull/3117#issuecomment-867284477


   Merged it. Thanks for your contribution, @tomscut.




Issue Time Tracking
---

Worklog Id: (was: 614311)
Time Spent: 2h 10m  (was: 2h)

> Avoid using slow DataNodes for reading by sorting locations
> ---
>
> Key: HDFS-16076
> URL: https://issues.apache.org/jira/browse/HDFS-16076
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> After sorting, the expected location list will be: live -> slow -> stale -> 
> staleAndSlow -> entering_maintenance -> decommissioned. This reduces the 
> probability that slow nodes will be used for reading.






[jira] [Work logged] (HDFS-16076) Avoid using slow DataNodes for reading by sorting locations

2021-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16076?focusedWorklogId=614310&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614310
 ]

ASF GitHub Bot logged work on HDFS-16076:
-

Author: ASF GitHub Bot
Created on: 24/Jun/21 02:27
Start Date: 24/Jun/21 02:27
Worklog Time Spent: 10m 
  Work Description: tasanuma merged pull request #3117:
URL: https://github.com/apache/hadoop/pull/3117


   




Issue Time Tracking
---

Worklog Id: (was: 614310)
Time Spent: 2h  (was: 1h 50m)

> Avoid using slow DataNodes for reading by sorting locations
> ---
>
> Key: HDFS-16076
> URL: https://issues.apache.org/jira/browse/HDFS-16076
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> After sorting, the expected location list will be: live -> slow -> stale -> 
> staleAndSlow -> entering_maintenance -> decommissioned. This reduces the 
> probability that slow nodes will be used for reading.






[jira] [Work logged] (HDFS-16086) Add volume information to datanode log for tracing

2021-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16086?focusedWorklogId=614300&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614300
 ]

ASF GitHub Bot logged work on HDFS-16086:
-

Author: ASF GitHub Bot
Created on: 24/Jun/21 01:27
Start Date: 24/Jun/21 01:27
Worklog Time Spent: 10m 
  Work Description: tomscut opened a new pull request #3136:
URL: https://github.com/apache/hadoop/pull/3136


   JIRA: [HDFS-16086](https://issues.apache.org/jira/browse/HDFS-16086)
   
   To keep track of the volume where a block is stored, we can add the volume 
information to the datanode log.
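A minimal sketch of the kind of log line this proposes; the log site and helper here are illustrative only, not the actual patch (on a real datanode the volume string would come from the replica's {{FsVolumeSpi}}):

{code:java}
import org.apache.hadoop.hdfs.protocol.ExtendedBlock;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class VolumeTracingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(VolumeTracingSketch.class);

  // "volume" is assumed to be the base path of the FsVolumeSpi holding
  // the replica, so the block can be traced back to a specific disk.
  static void logReceived(ExtendedBlock block, String srcAddress,
      String volume) {
    LOG.info("Received {} src: {} volume: {} of size {}",
        block, srcAddress, volume, block.getNumBytes());
  }
}
{code}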
   




Issue Time Tracking
---

Worklog Id: (was: 614300)
Remaining Estimate: 0h
Time Spent: 10m

> Add volume information to datanode log for tracing
> --
>
> Key: HDFS-16086
> URL: https://issues.apache.org/jira/browse/HDFS-16086
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> To keep track of the volume where a block is stored, we can add the volume 
> information to the datanode log.






[jira] [Updated] (HDFS-16086) Add volume information to datanode log for tracing

2021-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16086:
--
Labels: pull-request-available  (was: )

> Add volume information to datanode log for tracing
> --
>
> Key: HDFS-16086
> URL: https://issues.apache.org/jira/browse/HDFS-16086
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> To keep track of the volume where a block is stored, we can add the volume 
> information to the datanode log.






[jira] [Created] (HDFS-16086) Add volume information to datanode log for tracing

2021-06-23 Thread tomscut (Jira)
tomscut created HDFS-16086:
--

 Summary: Add volume information to datanode log for tracing
 Key: HDFS-16086
 URL: https://issues.apache.org/jira/browse/HDFS-16086
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: tomscut
Assignee: tomscut


To keep track of the volume where a block is stored, we can add the volume 
information to the datanode log.






[jira] [Commented] (HDFS-15659) Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster

2021-06-23 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368455#comment-17368455
 ] 

Hadoop QA commented on HDFS-15659:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 15m 1s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | | The patch appears to include 13 new or modified test files. |
|| || || || branch-3.2 Compile Tests || ||
| +1 | mvninstall | 28m 23s | | branch-3.2 passed |
| +1 | compile | 1m 0s | | branch-3.2 passed |
| +1 | checkstyle | 0m 50s | | branch-3.2 passed |
| +1 | mvnsite | 1m 7s | | branch-3.2 passed |
| +1 | shadedclient | 15m 4s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 55s | | branch-3.2 passed |
| 0 | spotbugs | 18m 52s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 2m 54s | | branch-3.2 passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 1m 5s | | the patch passed |
| +1 | compile | 0m 54s | | the patch passed |
| +1 | javac | 0m 54s | | the patch passed |
| +1 | checkstyle | 0m 44s | | the patch passed |
| +1 | mvnsite | 1m 1s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 14m 15s | | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 48s | | the patch passed |
| +1 | spotbugs | 2m 59s | | the patch passed |
|| || || || Other Tests || ||
| -1 | unit | 192m 35s | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/644/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 34s | | The patch does not generate ASF License warnings. |
| | | 280m 17s | | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMXBean |
| | hadoop.hdfs.server.namenode.TestFsck |

|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: 

[jira] [Commented] (HDFS-14099) Unknown frame descriptor when decompressing multiple frames in ZStandardDecompressor

2021-06-23 Thread Chenren Shao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368435#comment-17368435
 ] 

Chenren Shao commented on HDFS-14099:
-

Taking a deeper look at HADOOP-17096, I found that the fix only affects 
compression. I am not sure how it could impact the decompression issue that I 
encounter here.

[~xuzq_zander] when you did your test, which patch did you use: 
[https://patch-diff.githubusercontent.com/raw/apache/hadoop/pull/441.patch]
or [^HDFS-14099-trunk-003.patch]? In my previous test, we used the latter and 
still got the following error:

{code}
java.lang.InternalError: Unknown frame descriptor
at org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.inflateBytesDirect(Native Method)
at org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.decompress(ZStandardDecompressor.java:181)
at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:111)
{code}

> Unknown frame descriptor when decompressing multiple frames in 
> ZStandardDecompressor
> 
>
> Key: HDFS-14099
> URL: https://issues.apache.org/jira/browse/HDFS-14099
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Hadoop Version: hadoop-3.0.3
> Java Version: 1.8.0_144
>Reporter: xuzq
>Assignee: xuzq
>Priority: Major
> Attachments: HDFS-14099-trunk-001.patch, HDFS-14099-trunk-002.patch, 
> HDFS-14099-trunk-003.patch
>
>
> We need to use the ZSTD compression algorithm in Hadoop, so I wrote a simple 
> demo like this for testing.
> {code:java}
> // code placeholder
> while ((size = fsDataInputStream.read(bufferV2)) > 0 ) {
>   countSize += size;
>   if (countSize == 65536 * 8) {
> if(!isFinished) {
>   // finish a frame in zstd
>   cmpOut.finish();
>   isFinished = true;
> }
> fsDataOutputStream.flush();
> fsDataOutputStream.hflush();
>   }
>   if(isFinished) {
> LOG.info("Will resetState. N=" + n);
> // reset the stream and write again
> cmpOut.resetState();
> isFinished = false;
>   }
>   cmpOut.write(bufferV2, 0, size);
>   bufferV2 = new byte[5 * 1024 * 1024];
>   n++;
> }
> {code}
>  
> And I used "*hadoop fs -text*" to read this file, and it failed. The error is 
> below.
> {code:java}
> Exception in thread "main" java.lang.InternalError: Unknown frame descriptor
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.inflateBytesDirect(Native
>  Method)
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.decompress(ZStandardDecompressor.java:181)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:111)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105)
> at java.io.InputStream.read(InputStream.java:101)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:98)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:66)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:127)
> at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:101)
> at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:96)
> at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:331)
> at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:303)
> at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:285)
> at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:269)
> at 
> org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:119)
> at org.apache.hadoop.fs.shell.Command.run(Command.java:176)
> at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
> {code}
>  
> So I had to look at the code, including the JNI, and then found this bug.
> The *ZSTD_initDStream(stream)* method may be called twice in the same *Frame*.
> The first call is in *ZStandardDecompressor.c*:
> {code:java}
> if (size == 0) {
> (*env)->SetBooleanField(env, this, ZStandardDecompressor_finished, 
> JNI_TRUE);
> size_t result = dlsym_ZSTD_initDStream(stream);
> if (dlsym_ZSTD_isError(result)) {
> THROW(env, "java/lang/InternalError", 
> dlsym_ZSTD_getErrorName(result));
> return (jint) 0;
> }
> }
> {code}
> This call here is correct, but *finished* is no longer set back to false, even 
> if there is more data (a new frame) in *CompressedBuffer* or *UserBuffer* that 
> needs to be decompressed.
> The second call is made from *org.apache.hadoop.io.compress.DecompressorStream* 
> via *decompressor.reset()*, because *finished* is always true after 
> decompressing a *Frame*.
> {code:java}
> if (decompressor.finished()) {
>   // First see if there was any leftover 

[jira] [Resolved] (HDFS-16072) TestBlockRecovery fails consistently on Branch-2.10

2021-06-23 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein resolved HDFS-16072.
--
Fix Version/s: 2.10.2
   Resolution: Fixed

> TestBlockRecovery fails consistently on Branch-2.10
> ---
>
> Key: HDFS-16072
> URL: https://issues.apache.org/jira/browse/HDFS-16072
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Affects Versions: 2.10.1
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.10.2
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {{TestBlockRecovery}} fails on branch-2.10 consistently.
> I found that the failures have been reported in the Qbt-Reports since March 
> 2021, to say the least.
> {code:bash}
> [ERROR] Tests run: 20, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 
> 21.422 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery
> [ERROR] 
> testRaceBetweenReplicaRecoveryAndFinalizeBlock(org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery)
>   Time elapsed: 2.814 s  <<< ERROR!
> java.io.IOException: com.google.protobuf.ServiceException: 
> java.lang.IllegalThreadStateException
>   at 
> org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDatanodeReport(ClientNamenodeProtocolTranslatorPB.java:656)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:607)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy28.getDatanodeReport(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.datanodeReport(DFSClient.java:2132)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2699)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2743)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1723)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:905)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:517)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:476)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery.testRaceBetweenReplicaRecoveryAndFinalizeBlock(TestBlockRecovery.java:694)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:607)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: com.google.protobuf.ServiceException: 
> java.lang.IllegalThreadStateException
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:244)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>   at com.sun.proxy.$Proxy27.getDatanodeReport(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDatanodeReport(ClientNamenodeProtocolTranslatorPB.java:653)
>   ... 30 more
> Caused by: 

[jira] [Comment Edited] (HDFS-15659) Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster

2021-06-23 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368320#comment-17368320
 ] 

Ahmed Hussein edited comment on HDFS-15659 at 6/23/21, 4:33 PM:


Thanks [~Jim_Brennan].
 I submitted patches backporting the changes to branch-2.10, and branch-3.2.

*branch-3.2:*

Minor conflict in {{MiniDFSCluster}} due to imports

*branch-2.10*
 * Minor conflict in {{MiniDFSCluster}} due to imports
 * {{DFSConfigKeys.DFS_NAMENODE_REDUNDANCY_CONSIDERLOAD_KEY}} was named 
{{DFS_NAMENODE_REPLICATION_CONSIDERLOAD_KEY}} in branch-2.10
 * Other conflicts were due to the fact that the {{Striped}} feature is not 
implemented in branch-2.10. Hence, the unit tests do not exist in branch-2.10.

The failures reported by Yetus are not related to the changes applied to 
branch-2.10. As a matter of fact, TestBlockRecovery has already been fixed 
a couple of hours ago by HADOOP-17769.


was (Author: ahussein):
Thanks [~Jim_Brennan].
 I submitted patches backporting the changes to branch-2.10, and branch-3.2.

*branch-3.2:*

Minor conflict in {{MiniDFSCluster}} due to imports

*branch-2.10*
 * Minor conflict in {{MiniDFSCluster}} due to imports
 * Other conflicts were due to the fact that the {{Striped}} feature is not 
implemented in branch-2.10. Hence, the unit tests do not exist in branch-2.10.

The failures reported by Yetus are not related to the changes applied to 
branch-2.10. As a matter of fact, TestBlockRecovery has already been fixed 
a couple of hours ago by HADOOP-17769.

> Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster
> ---
>
> Key: HDFS-15659
> URL: https://issues.apache.org/jira/browse/HDFS-15659
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
> Attachments: HDFS-15659-branch-2.10.001.patch, 
> HDFS-15659-branch-3.2.001.patch
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> dfs.namenode.redundancy.considerLoad is true by default and it is causing 
> many test failures. Let's disable it in MiniDFSCluster.
> Originally reported by [~weichiu]: 
> https://github.com/apache/hadoop/pull/2410#pullrequestreview-51612
> {quote}
> I've certainly seen this option causing test failures in the past.
> Maybe we should turn it off by default in MiniDFSCluster, and only enable it 
> for specific tests.
> {quote}
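For reference, a minimal sketch of what a test that still wants load-aware placement would do once the MiniDFSCluster default is false (the config key is the trunk constant discussed above; cluster sizing and the try-with-resources use of MiniDFSCluster are illustrative):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

Configuration conf = new HdfsConfiguration();
// Re-enable load-aware placement for this specific test only.
conf.setBoolean(DFSConfigKeys.DFS_NAMENODE_REDUNDANCY_CONSIDERLOAD_KEY, true);
try (MiniDFSCluster cluster =
    new MiniDFSCluster.Builder(conf).numDataNodes(3).build()) {
  cluster.waitActive();
  // ... assertions that exercise load-based block placement ...
}
{code}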






[jira] [Comment Edited] (HDFS-15659) Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster

2021-06-23 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368320#comment-17368320
 ] 

Ahmed Hussein edited comment on HDFS-15659 at 6/23/21, 4:10 PM:


Thanks [~Jim_Brennan].
 I submitted patches backporting the changes to branch-2.10, and branch-3.2.

*branch-3.2:*

Minor conflict in {{MiniDFSCluster}} due to imports

*branch-2.10*
 * Minor conflict in {{MiniDFSCluster}} due to imports
 * Other conflicts were due to the fact that the {{Striped}} feature is not 
implemented in branch-2.10. Hence, the unit tests do not exist in branch-2.10.

The failures reported by Yetus are not related to the changes applied to 
branch-2.10. As a matter of fact, TestBlockRecovery has already been fixed 
a couple of hours ago by HADOOP-17769.


was (Author: ahussein):
Thanks [~Jim_Brennan].
I submitted patches backporting the changes to branch-2.10, and branch-3.2.

*branch-3.2:*

Minor conflict in {{MiniDFSCluster}} due to imports 

*branch-2.10*

* Minor conflict in {{MiniDFSCluster}} due to imports 
* Other conflicts were due to the fact that the {{Striped}} feature is not 
implemented in branch-2.10. Hence, the unit tests do not exist in branch-2.10.

> Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster
> ---
>
> Key: HDFS-15659
> URL: https://issues.apache.org/jira/browse/HDFS-15659
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
> Attachments: HDFS-15659-branch-2.10.001.patch, 
> HDFS-15659-branch-3.2.001.patch
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> dfs.namenode.redundancy.considerLoad is true by default and it is causing 
> many test failures. Let's disable it in MiniDFSCluster.
> Originally reported by [~weichiu]: 
> https://github.com/apache/hadoop/pull/2410#pullrequestreview-51612
> {quote}
> I've certainly seen this option causing test failures in the past.
> Maybe we should turn it off by default in MiniDFSCluster, and only enable it 
> for specific tests.
> {quote}






[jira] [Updated] (HDFS-15659) Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster

2021-06-23 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated HDFS-15659:
-
Attachment: HDFS-15659-branch-3.2.001.patch
Status: Patch Available  (was: In Progress)

Thanks [~Jim_Brennan].
I submitted patches backporting the changes to branch-2.10, and branch-3.2.

*branch-3.2:*

Minor conflict in {{MiniDFSCluster}} due to imports 

*branch-2.10*

* Minor conflict in {{MiniDFSCluster}} due to imports 
* Other conflicts were due to the fact that the {{Striped}} feature is not 
implemented in branch-2.10. Hence, the unit tests do not exist in branch-2.10.

> Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster
> ---
>
> Key: HDFS-15659
> URL: https://issues.apache.org/jira/browse/HDFS-15659
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
> Attachments: HDFS-15659-branch-2.10.001.patch, 
> HDFS-15659-branch-3.2.001.patch
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> dfs.namenode.redundancy.considerLoad is true by default and it is causing 
> many test failures. Let's disable it in MiniDFSCluster.
> Originally reported by [~weichiu]: 
> https://github.com/apache/hadoop/pull/2410#pullrequestreview-51612
> {quote}
> I've certainly seen this option causing test failures in the past.
> Maybe we should turn it off by default in MiniDFSCluster, and only enable it 
> for specific tests.
> {quote}






[jira] [Updated] (HDFS-15659) Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster

2021-06-23 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated HDFS-15659:
-
Status: In Progress  (was: Patch Available)

> Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster
> ---
>
> Key: HDFS-15659
> URL: https://issues.apache.org/jira/browse/HDFS-15659
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
> Attachments: HDFS-15659-branch-2.10.001.patch
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> dfs.namenode.redundancy.considerLoad is true by default and it is causing 
> many test failures. Let's disable it in MiniDFSCluster.
> Originally reported by [~weichiu]: 
> https://github.com/apache/hadoop/pull/2410#pullrequestreview-51612
> {quote}
> I've certainly seen this option causing test failures in the past.
> Maybe we should turn it off by default in MiniDFSCluster, and only enable it 
> for specific tests.
> {quote}






[jira] [Work logged] (HDFS-16039) RBF: Some indicators of RBFMetrics count inaccurately

2021-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16039?focusedWorklogId=614063&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614063
 ]

ASF GitHub Bot logged work on HDFS-16039:
-

Author: ASF GitHub Bot
Created on: 23/Jun/21 15:13
Start Date: 23/Jun/21 15:13
Worklog Time Spent: 10m 
  Work Description: goiri commented on a change in pull request #3086:
URL: https://github.com/apache/hadoop/pull/3086#discussion_r657203156



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/metrics/TestRBFMetrics.java
##
@@ -382,4 +366,56 @@ private void testCapacity(FederationMBean bean) throws 
IOException {
 assertNotEquals(availableCapacity,
 BigInteger.valueOf(bean.getRemainingCapacity()));
   }
+
+  @Test
+  public void testDatanodeNumMetrics()
+  throws Exception {
+Configuration routerConf = new RouterConfigBuilder()
+.metrics()
+.http()
+.stateStore()
+.rpc()
+.build();
+MiniRouterDFSCluster cluster = new MiniRouterDFSCluster(false, 1);
+cluster.setNumDatanodesPerNameservice(0);
+cluster.addNamenodeOverrides(routerConf);
+cluster.startCluster();
+routerConf.setTimeDuration(
+RBFConfigKeys.DN_REPORT_CACHE_EXPIRE, 1, TimeUnit.SECONDS);
+cluster.addRouterOverrides(routerConf);
+cluster.startRouters();
+Router router = cluster.getRandomRouter().getRouter();
+// Register and verify all NNs with all routers
+cluster.registerNamenodes();
+cluster.waitNamenodeRegistration();
+RouterRpcServer rpcServer = router.getRpcServer();
+RBFMetrics rbfMetrics = router.getMetrics();
+// Create mock dn
+DatanodeInfo[] dNInfo = new DatanodeInfo[4];
+DatanodeInfo datanodeInfo = new DatanodeInfo.DatanodeInfoBuilder().build();
+datanodeInfo.setDecommissioned();
+dNInfo[0] = datanodeInfo;
+datanodeInfo = new DatanodeInfo.DatanodeInfoBuilder().build();
+datanodeInfo.setInMaintenance();
+dNInfo[1] = datanodeInfo;
+datanodeInfo = new DatanodeInfo.DatanodeInfoBuilder().build();
+datanodeInfo.startMaintenance();
+dNInfo[2] = datanodeInfo;
+datanodeInfo = new DatanodeInfo.DatanodeInfoBuilder().build();
+datanodeInfo.startDecommission();
+dNInfo[3] = datanodeInfo;
+
+rpcServer.getDnCache().put(HdfsConstants.DatanodeReportType.LIVE, dNInfo);

Review comment:
   This is a little unconventional.
   You should mark the getter as VisibleForTesting.
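   Concretely, the suggested getter might look like the sketch below; the cache's generic type is inferred from the test code above and is an assumption:

   {code:java}
   import com.google.common.annotations.VisibleForTesting;
   import com.google.common.cache.LoadingCache;
   import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
   import org.apache.hadoop.hdfs.protocol.HdfsConstants.DatanodeReportType;

   @VisibleForTesting
   LoadingCache<DatanodeReportType, DatanodeInfo[]> getDnCache() {
     return dnCache;
   }
   {code}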

##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/metrics/RBFMetrics.java
##
@@ -164,13 +163,13 @@ public RBFMetrics(Router router) throws IOException {
   RouterStore.class);
 }
 
-// Initialize the cache for the DN reports
 Configuration conf = router.getConfig();
-this.timeOut = conf.getTimeDuration(RBFConfigKeys.DN_REPORT_TIME_OUT,
-RBFConfigKeys.DN_REPORT_TIME_OUT_MS_DEFAULT, TimeUnit.MILLISECONDS);
 this.topTokenRealOwners = conf.getInt(
 RBFConfigKeys.DFS_ROUTER_METRICS_TOP_NUM_TOKEN_OWNERS_KEY,
 RBFConfigKeys.DFS_ROUTER_METRICS_TOP_NUM_TOKEN_OWNERS_KEY_DEFAULT);
+
+// Use RpcServer dnCache
+this.dnCache = this.router.getRpcServer().getDnCache();

Review comment:
Not much benefit in getting this and setting it into an attribute.
We can just do the get each time we need to access it.

##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterRpc.java
##
@@ -1757,7 +1757,7 @@ public void testRBFMetricsMethodsRelayOnStateStore() {
 // These methods relays on
 // {@link RBFMetrics#getActiveNamenodeRegistration()}
 assertEquals("{}", metrics.getNameservices());
-assertEquals(0, metrics.getNumLiveNodes());
+assertEquals(NUM_DNS * 2, metrics.getNumLiveNodes());

Review comment:
Why is this like this now?






Issue Time Tracking
---

Worklog Id: (was: 614063)
Time Spent: 2h 10m  (was: 2h)

> RBF:  Some indicators of RBFMetrics count inaccurately
> --
>
> Key: HDFS-16039
> URL: https://issues.apache.org/jira/browse/HDFS-16039
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Xiangyi Zhu
>Assignee: Xiangyi Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> 

[jira] [Work logged] (HDFS-16085) Move the getPermissionChecker out of the read lock

2021-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16085?focusedWorklogId=614055&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614055
 ]

ASF GitHub Bot logged work on HDFS-16085:
-

Author: ASF GitHub Bot
Created on: 23/Jun/21 14:59
Start Date: 23/Jun/21 14:59
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3134:
URL: https://github.com/apache/hadoop/pull/3134#issuecomment-866913600


   > +1
   
   Thanks @ayushtkn for your review.




Issue Time Tracking
---

Worklog Id: (was: 614055)
Time Spent: 40m  (was: 0.5h)

> Move the getPermissionChecker out of the read lock
> --
>
> Key: HDFS-16085
> URL: https://issues.apache.org/jira/browse/HDFS-16085
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Move the getPermissionChecker out of the read lock in 
> NamenodeFsck#getBlockLocations() since the operation does not need to be 
> locked.
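A minimal before/after sketch of the reordering; {{fsn}} stands for the namesystem here and {{getBlockLocations0}} is a placeholder for the lookup that actually needs the lock:

{code:java}
// Before: the permission checker is constructed while holding the read lock.
fsn.readLock();
try {
  FSPermissionChecker pc = fsn.getPermissionChecker();
  blocks = getBlockLocations0(pc, path);
} finally {
  fsn.readUnlock();
}

// After: construct the checker first; only the lookup itself is locked.
FSPermissionChecker pc = fsn.getPermissionChecker();
fsn.readLock();
try {
  blocks = getBlockLocations0(pc, path);
} finally {
  fsn.readUnlock();
}
{code}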






[jira] [Commented] (HDFS-13295) Namenode doesn't leave safemode if dfs.namenode.safemode.replication.min set < dfs.namenode.replication.min

2021-06-23 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368251#comment-17368251
 ] 

Hadoop QA commented on HDFS-13295:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 1m 36s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 19m 31s | | trunk passed |
| +1 | compile | 1m 23s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | compile | 1m 16s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | checkstyle | 1m 9s | | trunk passed |
| +1 | mvnsite | 1m 23s | | trunk passed |
| +1 | shadedclient | 15m 36s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 56s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc | 1m 28s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| 0 | spotbugs | 21m 5s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 3m 7s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 1m 15s | | the patch passed |
| +1 | compile | 1m 13s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javac | 1m 13s | | the patch passed |
| +1 | compile | 1m 8s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | javac | 1m 8s | | the patch passed |
| -0 | checkstyle | 1m 0s | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/643/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | hadoop-hdfs-project/hadoop-hdfs: The patch generated 41 new + 272 unchanged - 0 fixed = 313 total (was 272) |
| +1 | mvnsite | 1m 12s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 44s | | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 51s | | the 

[jira] [Commented] (HDFS-14099) Unknown frame descriptor when decompressing multiple frames in ZStandardDecompressor

2021-06-23 Thread Chenren Shao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368231#comment-17368231
 ] 

Chenren Shao commented on HDFS-14099:
-

Thank you, [~xuzq_zander] and [~weichiu]. As you suspected, I tried the patch 
on top of 3.2.1, so it is very likely that HADOOP-17096 was the issue. I 
will apply the patch for HADOOP-17096 and try again.

> Unknown frame descriptor when decompressing multiple frames in 
> ZStandardDecompressor
> 
>
> Key: HDFS-14099
> URL: https://issues.apache.org/jira/browse/HDFS-14099
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Hadoop Version: hadoop-3.0.3
> Java Version: 1.8.0_144
>Reporter: xuzq
>Assignee: xuzq
>Priority: Major
> Attachments: HDFS-14099-trunk-001.patch, HDFS-14099-trunk-002.patch, 
> HDFS-14099-trunk-003.patch
>
>
> We need to use the ZSTD compression algorithm in Hadoop, so I wrote a simple 
> demo like this for testing.
> {code:java}
> // code placeholder
> while ((size = fsDataInputStream.read(bufferV2)) > 0 ) {
>   countSize += size;
>   if (countSize == 65536 * 8) {
> if(!isFinished) {
>   // finish a frame in zstd
>   cmpOut.finish();
>   isFinished = true;
> }
> fsDataOutputStream.flush();
> fsDataOutputStream.hflush();
>   }
>   if(isFinished) {
> LOG.info("Will resetState. N=" + n);
> // reset the stream and write again
> cmpOut.resetState();
> isFinished = false;
>   }
>   cmpOut.write(bufferV2, 0, size);
>   bufferV2 = new byte[5 * 1024 * 1024];
>   n++;
> }
> {code}
>  
> And I used "*hadoop fs -text*" to read this file, and it failed. The error is 
> below.
> {code:java}
> Exception in thread "main" java.lang.InternalError: Unknown frame descriptor
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.inflateBytesDirect(Native
>  Method)
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.decompress(ZStandardDecompressor.java:181)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:111)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105)
> at java.io.InputStream.read(InputStream.java:101)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:98)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:66)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:127)
> at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:101)
> at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:96)
> at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:331)
> at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:303)
> at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:285)
> at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:269)
> at 
> org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:119)
> at org.apache.hadoop.fs.shell.Command.run(Command.java:176)
> at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
> {code}
>  
> So I had to look at the code, including the JNI, and then found this bug.
> The *ZSTD_initDStream(stream)* method may be called twice in the same *Frame*.
> The first call is in *ZStandardDecompressor.c*:
> {code:java}
> if (size == 0) {
> (*env)->SetBooleanField(env, this, ZStandardDecompressor_finished, 
> JNI_TRUE);
> size_t result = dlsym_ZSTD_initDStream(stream);
> if (dlsym_ZSTD_isError(result)) {
> THROW(env, "java/lang/InternalError", 
> dlsym_ZSTD_getErrorName(result));
> return (jint) 0;
> }
> }
> {code}
> This call here is correct, but *finished* is no longer set back to false, even 
> if there is more data (a new frame) in *CompressedBuffer* or *UserBuffer* that 
> needs to be decompressed.
> The second call is made from *org.apache.hadoop.io.compress.DecompressorStream* 
> via *decompressor.reset()*, because *finished* is always true after 
> decompressing a *Frame*.
> {code:java}
> if (decompressor.finished()) {
>   // First see if there was any leftover buffered input from previous
>   // stream; if not, attempt to refill buffer.  If refill -> EOF, we're
>   // all done; else reset, fix up input buffer, and get ready for next
>   // concatenated substream/"member".
>   int nRemaining = decompressor.getRemaining();
>   if (nRemaining == 0) {
> int m = getCompressedData();
> if (m == -1) {
>   // apparently the previous end-of-stream was also end-of-file:
>   // return success, as if we had never called getCompressedData()
>   eof = true;
>   return -1;
> }
> decompressor.reset();
>

[jira] [Commented] (HDFS-14099) Unknown frame descriptor when decompressing multiple frames in ZStandardDecompressor

2021-06-23 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368103#comment-17368103
 ] 

Wei-Chiu Chuang commented on HDFS-14099:


It might have been fixed by HADOOP-17096.
Can you apply the patch attached there, or use 3.2.2 or 3.3.1 instead?

> Unknown frame descriptor when decompressing multiple frames in 
> ZStandardDecompressor
> 
>
> Key: HDFS-14099
> URL: https://issues.apache.org/jira/browse/HDFS-14099
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Hadoop Version: hadoop-3.0.3
> Java Version: 1.8.0_144
>Reporter: xuzq
>Assignee: xuzq
>Priority: Major
> Attachments: HDFS-14099-trunk-001.patch, HDFS-14099-trunk-002.patch, 
> HDFS-14099-trunk-003.patch
>
>
> We need to use the ZSTD compression algorithm in Hadoop, so I wrote a simple 
> demo like this for testing.
> {code:java}
> // code placeholder
> while ((size = fsDataInputStream.read(bufferV2)) > 0 ) {
>   countSize += size;
>   if (countSize == 65536 * 8) {
> if(!isFinished) {
>   // finish a frame in zstd
>   cmpOut.finish();
>   isFinished = true;
> }
> fsDataOutputStream.flush();
> fsDataOutputStream.hflush();
>   }
>   if(isFinished) {
> LOG.info("Will resetState. N=" + n);
> // reset the stream and write again
> cmpOut.resetState();
> isFinished = false;
>   }
>   cmpOut.write(bufferV2, 0, size);
>   bufferV2 = new byte[5 * 1024 * 1024];
>   n++;
> }
> {code}
>  
> And I used "*hadoop fs -text*" to read this file, and it failed. The error is 
> below.
> {code:java}
> Exception in thread "main" java.lang.InternalError: Unknown frame descriptor
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.inflateBytesDirect(Native
>  Method)
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.decompress(ZStandardDecompressor.java:181)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:111)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105)
> at java.io.InputStream.read(InputStream.java:101)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:98)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:66)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:127)
> at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:101)
> at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:96)
> at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:331)
> at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:303)
> at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:285)
> at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:269)
> at 
> org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:119)
> at org.apache.hadoop.fs.shell.Command.run(Command.java:176)
> at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
> {code}
>  
> So I had to look at the code, including the JNI, and then found this bug.
> The *ZSTD_initDStream(stream)* method may be called twice in the same *Frame*.
> The first call is in *ZStandardDecompressor.c*:
> {code:java}
> if (size == 0) {
> (*env)->SetBooleanField(env, this, ZStandardDecompressor_finished, 
> JNI_TRUE);
> size_t result = dlsym_ZSTD_initDStream(stream);
> if (dlsym_ZSTD_isError(result)) {
> THROW(env, "java/lang/InternalError", 
> dlsym_ZSTD_getErrorName(result));
> return (jint) 0;
> }
> }
> {code}
> This call here is correct, but *finished* is no longer set back to false, even 
> if there is more data (a new frame) in *CompressedBuffer* or *UserBuffer* that 
> needs to be decompressed.
> The second call is made from *org.apache.hadoop.io.compress.DecompressorStream* 
> via *decompressor.reset()*, because *finished* is always true after 
> decompressing a *Frame*.
> {code:java}
> if (decompressor.finished()) {
>   // First see if there was any leftover buffered input from previous
>   // stream; if not, attempt to refill buffer.  If refill -> EOF, we're
>   // all done; else reset, fix up input buffer, and get ready for next
>   // concatenated substream/"member".
>   int nRemaining = decompressor.getRemaining();
>   if (nRemaining == 0) {
> int m = getCompressedData();
> if (m == -1) {
>   // apparently the previous end-of-stream was also end-of-file:
>   // return success, as if we had never called getCompressedData()
>   eof = true;
>   return -1;
> }
> decompressor.reset();
> decompressor.setInput(buffer, 0, m);
> lastBytesSent = m;
>   } else {
> // looks 

[jira] [Commented] (HDFS-14099) Unknown frame descriptor when decompressing multiple frames in ZStandardDecompressor

2021-06-23 Thread xuzq (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368088#comment-17368088
 ] 

xuzq commented on HDFS-14099:
-

Thanks [~cshao239].

I tested the output.zst with `hadoop fs -text output.zst` and it can be 
successfully decompressed; the content looks like "key": "value1496976".

What problem happened when you applied it?
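For anyone reproducing this, a minimal sketch that writes two concatenated zstd frames and reads them back through the codec (it assumes a native Hadoop build with zstd support; the bug made the second frame fail with "Unknown frame descriptor"):

{code:java}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CompressionInputStream;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.io.compress.ZStandardCodec;

public class MultiFrameZstdRepro {
  public static void main(String[] args) throws Exception {
    ZStandardCodec codec = new ZStandardCodec();
    codec.setConf(new Configuration());

    ByteArrayOutputStream raw = new ByteArrayOutputStream();
    CompressionOutputStream out = codec.createOutputStream(raw);
    out.write("frame-one ".getBytes(StandardCharsets.UTF_8));
    out.finish();      // ends the first zstd frame
    out.resetState();  // starts a second frame in the same stream
    out.write("frame-two".getBytes(StandardCharsets.UTF_8));
    out.close();

    // Decompression must consume both frames and print "frame-one frame-two".
    CompressionInputStream in =
        codec.createInputStream(new ByteArrayInputStream(raw.toByteArray()));
    IOUtils.copyBytes(in, System.out, 4096, true);
  }
}
{code}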

> Unknown frame descriptor when decompressing multiple frames in 
> ZStandardDecompressor
> 
>
> Key: HDFS-14099
> URL: https://issues.apache.org/jira/browse/HDFS-14099
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Hadoop Version: hadoop-3.0.3
> Java Version: 1.8.0_144
>Reporter: xuzq
>Assignee: xuzq
>Priority: Major
> Attachments: HDFS-14099-trunk-001.patch, HDFS-14099-trunk-002.patch, 
> HDFS-14099-trunk-003.patch
>
>
> We need to use the ZSTD compression algorithm in Hadoop, so I wrote a simple 
> demo like this for testing.
> {code:java}
> // code placeholder
> while ((size = fsDataInputStream.read(bufferV2)) > 0 ) {
>   countSize += size;
>   if (countSize == 65536 * 8) {
> if(!isFinished) {
>   // finish a frame in zstd
>   cmpOut.finish();
>   isFinished = true;
> }
> fsDataOutputStream.flush();
> fsDataOutputStream.hflush();
>   }
>   if(isFinished) {
> LOG.info("Will resetState. N=" + n);
> // reset the stream and write again
> cmpOut.resetState();
> isFinished = false;
>   }
>   cmpOut.write(bufferV2, 0, size);
>   bufferV2 = new byte[5 * 1024 * 1024];
>   n++;
> }
> {code}
>  
> And I used "*hadoop fs -text*" to read this file, and it failed. The error is 
> below.
> {code:java}
> Exception in thread "main" java.lang.InternalError: Unknown frame descriptor
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.inflateBytesDirect(Native
>  Method)
> at 
> org.apache.hadoop.io.compress.zstd.ZStandardDecompressor.decompress(ZStandardDecompressor.java:181)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:111)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105)
> at java.io.InputStream.read(InputStream.java:101)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:98)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:66)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:127)
> at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:101)
> at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:96)
> at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:331)
> at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:303)
> at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:285)
> at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:269)
> at 
> org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:119)
> at org.apache.hadoop.fs.shell.Command.run(Command.java:176)
> at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
> {code}
>  
> So I had to look into the code, including the JNI part, and found this bug.
> The *ZSTD_initDStream(stream)* method may be called twice in the same *Frame*.
> The first call is in *ZStandardDecompressor.c*:
> {code:java}
> if (size == 0) {
> (*env)->SetBooleanField(env, this, ZStandardDecompressor_finished, 
> JNI_TRUE);
> size_t result = dlsym_ZSTD_initDStream(stream);
> if (dlsym_ZSTD_isError(result)) {
> THROW(env, "java/lang/InternalError", 
> dlsym_ZSTD_getErrorName(result));
> return (jint) 0;
> }
> }
> {code}
> This call is correct, but *Finished* is never set back to false, even if 
> there is still data (a new frame) in *CompressedBuffer* or *UserBuffer* that 
> needs to be decompressed.
> The second call happens in *org.apache.hadoop.io.compress.DecompressorStream* 
> via *decompressor.reset()*, because *Finished* is always true after a *Frame* 
> has been decompressed.
> {code:java}
> if (decompressor.finished()) {
>   // First see if there was any leftover buffered input from previous
>   // stream; if not, attempt to refill buffer.  If refill -> EOF, we're
>   // all done; else reset, fix up input buffer, and get ready for next
>   // concatenated substream/"member".
>   int nRemaining = decompressor.getRemaining();
>   if (nRemaining == 0) {
> int m = getCompressedData();
> if (m == -1) {
>   // apparently the previous end-of-stream was also end-of-file:
>   // return success, as if we had never called getCompressedData()
>   eof = true;
>   return -1;
> }
> decompressor.reset();
> 
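For readers following the thread: the gist of the reporter's analysis is that 
the decompressor should only stay "finished" when no buffered input remains. A 
hedged illustration of that condition (the helper name is hypothetical, not 
taken from the committed patch):

{code:java}
// Hypothetical helper (illustration only, not the committed patch):
// after a zstd frame ends, report "finished" only when neither the
// compressed buffer nor the user-supplied buffer holds more input;
// otherwise a new frame follows and decompression must continue.
static boolean frameStreamFinished(int remainingCompressed,
                                   int remainingUser) {
  return remainingCompressed == 0 && remainingUser == 0;
}
{code}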

[jira] [Commented] (HDFS-16081) List a large directory, the client waits for a long time

2021-06-23 Thread lei w (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368075#comment-17368075
 ] 

lei w commented on HDFS-16081:
--

OK, thanks for the reply, [~ste...@apache.org].

> List a large directory, the client waits for a long time
> 
>
> Key: HDFS-16081
> URL: https://issues.apache.org/jira/browse/HDFS-16081
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: lei w
>Priority: Minor
>
> When we list a large directory, we have to wait a long time. This is 
> because the NameNode only returns dfs.ls.limit files at a time, and the 
> client then iteratively fetches the remaining files. But in many scenarios 
> we only need to know part of the files in the current directory, process 
> that part of the files, and fetch the rest afterwards. So could we add a 
> limit on the number of files and return to the client once the specified 
> number of files has been obtained, or have the NameNode return files based 
> on lock hold time instead of relying on a configuration alone?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16081) List a large directory, the client waits for a long time

2021-06-23 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368067#comment-17368067
 ] 

Steve Loughran commented on HDFS-16081:
---

FWIW I do like the idea of being able to control page size in listings, but 
this would have to be part of a successor to HDFS-13616, which wants to 
coalesce multiple dir listings. It'd be a new API call which would return a 
List<RemoteIterator<FileStatus>>; each iterator being for one directory.
  
Big piece of work, for which we'd need volunteers :)
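A rough sketch of the API shape being floated here, purely for illustration 
(the interface and method names are hypothetical, not an agreed design):

{code:java}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

// Hypothetical successor API (sketch only): list several directories in
// one call, returning one paged iterator per input directory.
interface BatchedDirectoryLister {
  List<RemoteIterator<FileStatus>> listStatusIterators(List<Path> dirs)
      throws IOException;
}
{code}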


> List a large directory, the client waits for a long time
> 
>
> Key: HDFS-16081
> URL: https://issues.apache.org/jira/browse/HDFS-16081
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: lei w
>Priority: Minor
>
> When we list a large directory, we have to wait a long time. This is 
> because the NameNode only returns dfs.ls.limit files at a time, and the 
> client then iteratively fetches the remaining files. But in many scenarios 
> we only need to know part of the files in the current directory, process 
> that part of the files, and fetch the rest afterwards. So could we add a 
> limit on the number of files and return to the client once the specified 
> number of files has been obtained, or have the NameNode return files based 
> on lock hold time instead of relying on a configuration alone?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16081) List a large directory, the client waits for a long time

2021-06-23 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368061#comment-17368061
 ] 

Steve Loughran commented on HDFS-16081:
---

I mean {{FileSystem.listStatusIterator(final Path p)}}, which returns a 
RemoteIterator<FileStatus>; if you follow the DFSClient chain, it ends up 
doing paged listings of size <= "dfs.ls.limit", whose default is 1000.

Use this API call and the first page fetched will only be 1000 objects.

{code}
// create iterator. on HDFS will block for the first 1000 entries only
RemoteIterator<FileStatus> it = fs.listStatusIterator(new Path("/tmp"));
while (it.hasNext()) {
  FileStatus st = it.next();
  ...
}
{code}



> List a large directory, the client waits for a long time
> 
>
> Key: HDFS-16081
> URL: https://issues.apache.org/jira/browse/HDFS-16081
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: lei w
>Priority: Minor
>
> When we list a large directory, we have to wait a long time. This is 
> because the NameNode only returns dfs.ls.limit files at a time, and the 
> client then iteratively fetches the remaining files. But in many scenarios 
> we only need to know part of the files in the current directory, process 
> that part of the files, and fetch the rest afterwards. So could we add a 
> limit on the number of files and return to the client once the specified 
> number of files has been obtained, or have the NameNode return files based 
> on lock hold time instead of relying on a configuration alone?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16083) Forbid Observer NameNode trigger active namenode log roll

2021-06-23 Thread lei w (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368017#comment-17368017
 ] 

lei w commented on HDFS-16083:
--

Thanks for the reply, [~LiJinglun]. I will add more information and unit tests later.

> Forbid Observer NameNode trigger  active namenode log roll
> --
>
> Key: HDFS-16083
> URL: https://issues.apache.org/jira/browse/HDFS-16083
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: lei w
>Assignee: lei w
>Priority: Minor
> Attachments: HDFS-16083.001.patch, HDFS-16083.002.patch
>
>
> When the Observer NameNode is enabled in the cluster, the Active NameNode 
> receives rollEditLog RPC requests from both the Standby NameNode and the 
> Observer NameNode within a short period. The Observer NameNode's rollEditLog 
> request is redundant, so should we prohibit the Observer NameNode from 
> triggering rollEditLog?
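A hedged sketch of the kind of guard being discussed (the helper below is 
hypothetical, not taken from the attached patches):

{code:java}
import org.apache.hadoop.ha.HAServiceProtocol.HAServiceState;

// Hypothetical guard (illustration only, not the attached patches):
// only a non-observer standby should ask the active NameNode to roll
// its edit log; an observer tails edits too but can skip the redundant RPC.
class LogRollGuard {
  static boolean shouldTriggerActiveLogRoll(HAServiceState state) {
    return state != HAServiceState.OBSERVER;
  }
}
{code}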



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16083) Forbid Observer NameNode trigger active namenode log roll

2021-06-23 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368012#comment-17368012
 ] 

Jinglun commented on HDFS-16083:


Hi [~lei w], thanks for your report! The description makes sense to me. I had a 
quick look at rollEdit, and the redundant roll edit does seem to exist. Could 
you add some logs from the active NameNode showing that it actually rolls edits 
more frequently than configured in 'dfs.ha.log-roll.period'? Also, we need a 
unit test in the patch to make it solid.

> Forbid Observer NameNode trigger  active namenode log roll
> --
>
> Key: HDFS-16083
> URL: https://issues.apache.org/jira/browse/HDFS-16083
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: lei w
>Assignee: lei w
>Priority: Minor
> Attachments: HDFS-16083.001.patch, HDFS-16083.002.patch
>
>
> When the Observer NameNode is enabled in the cluster, the Active NameNode 
> receives rollEditLog RPC requests from both the Standby NameNode and the 
> Observer NameNode within a short period. The Observer NameNode's rollEditLog 
> request is redundant, so should we prohibit the Observer NameNode from 
> triggering rollEditLog?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16083) Forbid Observer NameNode trigger active namenode log roll

2021-06-23 Thread Jinglun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun reassigned HDFS-16083:
--

Assignee: lei w

> Forbid Observer NameNode trigger  active namenode log roll
> --
>
> Key: HDFS-16083
> URL: https://issues.apache.org/jira/browse/HDFS-16083
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: lei w
>Assignee: lei w
>Priority: Minor
> Attachments: HDFS-16083.001.patch, HDFS-16083.002.patch
>
>
> When the Observer NameNode is enabled in the cluster, the Active NameNode 
> receives rollEditLog RPC requests from both the Standby NameNode and the 
> Observer NameNode within a short period. The Observer NameNode's rollEditLog 
> request is redundant, so should we prohibit the Observer NameNode from 
> triggering rollEditLog?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13295) Namenode doesn't leave safemode if dfs.namenode.safemode.replication.min set < dfs.namenode.replication.min

2021-06-23 Thread William Montaz (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368002#comment-17368002
 ] 

William Montaz commented on HDFS-13295:
---

Hello, I'm reviving this ticket as the error is still present in version 3.3.0.

During the standby loading phase, a block can show up with numStorage > 
minReplication. In that case, the value of minReplication is passed to 
bmSafeMode.incrementSafeBlockCount.

Since minReplication can be higher than safeReplication, the block is never 
accounted as safe while the total is still incremented, so the standby never 
leaves safe mode.
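To restate the mismatch as code, a hedged toy model (the helper and its names 
are hypothetical, not the actual BlockManagerSafeMode logic):

{code:java}
// Toy model (illustration only): safeReplication corresponds to
// dfs.namenode.safemode.replication.min and minReplication to
// dfs.namenode.replication.min.
class SafeModeToyModel {
  static boolean countedAsSafe(int numStorages, int minReplication,
                               int safeReplication) {
    // Loading code effectively reports min(numStorages, minReplication),
    // while the safe-mode check expects exactly safeReplication; with
    // minReplication=2 and safeReplication=1 the check is false forever,
    // so the safe block count never catches up with the total.
    int reported = Math.min(numStorages, minReplication);
    return reported == safeReplication;
  }
}
{code}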

 

> Namenode doesn't leave safemode if dfs.namenode.safemode.replication.min set 
> < dfs.namenode.replication.min
> ---
>
> Key: HDFS-13295
> URL: https://issues.apache.org/jira/browse/HDFS-13295
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
> Environment: CDH 5.11 with HDFS-8716 backported.
> dfs.namenode.replication.min=2
> dfs.namenode.safemode.replication.min=1
>  
>Reporter: Nicolas Fraison
>Assignee: Nicolas Fraison
>Priority: Major
> Attachments: HDFS-13295.1.patch, HDFS-13295.2.patch, HDFS-13295.patch
>
>
> When we set dfs.namenode.safemode.replication.min < 
> dfs.namenode.replication.min (from the HDFS-8716 patch), the number of 
> replicas for which the safe block count is increased must be equal to 
> dfs.namenode.safemode.replication.min in 
> `FSNamesystem.incrementSafeBlockCount`.
> When reading modifications from edits, the replica number for new blocks is 
> set to min(numNodes, dfs.namenode.replication.min) in 
> BlockManager.completeBlock, which is greater than 
> dfs.namenode.safemode.replication.min.
> Because of that, the safe block count never reaches the number of available 
> blocks and the NameNode doesn't leave safe mode automatically.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16076) Avoid using slow DataNodes for reading by sorting locations

2021-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16076?focusedWorklogId=613913=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613913
 ]

ASF GitHub Bot logged work on HDFS-16076:
-

Author: ASF GitHub Bot
Created on: 23/Jun/21 09:29
Start Date: 23/Jun/21 09:29
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3117:
URL: https://github.com/apache/hadoop/pull/3117#issuecomment-866681953


   Thanks @tasanuma for your second review. These failed UTs work fine locally.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 613913)
Time Spent: 1h 50m  (was: 1h 40m)

> Avoid using slow DataNodes for reading by sorting locations
> ---
>
> Key: HDFS-16076
> URL: https://issues.apache.org/jira/browse/HDFS-16076
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> After sorting, the expected location list will be: live -> slow -> stale -> 
> staleAndSlow -> entering_maintenance -> decommissioned. This reduces the 
> probability that slow nodes will be used for reading.
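As an aside for readers of this worklog, a hedged illustration of the ordering 
idea (the class and enum names are invented for the example; the actual patch 
sorts DatanodeInfo instances):

{code:java}
import java.util.Arrays;

// Illustration only (invented names; the real patch sorts DatanodeInfo):
// enum declaration order encodes the read preference, and the natural
// ordering of enums (by ordinal) sorts the most readable locations first.
class LocationOrder {
  enum State { LIVE, SLOW, STALE, STALE_AND_SLOW,
               ENTERING_MAINTENANCE, DECOMMISSIONED }

  public static void main(String[] args) {
    State[] locs = { State.DECOMMISSIONED, State.SLOW, State.LIVE };
    Arrays.sort(locs);  // enums compare by declaration order
    System.out.println(Arrays.toString(locs)); // [LIVE, SLOW, DECOMMISSIONED]
  }
}
{code}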



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16076) Avoid using slow DataNodes for reading by sorting locations

2021-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16076?focusedWorklogId=613895=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613895
 ]

ASF GitHub Bot logged work on HDFS-16076:
-

Author: ASF GitHub Bot
Created on: 23/Jun/21 08:43
Start Date: 23/Jun/21 08:43
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3117:
URL: https://github.com/apache/hadoop/pull/3117#issuecomment-866649231


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 45s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m  7s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 26s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  4s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 14s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  18m 53s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 18s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 57s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  xml  |   0m  2s |  |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 18s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 22s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  18m 43s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 346m 40s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3117/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  4s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 438m 51s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatus 
|
   |   | hadoop.hdfs.TestDFSShell |
   |   | 
hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor |
   |   | hadoop.hdfs.server.mover.TestMover |
   |   | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3117/5/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3117 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell xml |
   | uname | Linux 28e07ef0078f 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 
05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 74bb4aa36cc2f934905878f70762e628de073e93 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 

[jira] [Commented] (HDFS-13916) Distcp SnapshotDiff to support WebHDFS

2021-06-23 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367960#comment-17367960
 ] 

Wei-Chiu Chuang commented on HDFS-13916:


Looks fine to me.

I still hold the same opinion that we should ultimately support the 
getSnapshotDiffReportListing API instead. As an example: 
https://github.com/apache/hadoop/blob/eefa664fea1119a9c6e3ae2d2ad3069019fbd4ef/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java#L2391-L2413

When there are millions of diffs between two snapshots, the old 
getSnapshotDiffReport() isn't scalable. The NameNode finds itself creating huge 
RPC messages for the snapshot diff items, which creates GC and memory pressure; 
the application sees big memory spikes too.

We don't have getSnapshotDiffReportListing API support in WebHDFS yet, though.
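As a rough illustration of why the listing variant scales better, here is a 
hedged pagination loop; the method names and cursor handling approximate the 
hdfs-client code linked above, and process(...) is a placeholder, so treat the 
signatures as assumptions rather than the exact API:

{code:java}
// Hedged sketch (signatures approximate; process(...) is a placeholder):
// fetch the snapshot diff in bounded pages so the NameNode never has to
// build one multi-million-entry RPC response.
void pagedDiff(DFSClient dfsClient, String dir, String from, String to)
    throws IOException {
  byte[] startPath = new byte[0];  // empty cursor: start at the beginning
  int index = -1;
  SnapshotDiffReportListing page;
  do {
    page = dfsClient.getSnapshotDiffReportListing(dir, from, to,
        startPath, index);
    process(page);                   // consume one bounded page of entries
    startPath = page.getLastPath();  // cursor for the next page
    index = page.getLastIndex();
  } while (startPath.length != 0 || index != -1);
}
{code}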

> Distcp SnapshotDiff to support WebHDFS
> --
>
> Key: HDFS-13916
> URL: https://issues.apache.org/jira/browse/HDFS-13916
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: distcp, webhdfs
>Affects Versions: 3.0.1, 3.1.1
>Reporter: Xun REN
>Assignee: Xun REN
>Priority: Major
>  Labels: easyfix, newbie, patch
> Attachments: HDFS-13916.002.patch, HDFS-13916.003.patch, 
> HDFS-13916.004.patch, HDFS-13916.005.patch, HDFS-13916.006.patch, 
> HDFS-13916.007.patch, HDFS-13916.patch
>
>
> [~ljain] worked on HDFS-13052 to make it possible to run DistCp with 
> SnapshotDiff over WebHDFSFileSystem. However, the patch does not modify the 
> actual Java class that is used when launching the command "hadoop distcp ..."
>  
> You can check the latest version here:
> [https://github.com/apache/hadoop/blob/branch-3.1.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java#L96-L100]
> In the method "preSyncCheck" of the class "DistCpSync", we still check whether 
> the file system is DFS.
> So I propose changing the class DistCpSync to take into account what was 
> committed by Lokesh Jain.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16085) Move the getPermissionChecker out of the read lock

2021-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16085?focusedWorklogId=613840=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613840
 ]

ASF GitHub Bot logged work on HDFS-16085:
-

Author: ASF GitHub Bot
Created on: 23/Jun/21 07:22
Start Date: 23/Jun/21 07:22
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3134:
URL: https://github.com/apache/hadoop/pull/3134#issuecomment-866596347


   Hi @tasanuma @jojochuang @Hexiaoqiao , could you please take a look. Thanks. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 613840)
Time Spent: 0.5h  (was: 20m)

> Move the getPermissionChecker out of the read lock
> --
>
> Key: HDFS-16085
> URL: https://issues.apache.org/jira/browse/HDFS-16085
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Move the getPermissionChecker out of the read lock in 
> NamenodeFsck#getBlockLocations() since the operation does not need to be 
> locked.
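A hedged sketch of the shape of this change (the enclosing method and 
lookupBlocks(...) are placeholders; this is not the exact patch):

{code:java}
// Approximate shape (illustration only, not the exact patch): building
// the permission checker reads no namespace state, so it can run before
// the FSNamesystem read lock is taken instead of inside it.
LocatedBlocks getBlockLocationsSketch(FSNamesystem namesystem, String path)
    throws IOException {
  FSPermissionChecker pc = namesystem.getPermissionChecker(); // moved out
  namesystem.readLock();
  try {
    // only the actual lookup still runs under the read lock
    return lookupBlocks(pc, path);  // placeholder for the locked work
  } finally {
    namesystem.readUnlock();
  }
}
{code}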



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16085) Move the getPermissionChecker out of the read lock

2021-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16085?focusedWorklogId=613836=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613836
 ]

ASF GitHub Bot logged work on HDFS-16085:
-

Author: ASF GitHub Bot
Created on: 23/Jun/21 07:16
Start Date: 23/Jun/21 07:16
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3134:
URL: https://github.com/apache/hadoop/pull/3134#issuecomment-866592953


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 33s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  31m  4s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 18s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  4s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 25s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 58s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 13s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 19s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 54s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 47s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 18s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m  4s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  15m 51s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 232m 53s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   1m  5s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 317m 23s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3134/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3134 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux ec760d8d6ec5 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 0ddcd2458c2801a82b6da1c53bdb88bef2de0659 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3134/1/testReport/ |
   | Max. process+thread count | 3180 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3134/1/console |
   | versions | git=2.25.1 

[jira] [Commented] (HDFS-16083) Forbid Observer NameNode trigger active namenode log roll

2021-06-23 Thread lei w (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367906#comment-17367906
 ] 

lei w commented on HDFS-16083:
--

[~ayushsaxena], [~LiJinglun], does anyone have any suggestions?

> Forbid Observer NameNode trigger  active namenode log roll
> --
>
> Key: HDFS-16083
> URL: https://issues.apache.org/jira/browse/HDFS-16083
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: lei w
>Priority: Minor
> Attachments: HDFS-16083.001.patch, HDFS-16083.002.patch
>
>
> When the Observer NameNode is enabled in the cluster, the Active NameNode 
> receives rollEditLog RPC requests from both the Standby NameNode and the 
> Observer NameNode within a short period. The Observer NameNode's rollEditLog 
> request is redundant, so should we prohibit the Observer NameNode from 
> triggering rollEditLog?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16081) List a large directory, the client waits for a long time

2021-06-23 Thread lei w (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367889#comment-17367889
 ] 

lei w commented on HDFS-16081:
--

Thanks for the comment, [~ste...@apache.org]. We use the original DFSClient 
API and did not find listStatusIncremental(); did you mean 
listStatusInternal()? The method listStatusInternal() only returns after 
receiving all the files, so it waits a long time when there are too many files 
in a directory. We want to add a limit on the number of files and return to 
the client once the specified number of files has been obtained.
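For what it's worth, a minimal sketch of the incremental pattern using only the 
existing public FileSystem API (the path and the batch size of 1000 are 
illustrative assumptions):

{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class PartialListing {
  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    RemoteIterator<FileStatus> it = fs.listStatusIterator(new Path("/tmp"));
    int processed = 0;
    // Process only the first batch of entries; on HDFS the iterator
    // fetches pages of dfs.ls.limit (default 1000) lazily.
    while (it.hasNext() && processed < 1000) {
      FileStatus st = it.next();
      processed++;
      // ... handle st ...
    }
    // The remaining entries can be consumed from the same iterator later.
  }
}
{code}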

> List a large directory, the client waits for a long time
> 
>
> Key: HDFS-16081
> URL: https://issues.apache.org/jira/browse/HDFS-16081
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: lei w
>Priority: Minor
>
> When we list a large directory, we have to wait a long time. This is 
> because the NameNode only returns dfs.ls.limit files at a time, and the 
> client then iteratively fetches the remaining files. But in many scenarios 
> we only need to know part of the files in the current directory, process 
> that part of the files, and fetch the rest afterwards. So could we add a 
> limit on the number of files and return to the client once the specified 
> number of files has been obtained, or have the NameNode return files based 
> on lock hold time instead of relying on a configuration alone?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16081) List a large directory, the client waits for a long time

2021-06-23 Thread lei w (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lei w updated HDFS-16081:
-
Description: When we list a large directory, we need to wait a lot of time. 
This is because the NameNode only returns the number of files corresponding to 
dfs.ls.limit each time, and then the client iteratively obtains the remaining 
files. But in many scenarios, we only need to know part of the files in the 
current directory, and then process this part of the file. After processing, go 
to get the remaining files. So can we add a limit on the number of files and 
return it to the client after obtaining the specified number of files  or 
NameNode returnes files based on lock hold time instead of just relying on a 
configuration.   (was: When we list a large directory, we need to wait a lot of 
time. This is because the NameNode only returns the number of files 
corresponding to dfs.ls.limit each time, and then the client iteratively 
obtains the remaining files. But in many scenarios, we only need to know part 
of the files in the current directory, and then process this part of the file. 
After processing, go to get the remaining files. So can we add a limit on the 
number of ls files and return it to the client after obtaining the specified 
number of files  or NameNode returnes files based on lock hold time instead of 
just relying on a configuration. )

> List a large directory, the client waits for a long time
> 
>
> Key: HDFS-16081
> URL: https://issues.apache.org/jira/browse/HDFS-16081
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: lei w
>Priority: Minor
>
> When we list a large directory, we have to wait a long time. This is 
> because the NameNode only returns dfs.ls.limit files at a time, and the 
> client then iteratively fetches the remaining files. But in many scenarios 
> we only need to know part of the files in the current directory, process 
> that part of the files, and fetch the rest afterwards. So could we add a 
> limit on the number of files and return to the client once the specified 
> number of files has been obtained, or have the NameNode return files based 
> on lock hold time instead of relying on a configuration alone?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16039) RBF: Some indicators of RBFMetrics count inaccurately

2021-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16039?focusedWorklogId=613816=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613816
 ]

ASF GitHub Bot logged work on HDFS-16039:
-

Author: ASF GitHub Bot
Created on: 23/Jun/21 06:04
Start Date: 23/Jun/21 06:04
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3086:
URL: https://github.com/apache/hadoop/pull/3086#issuecomment-866554573


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 35s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  12m 44s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  21m 27s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  22m 43s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  18m 49s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   3m 51s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |  25m 39s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   7m 54s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   7m 53s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |  34m 15s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  46m 21s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  21m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m  5s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  22m  5s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 27s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  19m 27s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3086/5/artifact/out/blanks-eol.txt)
 |  The patch has 8 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  checkstyle  |   3m 47s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |  20m 22s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   7m 54s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   8m  2s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |  35m 22s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  47m 58s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 794m 42s | 
[/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3086/5/artifact/out/patch-unit-root.txt)
 |  root in the patch passed.  |
   | -1 :x: |  asflicense  |   1m 25s | 
[/results-asflicense.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3086/5/artifact/out/results-asflicense.txt)
 |  The patch generated 1 ASF License warnings.  |
   |  |   | 1123m  1s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy |
   |   | hadoop.yarn.server.router.clientrm.TestFederationClientInterceptor |
   |   | hadoop.tools.dynamometer.TestDynamometerInfra |
   |   | hadoop.hdfs.server.federation.router.TestRouter |
   |   | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
   |   | hadoop.hdfs.server.federation.router.TestRouterRPCClientRetries |
   |   | hadoop.hdfs.server.federation.router.TestRouterRpc |
   |   | hadoop.hdfs.server.federation.router.TestRouterClientRejectOverload |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3086/5/artifact/out/Dockerfile
 |
   | GITHUB PR |