[jira] [Commented] (HDFS-15705) Fix a typo in SecondaryNameNode.java
[ https://issues.apache.org/jira/browse/HDFS-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242958#comment-17242958 ] Sixiang Ma commented on HDFS-15705: --- [~weichiu] I saw your commit. I got a little busy tonight and didn't submit the PR... Thanks for your help with my first contribution to HDFS! Feeling excited. I will probably submit a few more interesting bugs in the near future, as I just finished a project regarding testing heterogeneous configuration in Hadoop/HBase. > Fix a typo in SecondaryNameNode.java > > > Key: HDFS-15705 > URL: https://issues.apache.org/jira/browse/HDFS-15705 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.4.0 >Reporter: Sixiang Ma >Assignee: Sixiang Ma >Priority: Trivial > Fix For: 3.4.0 > > Attachments: HDFS-15705.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The comment above the function of 'public void shutdown()' in > SecondaryNameNode.java is incorrect. > It should be 'Shut down this instance of the secondary name.', instead of > 'datanode'. > This typo exists in the trunk of Hadoop, so I generated the patch against the > trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13831) Make block increment deletion number configurable
[ https://issues.apache.org/jira/browse/HDFS-13831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242953#comment-17242953 ] GeoffreyStark commented on HDFS-13831: -- Well, thank you very much. I see. That's what I was thinking yesterday :D > Make block increment deletion number configurable > - > > Key: HDFS-13831 > URL: https://issues.apache.org/jira/browse/HDFS-13831 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.1.0 >Reporter: Yiqun Lin >Assignee: Ryan Wu >Priority: Major > Fix For: 2.10.0, 3.2.0, 3.0.4, 3.1.2 > > Attachments: HDFS-13831.001.patch, HDFS-13831.002.patch, > HDFS-13831.003.patch, HDFS-13831.004.patch, HDFS-13831.branch-3.0.001.patch > > > When the NN deletes a large directory, it will hold the write lock for a long > time. To improve this, we remove the blocks in batches, so that other waiters > have a chance to get the lock. But right now, the batch size is a > hard-coded value. > {code} > static int BLOCK_DELETION_INCREMENT = 1000; > {code} > We can make this value configurable, so that we can control how often other > waiters get a chance to acquire the lock. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
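To make the batching concrete, here is a minimal, self-contained sketch of the loop the description refers to. The names are illustrative stand-ins rather than the actual FSNamesystem code, and the configuration key shown is the one this issue is understood to introduce — verify it against hdfs-default.xml before relying on it.

{code:java}
import java.util.Queue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/**
 * Hedged sketch of batched block deletion (HDFS-13831).
 * Stand-in types; not the actual FSNamesystem code.
 */
public class BatchedBlockDeletion {
  // Limit made configurable by this issue
  // (assumed key: dfs.namenode.block.deletion.increment, default 1000).
  private final int blockDeletionIncrement;
  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();

  public BatchedBlockDeletion(int blockDeletionIncrement) {
    this.blockDeletionIncrement = blockDeletionIncrement;
  }

  /** Remove collected blocks in batches so other lock waiters can run. */
  public void removeBlocks(Queue<Long> collectedBlockIds) {
    while (!collectedBlockIds.isEmpty()) {
      fsLock.writeLock().lock();
      try {
        for (int i = 0;
            i < blockDeletionIncrement && !collectedBlockIds.isEmpty(); i++) {
          Long blockId = collectedBlockIds.poll();
          // ... remove this block from the blocks map (elided) ...
        }
      } finally {
        // Releasing between batches gives waiting RPC handlers a chance
        // to acquire the namesystem lock.
        fsLock.writeLock().unlock();
      }
    }
  }
}
{code}

A smaller increment releases the lock more often (better fairness) at the cost of more lock churn for one large delete; making the value configurable lets operators pick that trade-off.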
[jira] [Updated] (HDFS-15705) Fix a typo in SecondaryNameNode.java
[ https://issues.apache.org/jira/browse/HDFS-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-15705: --- Fix Version/s: 3.4.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Fix a typo in SecondaryNameNode.java > > > Key: HDFS-15705 > URL: https://issues.apache.org/jira/browse/HDFS-15705 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.4.0 >Reporter: Sixiang Ma >Assignee: Sixiang Ma >Priority: Trivial > Fix For: 3.4.0 > > Attachments: HDFS-15705.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The comment above the function of 'public void shutdown()' in > SecondaryNameNode.java is incorrect. > It should be 'Shut down this instance of the secondary name.', instead of > 'datanode'. > This typo exists in the trunk of Hadoop, so I generated the patch against the > trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15660) StorageTypeProto is not compatible between 3.x and 2.6
[ https://issues.apache.org/jira/browse/HDFS-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242905#comment-17242905 ] Hadoop QA commented on HDFS-15660: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 27s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:blue}0{color} | {color:blue} buf {color} | {color:blue} 0m 1s{color} | {color:blue}{color} | {color:blue} buf was not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red}{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 42s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 43m 51s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 55s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 0s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 0s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 49s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 49s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 12s{color} | {color:green}{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 26s{color} | {color:green}{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green}{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 71m 59s{color} | {color:black}{color} | {color:black}{color} | \\ \\ || Subsystem || Report/Notes || | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/326/artifact/out/Dockerfile | | JIRA Issue | HDFS-15660 | | JIRA Patch URL |
[jira] [Updated] (HDFS-15660) StorageTypeProto is not compatible between 3.x and 2.6
[ https://issues.apache.org/jira/browse/HDFS-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Wu updated HDFS-15660: --- Attachment: (was: HDFS-15660.003.patch) > StorageTypeProto is not compatible between 3.x and 2.6 > --- > > Key: HDFS-15660 > URL: https://issues.apache.org/jira/browse/HDFS-15660 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.2.0, 3.1.3 >Reporter: Ryan Wu >Assignee: Ryan Wu >Priority: Major > Attachments: HDFS-15660.002.patch, HDFS-15660.003.patch > > > In our case, when the NN had been upgraded to 3.1.3 and the DNs' version was > still 2.6, we found that when Hive called the getContentSummary method, the > client and server were not compatible, because Hadoop 3 added the new > PROVIDED storage type. > {code:java} > // code placeholder > 20/04/15 14:28:35 INFO retry.RetryInvocationHandler---main: Exception while > invoking getContentSummary of class ClientNamenodeProtocolTranslatorPB over > x/x:8020. Trying to fail over immediately. > java.io.IOException: com.google.protobuf.ServiceException: > com.google.protobuf.UninitializedMessageException: Message missing required > fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:819) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy11.getContentSummary(Unknown Source) > at > org.apache.hadoop.hdfs.DFSClient.getContentSummary(DFSClient.java:3144) > at > org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:706) > at > org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:702) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getContentSummary(DistributedFileSystem.java:713) > at org.apache.hadoop.fs.shell.Count.processPath(Count.java:109) > at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317) > at > org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289) > at > org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271) > at > org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255) > at > org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118) > at org.apache.hadoop.fs.shell.Command.run(Command.java:165) > at org.apache.hadoop.fs.FsShell.run(FsShell.java:315) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at org.apache.hadoop.fs.FsShell.main(FsShell.java:372) > Caused by: com.google.protobuf.ServiceException: > com.google.protobuf.UninitializedMessageException: Message missing required > fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:272) > at com.sun.proxy.$Proxy10.getContentSummary(Unknown Source) > at > 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:816) > ... 23 more > Caused by: com.google.protobuf.UninitializedMessageException: Message missing > required fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:770) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65392) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65331) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:263) > ... 25 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
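For context, an abridged sketch of the protobuf definitions involved — the field numbers and exact shapes here are assumptions, so check hdfs.proto for the authoritative definitions. In proto2, a parser that encounters an enum value it does not recognize treats the field as unset, so a required field such as the quota entry's type looks "missing" to a 2.6 client, which is exactly the UninitializedMessageException in the stack trace above.

{code}
// Abridged, hedged sketch -- not copied from hdfs.proto.
enum StorageTypeProto {
  DISK = 1;
  SSD = 2;
  ARCHIVE = 3;
  RAM_DISK = 4;
  PROVIDED = 5;  // added in Hadoop 3.x; unknown to a 2.6 client
}

message StorageTypeQuotaInfoProto {
  required StorageTypeProto type = 1;  // required: an unrecognized enum value
                                       // leaves this field unset on the
                                       // receiving (2.6) side
  required uint64 quota = 2;
  required uint64 consumed = 3;
}
{code}

The general point stands regardless of the exact definitions: adding values to an enum used in a required field is not wire-compatible under proto2, so either the server must avoid sending the new value to downlevel clients or the field must not be required.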
[jira] [Updated] (HDFS-15660) StorageTypeProto is not compatible between 3.x and 2.6
[ https://issues.apache.org/jira/browse/HDFS-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Wu updated HDFS-15660: --- Attachment: HDFS-15660.003.patch > StorageTypeProto is not compatible between 3.x and 2.6 > --- > > Key: HDFS-15660 > URL: https://issues.apache.org/jira/browse/HDFS-15660 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.2.0, 3.1.3 >Reporter: Ryan Wu >Assignee: Ryan Wu >Priority: Major > Attachments: HDFS-15660.002.patch, HDFS-15660.003.patch > > > In our case, when the NN had been upgraded to 3.1.3 and the DNs' version was > still 2.6, we found that when Hive called the getContentSummary method, the > client and server were not compatible, because Hadoop 3 added the new > PROVIDED storage type. > {code:java} > // code placeholder > 20/04/15 14:28:35 INFO retry.RetryInvocationHandler---main: Exception while > invoking getContentSummary of class ClientNamenodeProtocolTranslatorPB over > x/x:8020. Trying to fail over immediately. > java.io.IOException: com.google.protobuf.ServiceException: > com.google.protobuf.UninitializedMessageException: Message missing required > fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:819) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy11.getContentSummary(Unknown Source) > at > org.apache.hadoop.hdfs.DFSClient.getContentSummary(DFSClient.java:3144) > at > org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:706) > at > org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:702) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getContentSummary(DistributedFileSystem.java:713) > at org.apache.hadoop.fs.shell.Count.processPath(Count.java:109) > at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317) > at > org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289) > at > org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271) > at > org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255) > at > org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118) > at org.apache.hadoop.fs.shell.Command.run(Command.java:165) > at org.apache.hadoop.fs.FsShell.run(FsShell.java:315) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at org.apache.hadoop.fs.FsShell.main(FsShell.java:372) > Caused by: com.google.protobuf.ServiceException: > com.google.protobuf.UninitializedMessageException: Message missing required > fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:272) > at com.sun.proxy.$Proxy10.getContentSummary(Unknown Source) > at > 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:816) > ... 23 more > Caused by: com.google.protobuf.UninitializedMessageException: Message missing > required fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:770) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65392) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65331) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:263) > ... 25 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To
[jira] [Commented] (HDFS-13831) Make block increment deletion number configurable
[ https://issues.apache.org/jira/browse/HDFS-13831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242880#comment-17242880 ] Ryan Wu commented on HDFS-13831: IMO you could make the loop faster so that the lock is held for a shorter time. > Make block increment deletion number configurable > - > > Key: HDFS-13831 > URL: https://issues.apache.org/jira/browse/HDFS-13831 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.1.0 >Reporter: Yiqun Lin >Assignee: Ryan Wu >Priority: Major > Fix For: 2.10.0, 3.2.0, 3.0.4, 3.1.2 > > Attachments: HDFS-13831.001.patch, HDFS-13831.002.patch, > HDFS-13831.003.patch, HDFS-13831.004.patch, HDFS-13831.branch-3.0.001.patch > > > When the NN deletes a large directory, it will hold the write lock for a long > time. To improve this, we remove the blocks in batches, so that other waiters > have a chance to get the lock. But right now, the batch size is a > hard-coded value. > {code} > static int BLOCK_DELETION_INCREMENT = 1000; > {code} > We can make this value configurable, so that we can control how often other > waiters get a chance to acquire the lock. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242857#comment-17242857 ] Takanobu Asanuma commented on HDFS-14353: - It seems 3.3.0 doesn't include this jira, so I fixed the Fix Versions. > Erasure Coding: metrics xmitsInProgress become negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: Baolong Mao >Assignee: Baolong Mao >Priority: Major > Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, > HDFS-14353.006.patch, HDFS-14353.007.patch, HDFS-14353.008.patch, > HDFS-14353.009.patch, HDFS-14353.010.patch, screenshot-1.png > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takanobu Asanuma updated HDFS-14353: Fix Version/s: (was: 3.3.0) 3.3.1 > Erasure Coding: metrics xmitsInProgress become negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: Baolong Mao >Assignee: Baolong Mao >Priority: Major > Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, > HDFS-14353.006.patch, HDFS-14353.007.patch, HDFS-14353.008.patch, > HDFS-14353.009.patch, HDFS-14353.010.patch, screenshot-1.png > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242842#comment-17242842 ] Takanobu Asanuma commented on HDFS-15240: - I confirmed that the patch also worked well with ISA-L. (CI doesn't use ISA-L since HADOOP-17224 was reverted.) +1 on [^HDFS-15240.013.patch]. Thanks! > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, > HDFS-15240.009.patch, HDFS-15240.010.patch, HDFS-15240.011.patch, > HDFS-15240.012.patch, HDFS-15240.013.patch, > image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When reading some LZO files we found some blocks were broken. > I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from > the DN directly, chose 6 (b0-b5) blocks to decode the other 3 (b6', b7', b8') > blocks, and found the longest common substring (LCS) between b6' (decoded) and > b6 (read from the DN) (likewise b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in each combination and iterating > through all cases, I found one case where the length of the LCS is the block > length - 64KB; 64KB is exactly the length of the ByteBuffer used by > StripedBlockReader. So the corrupt reconstructed block was produced by a dirty > buffer. > The following log snippet (only 2 of 28 cases shown) is my check program's > output. In my case, I knew the 3rd block was corrupt, so I needed 5 other > blocks to decode another 3 blocks, and then found that the 1st block's LCS > length is the block length - 64KB. > It means the (0,1,2,4,5,6)th blocks were used to reconstruct the 3rd block, > and the dirty buffer was used before reading the 1st block. > It must be noted that StripedBlockReader read from offset 0 of the 1st block > after the dirty buffer was used. > EDITED for readability. > {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes the reconstruction block error, but where > does the dirty buffer come from? > After digging into the code and the DN log, I found the following DN log entry > is the root cause. 
> {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. > [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at >
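The verification described in the HDFS-15240 report above — decode candidate subsets, then compare each decoded block with the copy read from the DataNode by longest common substring — can be illustrated with a standard dynamic program over byte arrays. This is purely illustrative: it is not the reporter's actual check program, and its O(n*m) cost makes it impractical at full-block sizes as written.

{code:java}
/**
 * Length of the longest common substring (longest run of identical bytes)
 * between a decoded block and the copy read from the DataNode.
 * Classic dynamic program with a rolling row; illustrative sketch only.
 */
public final class LcsCheck {
  static int longestCommonSubstring(byte[] decoded, byte[] readFromDn) {
    int[] prev = new int[readFromDn.length + 1];
    int best = 0;
    for (int i = 1; i <= decoded.length; i++) {
      int[] cur = new int[readFromDn.length + 1];
      for (int j = 1; j <= readFromDn.length; j++) {
        if (decoded[i - 1] == readFromDn[j - 1]) {
          cur[j] = prev[j - 1] + 1;  // extend the match ending at (i, j)
          best = Math.max(best, cur[j]);
        }
      }
      prev = cur;
    }
    return best;
  }
}
{code}

An LCS of (block length - 64KB), as in the report, is the signature of a single corrupted 64KB window — matching the ByteBuffer size used by StripedBlockReader — rather than of random corruption.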
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242837#comment-17242837 ] Hui Fei commented on HDFS-15240: Thanks [~marvelrock] for the report and fix, [~umamaheswararao] and [~tasanuma] for the review, [~touchida] for testing, and everyone here! Will commit tomorrow if there are no other comments! > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, > HDFS-15240.009.patch, HDFS-15240.010.patch, HDFS-15240.011.patch, > HDFS-15240.012.patch, HDFS-15240.013.patch, > image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When reading some LZO files we found some blocks were broken. > I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from > the DN directly, chose 6 (b0-b5) blocks to decode the other 3 (b6', b7', b8') > blocks, and found the longest common substring (LCS) between b6' (decoded) and > b6 (read from the DN) (likewise b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in each combination and iterating > through all cases, I found one case where the length of the LCS is the block > length - 64KB; 64KB is exactly the length of the ByteBuffer used by > StripedBlockReader. So the corrupt reconstructed block was produced by a dirty > buffer. > The following log snippet (only 2 of 28 cases shown) is my check program's > output. In my case, I knew the 3rd block was corrupt, so I needed 5 other > blocks to decode another 3 blocks, and then found that the 1st block's LCS > length is the block length - 64KB. > It means the (0,1,2,4,5,6)th blocks were used to reconstruct the 3rd block, > and the dirty buffer was used before reading the 1st block. > It must be noted that StripedBlockReader read from offset 0 of the 1st block > after the dirty buffer was used. > EDITED for readability. > {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes the reconstruction block error, but where > does the dirty buffer come from? > After digging into the code and the DN log, I found the following DN log entry > is the root cause. 
> {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. > [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at >
[jira] [Commented] (HDFS-14904) Add Option to let Balancer prefer highly utilized nodes in each iteration
[ https://issues.apache.org/jira/browse/HDFS-14904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242819#comment-17242819 ] Leon Gao commented on HDFS-14904: - Thanks for the review! [~jingzhao] > Add Option to let Balancer prefer highly utilized nodes in each iteration > - > > Key: HDFS-14904 > URL: https://issues.apache.org/jira/browse/HDFS-14904 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover >Reporter: Leon Gao >Assignee: Leon Gao >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Normally the most important purpose of the HDFS balancer is to bring down the > top-used nodes, to prevent datanode usage from being too high. > Currently, the balancer picks source nodes almost at random regardless of > usage, which makes it slow to bring down the top-used datanodes when there > are fewer underutilized nodes in the cluster (consider expansion). > We can add an option to prefer the top-used nodes in each iteration, as > suggested in HDFS-14894. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
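The option described here amounts to ordering candidate source nodes by utilization instead of visiting them in arbitrary order. A minimal sketch of that idea follows — DatanodeUsage is an illustrative stand-in for the Balancer's internal source structures, and the boolean stands in for whatever balancer flag this change actually added:

{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative stand-in for a candidate source node and its utilization. */
record DatanodeUsage(String name, double utilizationPercent) {}

class SourceOrdering {
  /**
   * With the sort option enabled, visit the most over-utilized nodes first
   * so each iteration works on the top-used datanodes. Sketch only; the
   * real Balancer operates on its own StorageGroup structures.
   */
  static List<DatanodeUsage> orderSources(List<DatanodeUsage> overUtilized,
      boolean sortTopNodes) {
    List<DatanodeUsage> sources = new ArrayList<>(overUtilized);
    if (sortTopNodes) {
      sources.sort(Comparator
          .comparingDouble(DatanodeUsage::utilizationPercent).reversed());
    }
    return sources;
  }
}
{code}

Keeping this behind an option preserves the existing behavior for clusters that rely on the current near-random source selection.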
[jira] [Commented] (HDFS-14904) Add Option to let Balancer prefer highly utilized nodes in each iteration
[ https://issues.apache.org/jira/browse/HDFS-14904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242816#comment-17242816 ] Jing Zhao commented on HDFS-14904: -- +1. I've committed the change. Thank you for the contribution, [~LeonG]! > Add Option to let Balancer prefer highly utilized nodes in each iteration > - > > Key: HDFS-14904 > URL: https://issues.apache.org/jira/browse/HDFS-14904 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover >Reporter: Leon Gao >Assignee: Leon Gao >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Normally the most important purpose of the HDFS balancer is to bring down the > top-used nodes, to prevent datanode usage from being too high. > Currently, the balancer picks source nodes almost at random regardless of > usage, which makes it slow to bring down the top-used datanodes when there > are fewer underutilized nodes in the cluster (consider expansion). > We can add an option to prefer the top-used nodes in each iteration, as > suggested in HDFS-14894. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-14904) Add Option to let Balancer prefer highly utilized nodes in each iteration
[ https://issues.apache.org/jira/browse/HDFS-14904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao resolved HDFS-14904. -- Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Add Option to let Balancer prefer highly utilized nodes in each iteration > - > > Key: HDFS-14904 > URL: https://issues.apache.org/jira/browse/HDFS-14904 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover >Reporter: Leon Gao >Assignee: Leon Gao >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Normally the most important purpose of the HDFS balancer is to bring down the > top-used nodes, to prevent datanode usage from being too high. > Currently, the balancer picks source nodes almost at random regardless of > usage, which makes it slow to bring down the top-used datanodes when there > are fewer underutilized nodes in the cluster (consider expansion). > We can add an option to prefer the top-used nodes in each iteration, as > suggested in HDFS-14894. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14904) Add Option to let Balancer prefer highly utilized nodes in each iteration
[ https://issues.apache.org/jira/browse/HDFS-14904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-14904: - Summary: Add Option to let Balancer prefer highly utilized nodes in each iteration (was: Option to let Balancer prefer top used nodes in each iteration) > Add Option to let Balancer prefer highly utilized nodes in each iteration > - > > Key: HDFS-14904 > URL: https://issues.apache.org/jira/browse/HDFS-14904 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover >Reporter: Leon Gao >Assignee: Leon Gao >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Normally the most important purpose of the HDFS balancer is to bring down the > top-used nodes, to prevent datanode usage from being too high. > Currently, the balancer picks source nodes almost at random regardless of > usage, which makes it slow to bring down the top-used datanodes when there > are fewer underutilized nodes in the cluster (consider expansion). > We can add an option to prefer the top-used nodes in each iteration, as > suggested in HDFS-14894. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-14904) Option to let Balancer prefer top used nodes in each iteration
[ https://issues.apache.org/jira/browse/HDFS-14904?focusedWorklogId=519304=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-519304 ] ASF GitHub Bot logged work on HDFS-14904: - Author: ASF GitHub Bot Created on: 02/Dec/20 23:53 Start Date: 02/Dec/20 23:53 Worklog Time Spent: 10m Work Description: Jing9 merged pull request #2483: URL: https://github.com/apache/hadoop/pull/2483 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 519304) Time Spent: 1h 40m (was: 1.5h) > Option to let Balancer prefer top used nodes in each iteration > -- > > Key: HDFS-14904 > URL: https://issues.apache.org/jira/browse/HDFS-14904 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover >Reporter: Leon Gao >Assignee: Leon Gao >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Normally the most important purpose of the HDFS balancer is to bring down the > top-used nodes, to prevent datanode usage from being too high. > Currently, the balancer picks source nodes almost at random regardless of > usage, which makes it slow to bring down the top-used datanodes when there > are fewer underutilized nodes in the cluster (consider expansion). > We can add an option to prefer the top-used nodes in each iteration, as > suggested in HDFS-14894. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-14904) Option to let Balancer prefer top used nodes in each iteration
[ https://issues.apache.org/jira/browse/HDFS-14904?focusedWorklogId=519303=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-519303 ] ASF GitHub Bot logged work on HDFS-14904: - Author: ASF GitHub Bot Created on: 02/Dec/20 23:50 Start Date: 02/Dec/20 23:50 Worklog Time Spent: 10m Work Description: Jing9 commented on pull request #2483: URL: https://github.com/apache/hadoop/pull/2483#issuecomment-737565060 The latest change looks good to me. +1 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 519303) Time Spent: 1.5h (was: 1h 20m) > Option to let Balancer prefer top used nodes in each iteration > -- > > Key: HDFS-14904 > URL: https://issues.apache.org/jira/browse/HDFS-14904 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover >Reporter: Leon Gao >Assignee: Leon Gao >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > Normally the most important purpose of the HDFS balancer is to bring down the > top-used nodes, to prevent datanode usage from being too high. > Currently, the balancer picks source nodes almost at random regardless of > usage, which makes it slow to bring down the top-used datanodes when there > are fewer underutilized nodes in the cluster (consider expansion). > We can add an option to prefer the top-used nodes in each iteration, as > suggested in HDFS-14894. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15705) Fix a typo in SecondaryNameNode.java
[ https://issues.apache.org/jira/browse/HDFS-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242800#comment-17242800 ] Sixiang Ma commented on HDFS-15705: --- [~weichiu] Thanks a lot! > Fix a typo in SecondaryNameNode.java > > > Key: HDFS-15705 > URL: https://issues.apache.org/jira/browse/HDFS-15705 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.4.0 >Reporter: Sixiang Ma >Assignee: Sixiang Ma >Priority: Trivial > Attachments: HDFS-15705.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The comment above the function of 'public void shutdown()' in > SecondaryNameNode.java is incorrect. > It should be 'Shut down this instance of the secondary name.', instead of > 'datanode'. > This typo exists in the trunk of Hadoop, so I generated the patch against the > trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15705) Fix a typo in SecondaryNameNode.java
[ https://issues.apache.org/jira/browse/HDFS-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242795#comment-17242795 ] Wei-Chiu Chuang commented on HDFS-15705: +1. I added you to the contributor list and you'll be able to assign jiras to yourself. Thanks for your contribution. Feel free to raise a PR too. > Fix a typo in SecondaryNameNode.java > > > Key: HDFS-15705 > URL: https://issues.apache.org/jira/browse/HDFS-15705 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.4.0 >Reporter: Sixiang Ma >Assignee: Sixiang Ma >Priority: Trivial > Attachments: HDFS-15705.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The comment above the function of 'public void shutdown()' in > SecondaryNameNode.java is incorrect. > It should be 'Shut down this instance of the secondary name.', instead of > 'datanode'. > This typo exists in the trunk of Hadoop, so I generated the patch against the > trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-15705) Fix a typo in SecondaryNameNode.java
[ https://issues.apache.org/jira/browse/HDFS-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang reassigned HDFS-15705: -- Assignee: Sixiang Ma > Fix a typo in SecondaryNameNode.java > > > Key: HDFS-15705 > URL: https://issues.apache.org/jira/browse/HDFS-15705 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.4.0 >Reporter: Sixiang Ma >Assignee: Sixiang Ma >Priority: Trivial > Attachments: HDFS-15705.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The comment above the function of 'public void shutdown()' in > SecondaryNameNode.java is incorrect. > It should be 'Shut down this instance of the secondary name.', instead of > 'datanode'. > This typo exists in the trunk of Hadoop, so I generated the patch against the > trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-15705) Fix a typo in SecondaryNameNode.java
[ https://issues.apache.org/jira/browse/HDFS-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242753#comment-17242753 ] Sixiang Ma edited comment on HDFS-15705 at 12/2/20, 10:31 PM: -- Hi, [~weichiu], please take a look at this typo issue and my quick patch (against the latest trunk). No tests are needed since the typo is in the comment. Thank you! was (Author: starthinking): Hi, [~weichiu], please take a look at this typo issue in comment and my quick patch (against the latest trunk). No tests are needed since the typo is in the comment. Thank you! > Fix a typo in SecondaryNameNode.java > > > Key: HDFS-15705 > URL: https://issues.apache.org/jira/browse/HDFS-15705 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.4.0 >Reporter: Sixiang Ma >Priority: Trivial > Attachments: HDFS-15705.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The comment above the function of 'public void shutdown()' in > SecondaryNameNode.java is incorrect. > It should be 'Shut down this instance of the secondary name.', instead of > 'datanode'. > This typo exists in the trunk of Hadoop, so I generated the patch against the > trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-15705) Fix a typo in SecondaryNameNode.java
[ https://issues.apache.org/jira/browse/HDFS-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242753#comment-17242753 ] Sixiang Ma edited comment on HDFS-15705 at 12/2/20, 10:30 PM: -- Hi, [~weichiu], please take a look at this typo issue in comment and my quick patch (against the latest trunk). No tests are needed since the typo is in the comment. Thank you! was (Author: starthinking): no tests since the typo is in the comment > Fix a typo in SecondaryNameNode.java > > > Key: HDFS-15705 > URL: https://issues.apache.org/jira/browse/HDFS-15705 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.4.0 >Reporter: Sixiang Ma >Priority: Trivial > Attachments: HDFS-15705.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The comment above the function of 'public void shutdown()' in > SecondaryNameNode.java is incorrect. > It should be 'Shut down this instance of the secondary name.', instead of > 'datanode'. > This typo exists in the trunk of Hadoop, so I generated the patch against the > trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-15705) Fix a typo in SecondaryNameNode.java
[ https://issues.apache.org/jira/browse/HDFS-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242753#comment-17242753 ] Sixiang Ma edited comment on HDFS-15705 at 12/2/20, 10:25 PM: -- no tests since the typo is in the comment was (Author: starthinking): Hi [~weichiu], can you please take a look at this trivial bug (and my patch against trunk)? I think we don't need to run the unit test suite to validate it, right? Thank you! > Fix a typo in SecondaryNameNode.java > > > Key: HDFS-15705 > URL: https://issues.apache.org/jira/browse/HDFS-15705 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.4.0 >Reporter: Sixiang Ma >Priority: Trivial > Attachments: HDFS-15705.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The comment above the function of 'public void shutdown()' in > SecondaryNameNode.java is incorrect. > It should be 'Shut down this instance of the secondary name.', instead of > 'datanode'. > This typo exists in the trunk of Hadoop, so I generated the patch against the > trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15705) Fix a typo in SecondaryNameNode.java
[ https://issues.apache.org/jira/browse/HDFS-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242753#comment-17242753 ] Sixiang Ma commented on HDFS-15705: --- Hi [~weichiu], can you please take a look at this trivial bug (and my patch against trunk)? I think we don't need to run the unit test suite to validate it, right? Thank you! > Fix a typo in SecondaryNameNode.java > > > Key: HDFS-15705 > URL: https://issues.apache.org/jira/browse/HDFS-15705 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.4.0 >Reporter: Sixiang Ma >Priority: Trivial > Attachments: HDFS-15705.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The comment above the function of 'public void shutdown()' in > SecondaryNameNode.java is incorrect. > It should be 'Shut down this instance of the secondary name.', instead of > 'datanode'. > This typo exists in the trunk of Hadoop, so I generated the patch against the > trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15704) Mitigate lease monitor's rapid infinite loop
[ https://issues.apache.org/jira/browse/HDFS-15704?focusedWorklogId=519262=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-519262 ] ASF GitHub Bot logged work on HDFS-15704: - Author: ASF GitHub Bot Created on: 02/Dec/20 21:41 Start Date: 02/Dec/20 21:41 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2511: URL: https://github.com/apache/hadoop/pull/2511#issuecomment-737513238 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 46s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 37m 2s | | trunk passed | | +1 :green_heart: | compile | 1m 39s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 | | +1 :green_heart: | compile | 1m 31s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 | | +1 :green_heart: | checkstyle | 0m 56s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 35s | | trunk passed | | -1 :x: | shadedclient | 8m 15s | | branch has errors when building and testing our client artifacts. | | -1 :x: | javadoc | 0m 28s | [/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2511/2/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt) | hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04. | | -1 :x: | javadoc | 0m 29s | [/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2511/2/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) | hadoop-hdfs in trunk failed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01. | | +0 :ok: | spotbugs | 9m 44s | | Used deprecated FindBugs config; considering switching to SpotBugs. | | -1 :x: | findbugs | 0m 29s | [/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2511/2/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in trunk failed. | _ Patch Compile Tests _ | | -1 :x: | mvninstall | 0m 23s | [/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2511/2/artifact/out/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch failed. | | -1 :x: | compile | 0m 28s | [/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2511/2/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt) | hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04. 
| | -1 :x: | javac | 0m 28s | [/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2511/2/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt) | hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04. | | -1 :x: | compile | 0m 14s | [/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2511/2/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) | hadoop-hdfs in the patch failed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01. | | -1 :x: | javac | 0m 14s | [/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2511/2/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) | hadoop-hdfs in the patch failed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01. | | +1 :green_heart: | checkstyle | 0m 49s | | hadoop-hdfs-project/hadoop-hdfs: The patch generated 0
[jira] [Work logged] (HDFS-15703) Don't generate edits for set operations that are no-op
[ https://issues.apache.org/jira/browse/HDFS-15703?focusedWorklogId=519261=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-519261 ] ASF GitHub Bot logged work on HDFS-15703: - Author: ASF GitHub Bot Created on: 02/Dec/20 21:38 Start Date: 02/Dec/20 21:38 Worklog Time Spent: 10m Work Description: jbrennan333 merged pull request #2508: URL: https://github.com/apache/hadoop/pull/2508 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 519261) Time Spent: 40m (was: 0.5h) > Don't generate edits for set operations that are no-op > -- > > Key: HDFS-15703 > URL: https://issues.apache.org/jira/browse/HDFS-15703 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > [~daryn] reported that setting the owner, group, or permissions to what it > already is will generate an unnecessary edit. It should not do this to avoid > performance issues when users unnecessarily run routine jobs to change the > group and permissions of a project tree. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
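The change described above amounts to short-circuiting before an edit is logged when the requested attribute already matches the inode. A simplified, self-contained sketch of that pattern — the types here are stand-ins, not the actual FSDirAttrOp code:

{code:java}
/** Hedged sketch of skipping no-op attribute changes (HDFS-15703). */
class NoOpEditSketch {
  static final class INodeStub {
    String owner;
    String group;
  }

  interface EditLog {
    void logSetOwner(String path, String owner, String group);
  }

  /** Only log an edit when the owner or group actually changes. */
  static void setOwner(EditLog editLog, INodeStub inode, String path,
      String owner, String group) {
    boolean ownerUnchanged = owner == null || owner.equals(inode.owner);
    boolean groupUnchanged = group == null || group.equals(inode.group);
    if (ownerUnchanged && groupUnchanged) {
      return;  // no-op: don't generate an edit
    }
    if (owner != null) {
      inode.owner = owner;
    }
    if (group != null) {
      inode.group = group;
    }
    editLog.logSetOwner(path, inode.owner, inode.group);
  }
}
{code}

The same guard applies to setPermission and similar setters, so routine jobs that "chown/chmod a whole tree to what it already is" stop flooding the edit log.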
[jira] [Resolved] (HDFS-15695) NN should not let the balancer run in safemode
[ https://issues.apache.org/jira/browse/HDFS-15695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Brennan resolved HDFS-15695. Fix Version/s: 3.2.3 3.1.5 3.4.0 3.3.1 Resolution: Fixed Thanks [~daryn] and [~ahussein]! I have committed this to trunk, branch-3.3, branch-3.2, and branch-3.1. > NN should not let the balancer run in safemode > -- > > Key: HDFS-15695 > URL: https://issues.apache.org/jira/browse/HDFS-15695 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Major > Labels: pull-request-available > Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3 > > Time Spent: 50m > Remaining Estimate: 0h > > [~daryn] reported that when the balancer moves a block, the target DN > block-reports the new location and hints the NN to invalidate the replica on > the source DN. The NN will not issue invalidations in safemode, so every > moved block appears to be in excess. The data structures bloat and greatly > increase the chance of a full GC. > The NN should refuse to provide block locations to the balancer while in > safemode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
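The fix boils down to a safe-mode check on the getBlocks path the balancer uses. A hedged sketch of the pattern follows; the helper mirrors FSNamesystem's checkNameNodeSafeMode convention, but the exact placement, message, and exception type in the committed change are assumptions here.

{code:java}
import java.io.IOException;
import java.util.List;

/** Hedged sketch of HDFS-15695: reject balancer getBlocks in safe mode. */
class GetBlocksGuard {
  private volatile boolean inSafeMode;  // stand-in for NN safe-mode state

  /** Modeled on FSNamesystem#checkNameNodeSafeMode (names assumed). */
  private void checkNameNodeSafeMode(String operation) throws IOException {
    if (inSafeMode) {
      // The real code throws SafeModeException; the balancer then backs
      // off instead of receiving block lists it cannot safely act on.
      throw new IOException(
          "Cannot execute " + operation + ". Name node is in safe mode.");
    }
  }

  /** Entry point the balancer calls to pick blocks to move (simplified). */
  List<Long> getBlocks(String datanode, long size) throws IOException {
    checkNameNodeSafeMode("getBlocks");
    // ... return blocks with locations for the balancer (elided) ...
    return List.of();
  }
}
{code}

With the guard in place, moved blocks no longer accumulate as "excess" replicas on a safemode NN, avoiding the data-structure bloat and full-GC risk described above.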
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242747#comment-17242747 ] Uma Maheswara Rao G commented on HDFS-15240: +1 latest patch looks good to me. > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, > HDFS-15240.009.patch, HDFS-15240.010.patch, HDFS-15240.011.patch, > HDFS-15240.012.patch, HDFS-15240.013.patch, > image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When reading some lzo files, we found that some blocks were broken. > I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from the DN directly, and chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'). > Then I found the longest common sequence (LCS) between b6' (decoded) and b6 (read from the DN), and likewise for b7'/b7 and b8'/b8. > After selecting 6 blocks of the block group per combination and iterating through all the cases, I found one case where the length of the LCS is the block length - 64KB; 64KB is exactly the length of the ByteBuffer used by StripedBlockReader. So the corrupt reconstructed block was produced by a dirty buffer. > The following log snippet (only 2 of the 28 cases are shown) is my check program's output. In my case, I knew the 3rd block was corrupt, so 5 other blocks were needed to decode another 3 blocks; I then found that the 1st block's LCS is the block length - 64KB. > It means the (0,1,2,4,5,6)th blocks were used to reconstruct the 3rd block, and the dirty buffer was used before reading the 1st block. > It must be noted that StripedBlockReader reads from offset 0 of the 1st block after the dirty buffer was used. > EDITED for readability. > {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes the reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and the DN log, I found the following DN log, which is the > root cause. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} >
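The hazard described above is generic to pooled buffers: a ByteBuffer that a timed-out read only partially filled still carries the previous payload, and the next borrower can ship those stale bytes into a reconstructed block. Here is a minimal, self-contained sketch of the hazard and one defense (scrubbing on release) — an illustration of the pattern, not the HDFS-15240 patch itself.
{code:java}
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Demonstrates the "dirty buffer" hazard with a toy pool: a buffer that a
// timed-out read half-filled is zeroed before reuse, so the next borrower
// cannot see stale bytes. Sketch only, not the actual HDFS fix.
public class DirtyBufferSketch {
  private final Deque<ByteBuffer> pool = new ArrayDeque<>();

  public ByteBuffer borrow(int capacity) {
    ByteBuffer b = pool.pollFirst();
    return (b != null && b.capacity() >= capacity)
        ? b : ByteBuffer.allocate(capacity);
  }

  public void release(ByteBuffer b) {
    b.clear();                 // reset position/limit
    while (b.hasRemaining()) {
      b.put((byte) 0);         // scrub the stale contents
    }
    b.clear();
    pool.addFirst(b);
  }

  public static void main(String[] args) {
    DirtyBufferSketch pool = new DirtyBufferSketch();
    ByteBuffer b = pool.borrow(8);
    b.put(new byte[] {1, 2, 3, 4});   // the read "timed out" half-way through
    pool.release(b);                  // scrubbed before going back to the pool
    ByteBuffer again = pool.borrow(8);
    System.out.println(again.get(0)); // prints 0, not 1: no stale data leaks
  }
}
{code}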
[jira] [Commented] (HDFS-15703) Don't generate edits for set operations that are no-op
[ https://issues.apache.org/jira/browse/HDFS-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242738#comment-17242738 ] Jim Brennan commented on HDFS-15703: Thanks for putting this up [~ahussein]! We have been running with this in production for a few years now. Code and test look good to me. +1 > Don't generate edits for set operations that are no-op > -- > > Key: HDFS-15703 > URL: https://issues.apache.org/jira/browse/HDFS-15703 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > [~daryn] reported that setting the owner, group, or permissions to what it > already is will generate an unnecessary edit. It should not do this to avoid > performance issues when users unnecessarily run routine jobs to change the > group and permissions of a project tree. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15695) NN should not let the balancer run in safemode
[ https://issues.apache.org/jira/browse/HDFS-15695?focusedWorklogId=519214=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-519214 ] ASF GitHub Bot logged work on HDFS-15695: - Author: ASF GitHub Bot Created on: 02/Dec/20 19:59 Start Date: 02/Dec/20 19:59 Worklog Time Spent: 10m Work Description: jbrennan333 merged pull request #2489: URL: https://github.com/apache/hadoop/pull/2489 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 519214) Time Spent: 50m (was: 40m) > NN should not let the balancer run in safemode > -- > > Key: HDFS-15695 > URL: https://issues.apache.org/jira/browse/HDFS-15695 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > [~daryn] reported that when the balancer moves a block, the target DN block-reports the new location with a hint to invalidate the replica on the source DN. The NN will not issue invalidations in safemode, so every moved block appears to be in excess. The data structures bloat and greatly increase the chance of a full GC. > The NN should refuse to provide block locations to the balancer while in safemode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242648#comment-17242648 ] Hadoop QA commented on HDFS-15240: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 18s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 37s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 25s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m 2s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 41s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 53s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 22m 41s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 0s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 3m 18s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 27s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 3s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 20m 50s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 20m 50s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 51s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 17m 51s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 41s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 50s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 54s{color} | {color:green}{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | |
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242388#comment-17242388 ] HuangTao commented on HDFS-15240: - Thanks [~tasanuma] for your careful review. I addressed all your comments in [^HDFS-15240.013.patch]. > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, > HDFS-15240.009.patch, HDFS-15240.010.patch, HDFS-15240.011.patch, > HDFS-15240.012.patch, HDFS-15240.013.patch, > image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When reading some lzo files, we found that some blocks were broken. > I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from the DN directly, and chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'). > Then I found the longest common sequence (LCS) between b6' (decoded) and b6 (read from the DN), and likewise for b7'/b7 and b8'/b8. > After selecting 6 blocks of the block group per combination and iterating through all the cases, I found one case where the length of the LCS is the block length - 64KB; 64KB is exactly the length of the ByteBuffer used by StripedBlockReader. So the corrupt reconstructed block was produced by a dirty buffer. > The following log snippet (only 2 of the 28 cases are shown) is my check program's output. In my case, I knew the 3rd block was corrupt, so 5 other blocks were needed to decode another 3 blocks; I then found that the 1st block's LCS is the block length - 64KB. > It means the (0,1,2,4,5,6)th blocks were used to reconstruct the 3rd block, and the dirty buffer was used before reading the 1st block. > It must be noted that StripedBlockReader reads from offset 0 of the 1st block after the dirty buffer was used. > EDITED for readability. > {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes the reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and the DN log, I found the following DN log, which is the > root cause. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Attachment: HDFS-15240.013.patch > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, > HDFS-15240.009.patch, HDFS-15240.010.patch, HDFS-15240.011.patch, > HDFS-15240.012.patch, HDFS-15240.013.patch, > image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When reading some lzo files, we found that some blocks were broken. > I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from the DN directly, and chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'). > Then I found the longest common sequence (LCS) between b6' (decoded) and b6 (read from the DN), and likewise for b7'/b7 and b8'/b8. > After selecting 6 blocks of the block group per combination and iterating through all the cases, I found one case where the length of the LCS is the block length - 64KB; 64KB is exactly the length of the ByteBuffer used by StripedBlockReader. So the corrupt reconstructed block was produced by a dirty buffer. > The following log snippet (only 2 of the 28 cases are shown) is my check program's output. In my case, I knew the 3rd block was corrupt, so 5 other blocks were needed to decode another 3 blocks; I then found that the 1st block's LCS is the block length - 64KB. > It means the (0,1,2,4,5,6)th blocks were used to reconstruct the 3rd block, and the dirty buffer was used before reading the 1st block. > It must be noted that StripedBlockReader reads from offset 0 of the 1st block after the dirty buffer was used. > EDITED for readability. > {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes the reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and the DN log, I found the following DN log, which is the > root cause. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output
[jira] [Resolved] (HDFS-15670) Testcase TestBalancer#testBalancerWithPinnedBlocks always fails
[ https://issues.apache.org/jira/browse/HDFS-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki resolved HDFS-15670. - Resolution: Cannot Reproduce I'm closing this as not reproducible. I guess it is an environmental issue. Feel free to reopen if you have updates. > Testcase TestBalancer#testBalancerWithPinnedBlocks always fails > --- > > Key: HDFS-15670 > URL: https://issues.apache.org/jira/browse/HDFS-15670 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.0.0-beta1 >Reporter: Jianfei Jiang >Priority: Major > Attachments: HADOOP-15108.000.patch > > > When running the test cases without any code changes, testBalancerWithPinnedBlocks in TestBalancer.java never succeeded. I tried Ubuntu 16.04 and Red Hat 7, so the failure does not seem to be tied to a particular Linux environment. I am not sure whether there is a bug in this case or whether I used the wrong environment and settings. Could anyone give some advice? > --- > Test set: org.apache.hadoop.hdfs.server.balancer.TestBalancer > --- > Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 100.389 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.server.balancer.TestBalancer > testBalancerWithPinnedBlocks(org.apache.hadoop.hdfs.server.balancer.TestBalancer) > Time elapsed: 100.134 sec <<< ERROR! > java.lang.Exception: test timed out after 10 milliseconds > at java.lang.Object.wait(Native Method) > at > org.apache.hadoop.hdfs.DataStreamer.waitForAckedSeqno(DataStreamer.java:903) > at > org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:773) > at > org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:870) > at > org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:842) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) > at > org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101) > at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:441) > at > org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancerWithPinnedBlocks(TestBalancer.java:515) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242355#comment-17242355 ] Takanobu Asanuma commented on HDFS-15240: - This bug seems to have existed since Hadoop 3.0.0 was released. We saw a similar issue a long time ago, but we were unable to find its cause. Thank you very much for finding and fixing it, [~marvelrock]. Thanks for your reviews, [~ferhui] and [~umamaheswararao]. The main fix of [^HDFS-15240.012.patch] looks good to me. Some minor comments on the unit tests: * testTimeoutReadBlockInReconstruction: Please use the JIRA number in the comment. {code:java} - // before this fix, NPE will cause reconstruction fail(test timeout) + // before HDFS-15240, NPE will cause reconstruction fail(test timeout) {code} * assertBufferPoolIsEmpty: This line could be removed? {code:java} - byteBuffer = null; {code} * emptyBufferPool: Calling {{getBuffer}} may just be enough? {code:java} - ByteBuffer byteBuffer = bufferPool.getBuffer(direct, 0); - byteBuffer = null; + bufferPool.getBuffer(direct, 0); {code} > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, > HDFS-15240.009.patch, HDFS-15240.010.patch, HDFS-15240.011.patch, > HDFS-15240.012.patch, image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When reading some lzo files, we found that some blocks were broken. > I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from the DN directly, and chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'). > Then I found the longest common sequence (LCS) between b6' (decoded) and b6 (read from the DN), and likewise for b7'/b7 and b8'/b8. > After selecting 6 blocks of the block group per combination and iterating through all the cases, I found one case where the length of the LCS is the block length - 64KB; 64KB is exactly the length of the ByteBuffer used by StripedBlockReader. So the corrupt reconstructed block was produced by a dirty buffer. > The following log snippet (only 2 of the 28 cases are shown) is my check program's output. In my case, I knew the 3rd block was corrupt, so 5 other blocks were needed to decode another 3 blocks; I then found that the 1st block's LCS is the block length - 64KB. > It means the (0,1,2,4,5,6)th blocks were used to reconstruct the 3rd block, and the dirty buffer was used before reading the 1st block. > It must be noted that StripedBlockReader reads from offset 0 of the 1st block after the dirty buffer was used. > EDITED for readability.
> {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes the reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and the DN log, I found the following DN log, which is the > root cause. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. > [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at >
[jira] [Commented] (HDFS-15705) Fix a typo in SecondaryNameNode.java
[ https://issues.apache.org/jira/browse/HDFS-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242287#comment-17242287 ] Hadoop QA commented on HDFS-15705: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 4m 11s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red}{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 8s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 13s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 46s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 18s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 50s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 3m 23s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 20s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 17s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 20s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 20s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 12s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 12s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 25s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 33s{color} | {color:green}{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} | |
[jira] [Commented] (HDFS-13831) Make block increment deletion number configurable
[ https://issues.apache.org/jira/browse/HDFS-13831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242227#comment-17242227 ] GeoffreyStark commented on HDFS-13831: -- Am I right that the purpose of lowering the "dfs.namenode.block.deletion.increment" parameter is to make each pass of the "blockManager.removeBlock(iter.next())" loop finish faster, hold the lock for a shorter time, and let other objects obtain the lock more easily, even though the lock contention frequency becomes higher? > Make block increment deletion number configurable > - > > Key: HDFS-13831 > URL: https://issues.apache.org/jira/browse/HDFS-13831 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.1.0 >Reporter: Yiqun Lin >Assignee: Ryan Wu >Priority: Major > Fix For: 2.10.0, 3.2.0, 3.0.4, 3.1.2 > > Attachments: HDFS-13831.001.patch, HDFS-13831.002.patch, > HDFS-13831.003.patch, HDFS-13831.004.patch, HDFS-13831.branch-3.0.001.patch > > > When the NN deletes a large directory, it holds the write lock for a long time. > To improve this, we remove the blocks in batches, so that other waiters have a chance to get the lock. But right now, the batch size is a hard-coded value. > {code} > static int BLOCK_DELETION_INCREMENT = 1000; > {code} > We can make this value configurable, so that we can control how often other waiters get a chance at the lock. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
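That reading matches the shape of the code: deletion proceeds in batches of dfs.namenode.block.deletion.increment blocks, and the namesystem write lock is dropped between batches. A minimal sketch of that loop shape follows — a plain ReentrantReadWriteLock stands in for the namesystem lock, and the class and method names are illustrative, not NameNode code.
{code:java}
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of batched block deletion: hold the write lock for at most
// `increment` removals, then release it so other waiters can run.
// A smaller increment means shorter critical sections but more lock
// hand-offs. Illustrative stand-in, not the actual NameNode code.
public class IncrementalDeleteSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  public void removeBlocks(List<String> blocks, int increment) {
    Iterator<String> iter = blocks.iterator();
    while (iter.hasNext()) {
      lock.writeLock().lock();
      try {
        for (int i = 0; i < increment && iter.hasNext(); i++) {
          removeBlock(iter.next()); // stand-in for blockManager.removeBlock
        }
      } finally {
        // Yield point: readers and other writers can acquire the lock here.
        lock.writeLock().unlock();
      }
    }
  }

  private void removeBlock(String block) {
    // The real code updates the blocks map and queues replica invalidations.
  }
}
{code}
So each individual removeBlock call is not faster; lowering the increment only shortens how long the lock is held per batch, trading some throughput for responsiveness of the other lock waiters.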
[jira] [Commented] (HDFS-15705) Fix a typo in SecondaryNameNode.java
[ https://issues.apache.org/jira/browse/HDFS-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242191#comment-17242191 ] Sixiang Ma commented on HDFS-15705: --- Do I need to be assigned as 'Assignee' before making a GitHub pull request? Btw, it's my first issue report; how can I assign myself as 'Assignee'? I'd appreciate it if someone could resolve my questions : ) > Fix a typo in SecondaryNameNode.java > > > Key: HDFS-15705 > URL: https://issues.apache.org/jira/browse/HDFS-15705 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.4.0 >Reporter: Sixiang Ma >Priority: Trivial > Attachments: HDFS-15705.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The comment above the 'public void shutdown()' function in SecondaryNameNode.java is incorrect. > It should be 'Shut down this instance of the secondary name.' instead of 'datanode'. > This typo exists in the trunk of Hadoop, so I generated the patch against trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15705) Fix a typo in SecondaryNameNode.java
[ https://issues.apache.org/jira/browse/HDFS-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sixiang Ma updated HDFS-15705: -- Affects Version/s: 3.4.0 > Fix a typo in SecondaryNameNode.java > > > Key: HDFS-15705 > URL: https://issues.apache.org/jira/browse/HDFS-15705 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.4.0 >Reporter: Sixiang Ma >Priority: Trivial > Attachments: HDFS-15705.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The comment above the 'public void shutdown()' function in SecondaryNameNode.java is incorrect. > It should be 'Shut down this instance of the secondary name.' instead of 'datanode'. > This typo exists in the trunk of Hadoop, so I generated the patch against trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15660) StorageTypeProto is not compatible between 3.x and 2.6
[ https://issues.apache.org/jira/browse/HDFS-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242171#comment-17242171 ] huangtianhua commented on HDFS-15660: - [~liuml07] Hi, would you please help to review this? Thanks. > StorageTypeProto is not compatible between 3.x and 2.6 > --- > > Key: HDFS-15660 > URL: https://issues.apache.org/jira/browse/HDFS-15660 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.2.0, 3.1.3 >Reporter: Ryan Wu >Assignee: Ryan Wu >Priority: Major > Attachments: HDFS-15660.002.patch, HDFS-15660.003.patch > > > In our case, when the NN had been upgraded to 3.1.3 while the DN version was still 2.6, we found that when Hive called the getContentSummary method, the client and server were not compatible because Hadoop 3 added the new PROVIDED storage type. > {code:java} > // code placeholder > 20/04/15 14:28:35 INFO retry.RetryInvocationHandler---main: Exception while > invoking getContentSummary of class ClientNamenodeProtocolTranslatorPB over > x/x:8020. Trying to fail over immediately. > java.io.IOException: com.google.protobuf.ServiceException: > com.google.protobuf.UninitializedMessageException: Message missing required > fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:819) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy11.getContentSummary(Unknown Source) > at > org.apache.hadoop.hdfs.DFSClient.getContentSummary(DFSClient.java:3144) > at > org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:706) > at > org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:702) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getContentSummary(DistributedFileSystem.java:713) > at org.apache.hadoop.fs.shell.Count.processPath(Count.java:109) > at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317) > at > org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289) > at > org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271) > at > org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255) > at > org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118) > at org.apache.hadoop.fs.shell.Command.run(Command.java:165) > at org.apache.hadoop.fs.FsShell.run(FsShell.java:315) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at org.apache.hadoop.fs.FsShell.main(FsShell.java:372) > Caused by: com.google.protobuf.ServiceException: > com.google.protobuf.UninitializedMessageException: Message missing required > fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:272) > at 
com.sun.proxy.$Proxy10.getContentSummary(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:816) > ... 23 more > Caused by: com.google.protobuf.UninitializedMessageException: Message missing > required fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:770) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65392) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65331) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:263) > ... 25 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
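The root cause is a proto2 rule: if a parser receives an enum value it does not know, that value is treated as an unknown field, and when the enum backs a *required* field the message is left uninitialized, so build() throws UninitializedMessageException — exactly the error above when a 2.6 client sees PROVIDED. The following plain-Java sketch mimics this behavior without the generated protobuf classes; the wire numbers are illustrative assumptions, not necessarily the real StorageTypeProto tag values.
{code:java}
import java.util.HashMap;
import java.util.Map;

// Mimics proto2 semantics for a *required* enum field: an unknown enum
// value leaves the field unset, and an unset required field makes the
// message unbuildable. Sketch only; the enum numbers are illustrative.
public class RequiredEnumSketch {
  // Storage types a 2.6-era client knows about.
  private static final Map<Integer, String> KNOWN = new HashMap<>();
  static {
    KNOWN.put(1, "DISK");
    KNOWN.put(2, "SSD");
    KNOWN.put(3, "ARCHIVE");
    KNOWN.put(4, "RAM_DISK");
    // A 3.x server may also send PROVIDED, which this client lacks.
  }

  /** Stand-in for parsing a required enum field under proto2 rules. */
  public static String parseRequiredType(int wireValue) {
    String name = KNOWN.get(wireValue);
    if (name == null) {
      // Unknown enum -> field treated as unset -> required field missing.
      throw new IllegalStateException(
          "Message missing required fields: typeQuotaInfo.type");
    }
    return name;
  }

  public static void main(String[] args) {
    System.out.println(parseRequiredType(2)); // SSD: parses fine
    parseRequiredType(5);                     // unknown value: throws,
                                              // like a 2.6 client vs a 3.x NN
  }
}
{code}
This is also why evolving protocols usually declare such fields optional (or give the enum an UNKNOWN default): old readers can then skip values they do not understand instead of failing the whole RPC.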
[jira] [Updated] (HDFS-15705) Fix a typo in SecondaryNameNode.java
[ https://issues.apache.org/jira/browse/HDFS-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HDFS-15705: Status: Patch Available (was: Open) > Fix a typo in SecondaryNameNode.java > > > Key: HDFS-15705 > URL: https://issues.apache.org/jira/browse/HDFS-15705 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: Sixiang Ma >Priority: Minor > Attachments: HDFS-15705.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The comment above the 'public void shutdown()' function in SecondaryNameNode.java is incorrect. > It should be 'Shut down this instance of the secondary name.' instead of 'datanode'. > This typo exists in the trunk of Hadoop, so I generated the patch against trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15705) Fix a typo in SecondaryNameNode.java
[ https://issues.apache.org/jira/browse/HDFS-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HDFS-15705: Priority: Trivial (was: Minor) > Fix a typo in SecondaryNameNode.java > > > Key: HDFS-15705 > URL: https://issues.apache.org/jira/browse/HDFS-15705 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: Sixiang Ma >Priority: Trivial > Attachments: HDFS-15705.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The comment above the 'public void shutdown()' function in SecondaryNameNode.java is incorrect. > It should be 'Shut down this instance of the secondary name.' instead of 'datanode'. > This typo exists in the trunk of Hadoop, so I generated the patch against trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Issue Comment Deleted] (HDFS-15705) Fix a typo in SecondaryNameNode.java
[ https://issues.apache.org/jira/browse/HDFS-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sixiang Ma updated HDFS-15705: -- Comment: was deleted (was: Btw, I just joined Jira one hour ago. Do you know how I can get contributor permission on Jira?) > Fix a typo in SecondaryNameNode.java > > > Key: HDFS-15705 > URL: https://issues.apache.org/jira/browse/HDFS-15705 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: Sixiang Ma >Priority: Minor > Attachments: HDFS-15705.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The comment above the 'public void shutdown()' function in SecondaryNameNode.java is incorrect. > It should be 'Shut down this instance of the secondary name.' instead of 'datanode'. > This typo exists in the trunk of Hadoop, so I generated the patch against trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15705) Fix a typo in SecondaryNameNode.java
[ https://issues.apache.org/jira/browse/HDFS-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242131#comment-17242131 ] Sixiang Ma commented on HDFS-15705: --- Btw, I just joined Jira one hour ago. Do you know how I can get contributor permission on Jira? > Fix a typo in SecondaryNameNode.java > > > Key: HDFS-15705 > URL: https://issues.apache.org/jira/browse/HDFS-15705 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: Sixiang Ma >Priority: Minor > Attachments: HDFS-15705.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The comment above the 'public void shutdown()' function in SecondaryNameNode.java is incorrect. > It should be 'Shut down this instance of the secondary name.' instead of 'datanode'. > This typo exists in the trunk of Hadoop, so I generated the patch against trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org