[jira] [Commented] (HDFS-11580) Ozone: Support asynchronous client API for SCM and containers
[ https://issues.apache.org/jira/browse/HDFS-11580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984221#comment-15984221 ]

Mukul Kumar Singh commented on HDFS-11580:
------------------------------------------

Thanks [~linyiqun] for the updated patch. In the latest patch, the trace ID is used to match a response to its corresponding request.

{code}
if (request.getTraceID().equals(curResponse.getTraceID())) {
  response = curResponse;
} else {
  pendingResponses.put(curResponse.getTraceID(), curResponse);
  // Try to get the response from the pending-responses map and
  // remove it from the map.
  response = pendingResponses.remove(request.getTraceID());
}
{code}

However, with CBlock there are cases where the trace ID is not set correctly. I feel that the trace ID should not be used to match a response to its corresponding request. Would a counter be a better parameter to use here?

{code}
BlockWriterTask
---
ContainerProtocolCalls.writeSmallFile(client, containerName,
    Long.toString(block.getBlockID()), data, "");
{code}

> Ozone: Support asynchronous client API for SCM and containers
> -------------------------------------------------------------
>
>                 Key: HDFS-11580
>                 URL: https://issues.apache.org/jira/browse/HDFS-11580
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Anu Engineer
>            Assignee: Yiqun Lin
>         Attachments: HDFS-11580-HDFS-7240.001.patch,
>                      HDFS-11580-HDFS-7240.002.patch,
>                      HDFS-11580-HDFS-7240.003.patch,
>                      HDFS-11580-HDFS-7240.004.patch
>
> This is an umbrella JIRA to support a set of APIs in asynchronous form.
> For containers, the datanode API currently supports a {{sendCommand}} call;
> we need to build a proper programming interface and support an async
> interface. There is also a set of SCM APIs that clients can call; it would
> be nice to support an async interface for those too.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
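One way the counter suggestion above could work, sketched here with hypothetical names (this is not the patch's code): the client tags each outgoing request with a monotonically increasing call ID and completes a future from a pending map when the matching response arrives, so correlation never depends on the caller-supplied trace ID.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: correlate asynchronous responses to requests by a
// client-generated counter instead of the user-supplied trace ID.
public class PendingCallTracker<T> {
    private final AtomicLong callIdGenerator = new AtomicLong();
    private final Map<Long, CompletableFuture<T>> pending = new ConcurrentHashMap<>();

    /** Register a new outgoing request; the returned id travels with it. */
    public long register(CompletableFuture<T> future) {
        long id = callIdGenerator.incrementAndGet();
        pending.put(id, future);
        return id;
    }

    /** Complete the future for the response carrying this call id. */
    public void complete(long callId, T response) {
        CompletableFuture<T> future = pending.remove(callId);
        if (future != null) {
            future.complete(response);
        }
    }
}
```

Because the ID is generated inside the client, a misbehaving caller (as in the CBlock case) cannot break the matching.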
[jira] [Commented] (HDFS-5042) Completed files lost after power failure
[ https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984209#comment-15984209 ]

Surendra Singh Lilhore commented on HDFS-5042:
----------------------------------------------

We also faced the same problem. Can we recover this kind of block on the NameNode after it gets the block report? If the reported block's genstamp and size match the NameNode's in-memory metadata, then the NameNode can send a command to the DataNode to recover from the wrong replica state.

> Completed files lost after power failure
> ----------------------------------------
>
>                 Key: HDFS-5042
>                 URL: https://issues.apache.org/jira/browse/HDFS-5042
>             Project: Hadoop HDFS
>          Issue Type: Bug
>         Environment: ext3 on CentOS 5.7 (kernel 2.6.18-274.el5)
>            Reporter: Dave Latham
>            Priority: Critical
>
> We suffered a cluster-wide power failure after which HDFS lost data that it
> had acknowledged as closed and complete.
> The client was HBase, which compacted a set of HFiles into a new HFile and,
> after closing the file successfully, deleted the previous versions of the
> file. The cluster then lost power, and when brought back up the newly
> created file was marked CORRUPT.
> Based on reading the logs, it looks like the replicas were created by the
> DataNodes in the 'blocksBeingWritten' directory. Then, when the file was
> closed, they were moved to the 'current' directory. After the power cycle
> those replicas were again in the blocksBeingWritten directory of the
> underlying file system (ext3). When those DataNodes reported in to the
> NameNode, it deleted those replicas and lost the file.
> Some possible fixes could be having the DataNode fsync the directory(s) after
> moving the block from blocksBeingWritten to current to ensure the rename is
> durable, or having the NameNode accept replicas from blocksBeingWritten under
> certain circumstances.
> Log snippets from RS (RegionServer), NN (NameNode), DN (DataNode): > {noformat} > RS 2013-06-29 11:16:06,812 DEBUG org.apache.hadoop.hbase.util.FSUtils: > Creating > file=hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c > with permission=rwxrwxrwx > NN 2013-06-29 11:16:06,830 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > NameSystem.allocateBlock: > /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c. > blk_1395839728632046111_357084589 > DN 2013-06-29 11:16:06,832 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block > blk_1395839728632046111_357084589 src: /10.0.5.237:14327 dest: > /10.0.5.237:50010 > NN 2013-06-29 11:16:11,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > NameSystem.addStoredBlock: blockMap updated: 10.0.6.1:50010 is added to > blk_1395839728632046111_357084589 size 25418340 > NN 2013-06-29 11:16:11,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > NameSystem.addStoredBlock: blockMap updated: 10.0.6.24:50010 is added to > blk_1395839728632046111_357084589 size 25418340 > NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > NameSystem.addStoredBlock: blockMap updated: 10.0.5.237:50010 is added to > blk_1395839728632046111_357084589 size 25418340 > DN 2013-06-29 11:16:11,385 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: Received block > blk_1395839728632046111_357084589 of size 25418340 from /10.0.5.237:14327 > DN 2013-06-29 11:16:11,385 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 2 for block > blk_1395839728632046111_357084589 terminating > NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: Removing > lease on file > /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c > from client DFSClient_hb_rs_hs745,60020,1372470111932 > NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: DIR* > 
NameSystem.completeFile: file > /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c > is closed by DFSClient_hb_rs_hs745,60020,1372470111932 > RS 2013-06-29 11:16:11,393 INFO org.apache.hadoop.hbase.regionserver.Store: > Renaming compacted file at > hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c > to > hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/n/6e0cc30af6e64e56ba5a539fdf159c4c > RS 2013-06-29 11:16:11,505 INFO org.apache.hadoop.hbase.regionserver.Store: > Completed major compaction of 7 file(s) in n of > users-6,\x12\xBDp\xA3,1359426311784.b5b0820cde759ae68e333b2f4015bb7e. into > 6e0cc30af6e64e56ba5a539fdf159c4c, size=24.2m; total size for store is 24.2m > --- CRASH, RESTART - > NN 2013-06-29 12:01:19,743 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > NameSystem.addStoredBlock: addStoredBlock request received for >
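The first fix suggested in the issue description, making the blocksBeingWritten-to-current rename durable, generally means fsyncing the destination directory after the rename, since on ext3 the directory entry is not otherwise guaranteed to reach disk before a power failure. A minimal, hypothetical sketch of that pattern (not the actual DataNode code; directory fsync via FileChannel is Linux-specific behavior):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: move a finalized block file, then fsync the
// target directory so the rename itself survives a power failure.
public final class DurableRename {
    public static void moveDurably(Path src, Path dst) throws IOException {
        // Atomic within one file system, as blocksBeingWritten -> current is.
        Files.move(src, dst, StandardCopyOption.ATOMIC_MOVE);
        // On Linux, opening the directory and calling force() flushes the
        // new directory entry to disk; elsewhere this may throw or be a no-op.
        try (FileChannel dir = FileChannel.open(dst.getParent(), StandardOpenOption.READ)) {
            dir.force(true);
        }
    }
}
```

Without the directory fsync, only the file data is durable; the rename lives in the directory inode, which is exactly what was rolled back here after the power cycle.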
[jira] [Commented] (HDFS-9962) Erasure Coding: need a way to test multiple EC policies
[ https://issues.apache.org/jira/browse/HDFS-9962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984187#comment-15984187 ]

Takanobu Asanuma commented on HDFS-9962:
----------------------------------------

Hi [~Sammi], yes, I still plan to work on this. Sorry for the delay.

> Erasure Coding: need a way to test multiple EC policies
> --------------------------------------------------------
>
>                 Key: HDFS-9962
>                 URL: https://issues.apache.org/jira/browse/HDFS-9962
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Rui Li
>            Assignee: Takanobu Asanuma
>              Labels: hdfs-ec-3.0-nice-to-have
>
> Now that we support multiple EC policies, we need a way to test them to catch
> potential issues.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-11704) OzoneFileSystem: A Hadoop file system implementation for Ozone
Mingliang Liu created HDFS-11704: Summary: OzoneFileSystem: A Hadoop file system implementation for Ozone Key: HDFS-11704 URL: https://issues.apache.org/jira/browse/HDFS-11704 Project: Hadoop HDFS Issue Type: Sub-task Components: fs/ozone Reporter: Mingliang Liu Assignee: Mingliang Liu -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11703) [READ] Tests for ProvidedStorageMap
[ https://issues.apache.org/jira/browse/HDFS-11703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11703: -- Status: Patch Available (was: Open) > [READ] Tests for ProvidedStorageMap > --- > > Key: HDFS-11703 > URL: https://issues.apache.org/jira/browse/HDFS-11703 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11703-HDFS-9806.001.patch, > HDFS-11703-HDFS-9806.002.patch > > > Add tests for the {{ProvidedStorageMap}} in the namenode -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11703) [READ] Tests for ProvidedStorageMap
[ https://issues.apache.org/jira/browse/HDFS-11703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11703: -- Attachment: HDFS-11703-HDFS-9806.002.patch Posting an updated patch with checkstyle errors fixed. > [READ] Tests for ProvidedStorageMap > --- > > Key: HDFS-11703 > URL: https://issues.apache.org/jira/browse/HDFS-11703 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11703-HDFS-9806.001.patch, > HDFS-11703-HDFS-9806.002.patch > > > Add tests for the {{ProvidedStorageMap}} in the namenode -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11703) [READ] Tests for ProvidedStorageMap
[ https://issues.apache.org/jira/browse/HDFS-11703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11703: -- Status: Open (was: Patch Available) > [READ] Tests for ProvidedStorageMap > --- > > Key: HDFS-11703 > URL: https://issues.apache.org/jira/browse/HDFS-11703 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11703-HDFS-9806.001.patch > > > Add tests for the {{ProvidedStorageMap}} in the namenode -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11384) Add option for balancer to disperse getBlocks calls to avoid NameNode's rpc.CallQueueLength spike
[ https://issues.apache.org/jira/browse/HDFS-11384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984129#comment-15984129 ] Hadoop QA commented on HDFS-11384: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 53s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 36s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 4 new + 280 unchanged - 2 fixed = 284 total (was 282) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m 15s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 98m 5s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | | | hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ac17dc | | JIRA Issue | HDFS-11384 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12865051/HDFS-11384.009.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux beb7a0d3cf1c 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 4ea2778 | | Default Java | 1.8.0_121 | | findbugs | v3.1.0-RC1 | | findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/19201/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/19201/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/19201/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/19201/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console
[jira] [Updated] (HDFS-10675) [READ] Datanode support to read from external stores.
[ https://issues.apache.org/jira/browse/HDFS-10675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-10675: -- Summary: [READ] Datanode support to read from external stores. (was: [HDFS-9806][READ] Datanode support to read from external stores.) > [READ] Datanode support to read from external stores. > - > > Key: HDFS-10675 > URL: https://issues.apache.org/jira/browse/HDFS-10675 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-10675-HDFS-9806.001.patch, > HDFS-10675-HDFS-9806.002.patch, HDFS-10675-HDFS-9806.003.patch, > HDFS-10675-HDFS-9806.004.patch, HDFS-10675-HDFS-9806.005.patch, > HDFS-10675-HDFS-9806.006.patch, HDFS-10675-HDFS-9806.007.patch, > HDFS-10675-HDFS-9806.008.patch, HDFS-10675-HDFS-9806.009.patch > > > This JIRA introduces a new {{PROVIDED}} {{StorageType}} to represent external > stores, along with enabling the Datanode to read from such stores using a > {{ProvidedReplica}} and a {{ProvidedVolume}}. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10675) [HDFS-9806][READ] Datanode support to read from external stores.
[ https://issues.apache.org/jira/browse/HDFS-10675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-10675: -- Summary: [HDFS-9806][READ] Datanode support to read from external stores. (was: [READ] Datanode support to read from external stores.) > [HDFS-9806][READ] Datanode support to read from external stores. > > > Key: HDFS-10675 > URL: https://issues.apache.org/jira/browse/HDFS-10675 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-10675-HDFS-9806.001.patch, > HDFS-10675-HDFS-9806.002.patch, HDFS-10675-HDFS-9806.003.patch, > HDFS-10675-HDFS-9806.004.patch, HDFS-10675-HDFS-9806.005.patch, > HDFS-10675-HDFS-9806.006.patch, HDFS-10675-HDFS-9806.007.patch, > HDFS-10675-HDFS-9806.008.patch, HDFS-10675-HDFS-9806.009.patch > > > This JIRA introduces a new {{PROVIDED}} {{StorageType}} to represent external > stores, along with enabling the Datanode to read from such stores using a > {{ProvidedReplica}} and a {{ProvidedVolume}}. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11605) Allow user to customize and add new erasure code codecs and policies
[ https://issues.apache.org/jira/browse/HDFS-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984106#comment-15984106 ]

Huafeng Wang commented on HDFS-11605:
-------------------------------------

Hi Kai, there is one checkstyle issue left. It's the method-length check in FSNameSystem, but it's not introduced by my patch. Maybe we can just wait for the QA result.

> Allow user to customize and add new erasure code codecs and policies
> ---------------------------------------------------------------------
>
>                 Key: HDFS-11605
>                 URL: https://issues.apache.org/jira/browse/HDFS-11605
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>            Reporter: Kai Zheng
>            Assignee: Huafeng Wang
>             Fix For: 3.0.0-alpha3
>
>         Attachments: HDFS-11605.001.patch, HDFS-11605.002.patch,
>                      HDFS-11605.003.patch, HDFS-11605.004.patch
>
> Based on the facility developed in HDFS-11604, this will develop the necessary
> CLI command to load an XML file; the results will be maintained in the NameNode
> {{ErasureCodingPolicyManager}} as {{USER_POLICIES}}, in line with
> {{SYS_POLICIES}}.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11605) Allow user to customize and add new erasure code codecs and policies
[ https://issues.apache.org/jira/browse/HDFS-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984101#comment-15984101 ]

Kai Zheng commented on HDFS-11605:
----------------------------------

Thanks [~HuafengWang] for the update following our offline review discussion. Did you also fix the checkstyle issues?

> Allow user to customize and add new erasure code codecs and policies
> ---------------------------------------------------------------------
>
>                 Key: HDFS-11605
>                 URL: https://issues.apache.org/jira/browse/HDFS-11605
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>            Reporter: Kai Zheng
>            Assignee: Huafeng Wang
>             Fix For: 3.0.0-alpha3
>
>         Attachments: HDFS-11605.001.patch, HDFS-11605.002.patch,
>                      HDFS-11605.003.patch, HDFS-11605.004.patch
>
> Based on the facility developed in HDFS-11604, this will develop the necessary
> CLI command to load an XML file; the results will be maintained in the NameNode
> {{ErasureCodingPolicyManager}} as {{USER_POLICIES}}, in line with
> {{SYS_POLICIES}}.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11605) Allow user to customize and add new erasure code codecs and policies
[ https://issues.apache.org/jira/browse/HDFS-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984088#comment-15984088 ]

Huafeng Wang commented on HDFS-11605:
-------------------------------------

My latest patch covers the following parts:
1. Add a new RPC call which adds user-defined EC policies, and returns the results of each add operation.
2. Correspondingly, add a new command in ECAdmin.
3. Change ErasureCodingPolicyManager to a singleton-like class.

> Allow user to customize and add new erasure code codecs and policies
> ---------------------------------------------------------------------
>
>                 Key: HDFS-11605
>                 URL: https://issues.apache.org/jira/browse/HDFS-11605
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>            Reporter: Kai Zheng
>            Assignee: Huafeng Wang
>             Fix For: 3.0.0-alpha3
>
>         Attachments: HDFS-11605.001.patch, HDFS-11605.002.patch,
>                      HDFS-11605.003.patch, HDFS-11605.004.patch
>
> Based on the facility developed in HDFS-11604, this will develop the necessary
> CLI command to load an XML file; the results will be maintained in the NameNode
> {{ErasureCodingPolicyManager}} as {{USER_POLICIES}}, in line with
> {{SYS_POLICIES}}.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
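The singleton-like manager with per-policy add results described in this comment might look roughly like the following. This is a hypothetical sketch with invented names (PolicyManagerSketch, addPolicies), not the actual patch; the real ErasureCodingPolicyManager holds richer policy objects than the name/schema strings used here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a singleton-like policy manager that keeps
// user-added policies alongside the built-in system ones.
public final class PolicyManagerSketch {
    private static final PolicyManagerSketch INSTANCE = new PolicyManagerSketch();
    private final Map<String, String> userPolicies = new ConcurrentHashMap<>();

    private PolicyManagerSketch() {}

    public static PolicyManagerSketch getInstance() {
        return INSTANCE;
    }

    /**
     * Add user-defined policies; returns one success flag per policy,
     * mirroring the "results of each add operation" in the new RPC.
     */
    public List<Boolean> addPolicies(Map<String, String> policies) {
        List<Boolean> results = new ArrayList<>();
        policies.forEach((name, schema) ->
            // putIfAbsent returns null only when the name was free.
            results.add(userPolicies.putIfAbsent(name, schema) == null));
        return results;
    }
}
```

A duplicate add reports failure for that policy without aborting the whole batch, which matches the per-operation results the RPC returns.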
[jira] [Updated] (HDFS-11605) Allow user to customize and add new erasure code codecs and policies
[ https://issues.apache.org/jira/browse/HDFS-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huafeng Wang updated HDFS-11605: Attachment: HDFS-11605.004.patch > Allow user to customize and add new erasure code codecs and policies > > > Key: HDFS-11605 > URL: https://issues.apache.org/jira/browse/HDFS-11605 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Reporter: Kai Zheng >Assignee: Huafeng Wang > Fix For: 3.0.0-alpha3 > > Attachments: HDFS-11605.001.patch, HDFS-11605.002.patch, > HDFS-11605.003.patch, HDFS-11605.004.patch > > > Based on the facility developed in HDFS-11604, this will develop necessary > CLI cmd to load an XML file and the results will be maintained in NameNode > {{ErasureCodingPolicyManager}} as {{USER_POLICIES}} in line with > {{SYS_POLICIES}}. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11580) Ozone: Support asynchronous client API for SCM and containers
[ https://issues.apache.org/jira/browse/HDFS-11580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984042#comment-15984042 ] Hadoop QA commented on HDFS-11580: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 34s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 53s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 48s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 48s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 31s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 0s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 36s{color} | {color:green} HDFS-7240 passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall 
{color} | {color:green} 1m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 45s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 42s{color} | {color:orange} hadoop-hdfs-project: The patch generated 7 new + 0 unchanged - 1 fixed = 7 total (was 1) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 8s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 22s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 19s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}121m 56s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs-project/hadoop-hdfs-client | | | Possible null pointer dereference of response in org.apache.hadoop.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtos$ContainerCommandResponseProto) Dereferenced at ContainerProtocolCalls.java:response in org.apache.hadoop.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtos$ContainerCommandResponseProto) Dereferenced at ContainerProtocolCalls.java:[line 650] | | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | | | hadoop.hdfs.TestDFSUpgradeFromImage | | | hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery | | | hadoop.hdfs.TestEncryptionZones | | | hadoop.hdfs.server.namenode.TestStartup | | Timed out junit tests |
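The FindBugs result above flags a possible null dereference of {{response}} in {{validateContainerResponse}}. The usual fix for this class of warning, shown here as a generic sketch with hypothetical names rather than the actual ContainerProtocolCalls code, is to fail fast with a descriptive exception before the first dereference:

```java
import java.io.IOException;

// Hypothetical sketch of the standard guard for a FindBugs
// "possible null pointer dereference" warning.
public final class ResponseGuard {
    public static String resultOf(Object response) throws IOException {
        if (response == null) {
            // Fail fast with a clear error instead of an NPE later.
            throw new IOException("null response received from container");
        }
        return response.toString();
    }
}
```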
[jira] [Commented] (HDFS-11627) Block Storage: Cblock cache should register with flusher to upload blocks to containers
[ https://issues.apache.org/jira/browse/HDFS-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984021#comment-15984021 ] Hadoop QA commented on HDFS-11627: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 58s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 1s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 
54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 47s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}109m 48s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | | | hadoop.hdfs.TestDFSUpgradeFromImage | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:612578f | | JIRA Issue | HDFS-11627 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12864888/HDFS-11627-HDFS-7240.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux fc112f0abbbf 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-7240 / eae8c2a | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/19196/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/19196/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/19196/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Block Storage: Cblock cache should register with flusher to upload blocks to > containers > --- > > Key: HDFS-11627 > URL: https://issues.apache.org/jira/browse/HDFS-11627 > Project:
[jira] [Commented] (HDFS-11702) Remove indefinite caching of key provider uri in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-11702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984013#comment-15984013 ] Hadoop QA commented on HDFS-11702: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 32s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 26s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 43s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 6s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 12s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 7s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 98m 23s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts | | Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ac17dc | | JIRA Issue | HDFS-11702 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12865017/HDFS-11702.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 1155131d187c 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 475f933 | | Default Java | 1.8.0_121 | | findbugs | v3.1.0-RC1 | | findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/19200/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html | | findbugs |
[jira] [Updated] (HDFS-11384) Add option for balancer to disperse getBlocks calls to avoid NameNode's rpc.CallQueueLength spike
[ https://issues.apache.org/jira/browse/HDFS-11384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-11384: --- Attachment: HDFS-11384.009.patch Added waitActive() after starting DataNodes. > Add option for balancer to disperse getBlocks calls to avoid NameNode's > rpc.CallQueueLength spike > - > > Key: HDFS-11384 > URL: https://issues.apache.org/jira/browse/HDFS-11384 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.7.3 >Reporter: yunjiong zhao >Assignee: Konstantin Shvachko > Attachments: balancer.day.png, balancer.week.png, > HDFS-11384.001.patch, HDFS-11384.002.patch, HDFS-11384.003.patch, > HDFS-11384.004.patch, HDFS-11384.005.patch, HDFS-11384.006.patch, > HDFS-11384-007.patch, HDFS-11384.008.patch, HDFS-11384.009.patch > > > Running the balancer on a Hadoop cluster with more than 3000 DataNodes > causes the NameNode's rpc.CallQueueLength to spike. We observed that this > situation could cause HBase cluster failures due to RegionServer WAL timeouts.
[jira] [Commented] (HDFS-11703) [READ] Tests for ProvidedStorageMap
[ https://issues.apache.org/jira/browse/HDFS-11703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984006#comment-15984006 ] Hadoop QA commented on HDFS-11703: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 48s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 11s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s{color} | {color:green} HDFS-9806 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 35s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in HDFS-9806 has 10 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 30s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 11 new + 0 unchanged - 0 fixed = 11 total (was 0) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 53s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 20s{color} | {color:red} The patch generated 1 ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 92m 17s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ac17dc | | JIRA Issue | HDFS-11703 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12865020/HDFS-11703-HDFS-9806.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 460af8747539 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-9806 / 76a72ae | | Default Java | 1.8.0_121 | | findbugs | v3.1.0-RC1 | | findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/19198/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/19198/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/19198/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/19198/testReport/ | | asflicense | https://builds.apache.org/job/PreCommit-HDFS-Build/19198/artifact/patchprocess/patch-asflicense-problems.txt | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output |
[jira] [Commented] (HDFS-10631) Federation State Store ZooKeeper implementation
[ https://issues.apache.org/jira/browse/HDFS-10631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984000#comment-15984000 ] Hadoop QA commented on HDFS-10631: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 27s{color} | {color:green} HDFS-10467 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} HDFS-10467 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s{color} | {color:green} HDFS-10467 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} HDFS-10467 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s{color} | {color:green} HDFS-10467 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 50s{color} | {color:green} HDFS-10467 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} HDFS-10467 passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 
0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 31s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 55s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 49s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 92m 13s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs-project/hadoop-hdfs | | | Redundant nullcheck of recordsToRemove, which is known to be non-null in org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.remove(Class, Query) Redundant null check at StateStoreZooKeeperImpl.java:is known to be non-null in org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.remove(Class, Query) Redundant null check at StateStoreZooKeeperImpl.java:[line 315] | | | Redundant nullcheck of znode, which is known to be non-null in org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.removeAll(Class) Redundant null check at StateStoreZooKeeperImpl.java:is known to be non-null in org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.removeAll(Class) Redundant null check at StateStoreZooKeeperImpl.java:[line 344] | | | Call to java.util.Map.equals(String) in org.apache.hadoop.hdfs.server.federation.store.records.BaseRecord.like(BaseRecord) At BaseRecord.java: At BaseRecord.java:[line 145] | | Failed junit tests | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock | | | hadoop.hdfs.server.balancer.TestBalancer | \\ \\ || Subsystem || Report/Notes || | Docker |
[jira] [Updated] (HDFS-11627) Block Storage: Cblock cache should register with flusher to upload blocks to containers
[ https://issues.apache.org/jira/browse/HDFS-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDFS-11627: - Status: Patch Available (was: Open) > Block Storage: Cblock cache should register with flusher to upload blocks to > containers > --- > > Key: HDFS-11627 > URL: https://issues.apache.org/jira/browse/HDFS-11627 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh > Attachments: HDFS-11627-HDFS-7240.001.patch, > HDFS-11627-HDFS-7240.002.patch > > > Cblock cache should register with flusher to upload blocks to containers. > Currently the Container Cache flusher tries to write to the container even when > the CblockLocalCache pipelines are not registered with the flusher. > This causes the container writes to fail. > CblockLocalCache should register with the flusher before accepting any blocks > for write.
[jira] [Updated] (HDFS-11627) Block Storage: Cblock cache should register with flusher to upload blocks to containers
[ https://issues.apache.org/jira/browse/HDFS-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDFS-11627: - Status: Open (was: Patch Available) > Block Storage: Cblock cache should register with flusher to upload blocks to > containers > --- > > Key: HDFS-11627 > URL: https://issues.apache.org/jira/browse/HDFS-11627 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh > Attachments: HDFS-11627-HDFS-7240.001.patch, > HDFS-11627-HDFS-7240.002.patch > > > Cblock cache should register with flusher to upload blocks to containers. > Currently the Container Cache flusher tries to write to the container even when > the CblockLocalCache pipelines are not registered with the flusher. > This causes the container writes to fail. > CblockLocalCache should register with the flusher before accepting any blocks > for write.
[jira] [Commented] (HDFS-11691) Add a proper scheme to the datanode links in NN web UI
[ https://issues.apache.org/jira/browse/HDFS-11691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983855#comment-15983855 ] Hudson commented on HDFS-11691: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11628 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11628/]) HDFS-11691. Add a proper scheme to the datanode links in NN web UI. (jlowe: rev e4321ec84321672a714419278946fe1012daac71) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.js > Add a proper scheme to the datanode links in NN web UI > -- > > Key: HDFS-11691 > URL: https://issues.apache.org/jira/browse/HDFS-11691 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Fix For: 2.9.0, 3.0.0-alpha3, 2.8.1 > > Attachments: HDFS-11691.patch > > > On the datanodes page of the namenode web UI, the datanode links may not be > correct if the namenode is serving the page through http but https is also > enabled. This is because {{dfshealth.js}} does not put a proper scheme in > front of the address. It already determines whether the address is > non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what > it is currently setting. > The existing mechanism would work for YARN and MAPRED, since they can only > serve one protocol, HTTP or HTTPS. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
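The fix described above amounts to reusing the secure/non-secure determination that dfshealth.js already makes and prepending an explicit scheme. A rough sketch of that logic, written in Java for illustration only (the actual patch is in JavaScript, and the class and method names here are hypothetical):

```java
// Hypothetical Java rendering of the dfshealth.js change: pick the scheme
// from the already-known secure flag and prepend it, so the datanode link
// is correct no matter which protocol served the NN web UI page itself.
class DatanodeLink {
    static String withScheme(String infoAddr, boolean secure) {
        // The page code already knows whether infoAddr is the secure
        // address; the fix is simply to make the scheme explicit.
        return (secure ? "https:" : "http:") + "//" + infoAddr;
    }
}
```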
[jira] [Updated] (HDFS-11703) [READ] Tests for ProvidedStorageMap
[ https://issues.apache.org/jira/browse/HDFS-11703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11703: -- Status: Patch Available (was: Open) > [READ] Tests for ProvidedStorageMap > --- > > Key: HDFS-11703 > URL: https://issues.apache.org/jira/browse/HDFS-11703 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11703-HDFS-9806.001.patch > > > Add tests for the {{ProvidedStorageMap}} in the namenode
[jira] [Updated] (HDFS-11703) [READ] Tests for ProvidedStorageMap
[ https://issues.apache.org/jira/browse/HDFS-11703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11703: -- Attachment: HDFS-11703-HDFS-9806.001.patch > [READ] Tests for ProvidedStorageMap > --- > > Key: HDFS-11703 > URL: https://issues.apache.org/jira/browse/HDFS-11703 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11703-HDFS-9806.001.patch > > > Add tests for the {{ProvidedStorageMap}} in the namenode
[jira] [Created] (HDFS-11703) [READ] Tests for ProvidedStorageMap
Virajith Jalaparti created HDFS-11703: - Summary: [READ] Tests for ProvidedStorageMap Key: HDFS-11703 URL: https://issues.apache.org/jira/browse/HDFS-11703 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Virajith Jalaparti Add tests for the {{ProvidedStorageMap}} in the namenode
[jira] [Updated] (HDFS-11691) Add a proper scheme to the datanode links in NN web UI
[ https://issues.apache.org/jira/browse/HDFS-11691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated HDFS-11691: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.1 3.0.0-alpha3 2.9.0 Status: Resolved (was: Patch Available) Thanks to [~kihwal] for the contribution and to [~cheersyang] for additional review! I committed this to trunk, branch-2, branch-2.8, and branch-2.8.1. > Add a proper scheme to the datanode links in NN web UI > -- > > Key: HDFS-11691 > URL: https://issues.apache.org/jira/browse/HDFS-11691 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Fix For: 2.9.0, 3.0.0-alpha3, 2.8.1 > > Attachments: HDFS-11691.patch > > > On the datanodes page of the namenode web UI, the datanode links may not be > correct if the namenode is serving the page through http but https is also > enabled. This is because {{dfshealth.js}} does not put a proper scheme in > front of the address. It already determines whether the address is > non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what > it is currently setting. > The existing mechanism would work for YARN and MAPRED, since they can only > serve one protocol, HTTP or HTTPS. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11701) NPE from Unresolved Host causes permanent DFSInputStream failures
[ https://issues.apache.org/jira/browse/HDFS-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Moore updated HDFS-11701: --- Affects Version/s: (was: 2.7.0) 2.6.0 > NPE from Unresolved Host causes permanent DFSInputStream failures > - > > Key: HDFS-11701 > URL: https://issues.apache.org/jira/browse/HDFS-11701 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.6.0 > Environment: AWS Centos linux running HBase CDH 5.9.0 and HDFS CDH > 5.9.0 >Reporter: James Moore > > We recently encountered the following NPE due to the DFSInputStream storing > old cached block locations from hosts which could no longer resolve. > {quote} > Caused by: java.lang.NullPointerException > at org.apache.hadoop.hdfs.DFSClient.isLocalAddress(DFSClient.java:1122) > at > org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.getPathInfo(DomainSocketFactory.java:148) > at > org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:474) > at > org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:354) > at > org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662) > at > org.apache.hadoop.hdfs.DFSInputStream.seekToNewSource(DFSInputStream.java:1613) > at > org.apache.hadoop.fs.FSDataInputStream.seekToNewSource(FSDataInputStream.java:127) > ~HBase related stack frames trimmed~ > {quote} > After investigating, the DFSInputStream appears to have been open for upwards > of 3-4 weeks and had cached block locations from decommissioned nodes that no > longer resolve in DNS and had been shutdown and removed from the cluster 2 > weeks prior. If the DFSInputStream had refreshed its block locations from > the name node, it would have received alternative block locations which would > not contain the decommissioned data nodes. 
As the above NPE leaves the > non-resolving data node in the list of block locations, the DFSInputStream > never refreshes the block locations, and all attempts to open a BlockReader > for the given blocks will fail. > In our case, we resolved the NPE by closing and re-opening every > DFSInputStream in the cluster to force a purge of the block locations cache. > Ideally, the DFSInputStream would re-fetch all block locations for a host > which can't be resolved in DNS, or at least the blocks requested.
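The remedy the reporter suggests, invalidating and re-fetching cached locations once a host stops resolving instead of failing forever, can be sketched as follows. This is an illustrative model only, not the actual DFSInputStream code; the class and method names are hypothetical and the DNS check is stubbed out:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Illustrative sketch of the suggested behavior: when a cached block
// location no longer resolves, drop the stale cache and re-fetch from
// the NameNode rather than keeping the dead entry around indefinitely.
class BlockLocationCache {
    private List<String> cached;                 // cached datanode addresses
    private final Supplier<List<String>> fetch;  // stands in for a NameNode RPC

    BlockLocationCache(Supplier<List<String>> fetch) {
        this.fetch = fetch;
    }

    // Resolve-check stub: a real client would attempt a DNS lookup here.
    private boolean resolves(String host) {
        return !host.startsWith("decommissioned");
    }

    List<String> locations() {
        if (cached == null) {
            cached = fetch.get();
        }
        // If any cached host no longer resolves, refresh the whole list
        // once instead of repeatedly failing against the dead host.
        for (String host : cached) {
            if (!resolves(host)) {
                cached = fetch.get();
                break;
            }
        }
        return new ArrayList<>(cached);
    }
}
```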
[jira] [Commented] (HDFS-10480) Add an admin command to list currently open files
[ https://issues.apache.org/jira/browse/HDFS-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983738#comment-15983738 ] Rushabh S Shah commented on HDFS-10480: --- bq. Just curious, what's the difference of this command with hdfs fsck / -openforwrite ? The output will be the same. Just the time to reach that output is vastly different. :) hdfs fsck / will crawl the whole filesystem. > Add an admin command to list currently open files > - > > Key: HDFS-10480 > URL: https://issues.apache.org/jira/browse/HDFS-10480 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Rushabh S Shah > Attachments: HDFS-10480-trunk-1.patch, HDFS-10480-trunk.patch > > > Currently there is no easy way to obtain the list of active leases or files > being written. It would be nice to have an admin command to list open files > and their lease holders.
[jira] [Updated] (HDFS-11702) Remove indefinite caching of key provider uri in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-11702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-11702: -- Status: Patch Available (was: Open) > Remove indefinite caching of key provider uri in DFSClient > -- > > Key: HDFS-11702 > URL: https://issues.apache.org/jira/browse/HDFS-11702 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Attachments: HDFS-11702.patch > > > There is an indefinite caching of key provider uri in dfsclient. > Relevant piece of code. > {code:title=DFSClient.java|borderStyle=solid} > /** >* The key provider uri is searched in the following order. >* 1. If there is a mapping in Credential's secrets map for namenode uri. >* 2. From namenode getServerDefaults rpc. >* 3. Finally fallback to local conf. >* @return keyProviderUri if found from either of above 3 cases, >* null otherwise >* @throws IOException >*/ > URI getKeyProviderUri() throws IOException { > if (keyProviderUri != null) { > return keyProviderUri; > } > // Lookup the secret in credentials object for namenodeuri. > Credentials credentials = ugi.getCredentials(); >... >... > {code} > Once the key provider uri is set, it won't refresh the value even if the key > provider uri on namenode is changed. > For long running clients like on oozie servers, this means we have to bounce > all the oozie servers to get the change reflected. > After this change, the client will cache the value for an hour after which it > will issue getServerDefaults call and will refresh the key provider uri. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11702) Remove indefinite caching of key provider uri in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-11702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-11702: -- Attachment: HDFS-11702.patch Attaching a simple patch. > Remove indefinite caching of key provider uri in DFSClient > -- > > Key: HDFS-11702 > URL: https://issues.apache.org/jira/browse/HDFS-11702 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Attachments: HDFS-11702.patch > > > There is an indefinite caching of key provider uri in dfsclient. > Relevant piece of code. > {code:title=DFSClient.java|borderStyle=solid} > /** >* The key provider uri is searched in the following order. >* 1. If there is a mapping in Credential's secrets map for namenode uri. >* 2. From namenode getServerDefaults rpc. >* 3. Finally fallback to local conf. >* @return keyProviderUri if found from either of above 3 cases, >* null otherwise >* @throws IOException >*/ > URI getKeyProviderUri() throws IOException { > if (keyProviderUri != null) { > return keyProviderUri; > } > // Lookup the secret in credentials object for namenodeuri. > Credentials credentials = ugi.getCredentials(); >... >... > {code} > Once the key provider uri is set, it won't refresh the value even if the key > provider uri on namenode is changed. > For long running clients like on oozie servers, this means we have to bounce > all the oozie servers to get the change reflected. > After this change, the client will cache the value for an hour after which it > will issue getServerDefaults call and will refresh the key provider uri. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11702) Remove indefinite caching of key provider uri in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-11702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-11702: -- Description: There is an indefinite caching of key provider uri in dfsclient. Relevant piece of code. {code:title=DFSClient.java|borderStyle=solid} /** * The key provider uri is searched in the following order. * 1. If there is a mapping in Credential's secrets map for namenode uri. * 2. From namenode getServerDefaults rpc. * 3. Finally fallback to local conf. * @return keyProviderUri if found from either of above 3 cases, * null otherwise * @throws IOException */ URI getKeyProviderUri() throws IOException { if (keyProviderUri != null) { return keyProviderUri; } // Lookup the secret in credentials object for namenodeuri. Credentials credentials = ugi.getCredentials(); ... ... {code} Once the key provider uri is set, it won't refresh the value even if the key provider uri on namenode is changed. For long running clients like on oozie servers, this means we have to bounce all the oozie servers to get the change reflected. After this change, the client will cache the value for an hour after which it will issue getServerDefaults call and will refresh the key provider uri. was: There is an indefinite caching of key provider uri in dfsclient. Relevant piece of code. {code:title=DFSClient.java|borderStyle=solid} /** * The key provider uri is searched in the following order. * 1. If there is a mapping in Credential's secrets map for namenode uri. * 2. From namenode getServerDefaults rpc. * 3. Finally fallback to local conf. * @return keyProviderUri if found from either of above 3 cases, * null otherwise * @throws IOException */ URI getKeyProviderUri() throws IOException { if (keyProviderUri != null) { return keyProviderUri; } {code} Once the key provider uri is set, it won't refresh the value even if the key provider uri on namenode is changed. 
For long running clients like on oozie servers, this means we have to bounce all the oozie servers to get the change reflected. After this change, the client will cache the value for an hour after which it will issue getServerDefaults call and will refresh the key provider uri. > Remove indefinite caching of key provider uri in DFSClient > -- > > Key: HDFS-11702 > URL: https://issues.apache.org/jira/browse/HDFS-11702 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > > There is an indefinite caching of key provider uri in dfsclient. > Relevant piece of code. > {code:title=DFSClient.java|borderStyle=solid} > /** >* The key provider uri is searched in the following order. >* 1. If there is a mapping in Credential's secrets map for namenode uri. >* 2. From namenode getServerDefaults rpc. >* 3. Finally fallback to local conf. >* @return keyProviderUri if found from either of above 3 cases, >* null otherwise >* @throws IOException >*/ > URI getKeyProviderUri() throws IOException { > if (keyProviderUri != null) { > return keyProviderUri; > } > // Lookup the secret in credentials object for namenodeuri. > Credentials credentials = ugi.getCredentials(); >... >... > {code} > Once the key provider uri is set, it won't refresh the value even if the key > provider uri on namenode is changed. > For long running clients like on oozie servers, this means we have to bounce > all the oozie servers to get the change reflected. > After this change, the client will cache the value for an hour after which it > will issue getServerDefaults call and will refresh the key provider uri. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
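The hour-long refresh described above can be illustrated with a small time-bounded cache. This is a hedged sketch of the idea only, not the actual DFSClient change: the {{Supplier}} stands in for the getServerDefaults RPC, and all names are illustrative.

```java
import java.net.URI;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/**
 * Sketch of a time-bounded cache for the key provider URI: after TTL_MS
 * elapses, the next lookup re-fetches the value instead of returning the
 * cached one forever. The fetcher stands in for a getServerDefaults RPC.
 */
class KeyProviderUriCache {
    private static final long TTL_MS = TimeUnit.HOURS.toMillis(1);

    private final Supplier<URI> fetcher;
    private URI cachedUri;
    private long cachedAtMs;

    KeyProviderUriCache(Supplier<URI> fetcher) {
        this.fetcher = fetcher;
    }

    synchronized URI get() {
        long now = System.currentTimeMillis();
        if (cachedUri == null || now - cachedAtMs >= TTL_MS) {
            cachedUri = fetcher.get();   // refresh from the authoritative source
            cachedAtMs = now;
        }
        return cachedUri;
    }
}
```

Within the TTL window the cached value is returned without an RPC; only after the TTL elapses does the next caller pay the refresh cost.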
[jira] [Created] (HDFS-11702) Remove indefinite caching of key provider uri in DFSClient
Rushabh S Shah created HDFS-11702: - Summary: Remove indefinite caching of key provider uri in DFSClient Key: HDFS-11702 URL: https://issues.apache.org/jira/browse/HDFS-11702 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Reporter: Rushabh S Shah Assignee: Rushabh S Shah There is an indefinite caching of key provider uri in dfsclient. Relevant piece of code. {code:title=DFSClient.java|borderStyle=solid} /** * The key provider uri is searched in the following order. * 1. If there is a mapping in Credential's secrets map for namenode uri. * 2. From namenode getServerDefaults rpc. * 3. Finally fallback to local conf. * @return keyProviderUri if found from either of above 3 cases, * null otherwise * @throws IOException */ URI getKeyProviderUri() throws IOException { if (keyProviderUri != null) { return keyProviderUri; } {code} Once the key provider uri is set, it won't refresh the value even if the key provider uri on namenode is changed. For long running clients like on oozie servers, this means we have to bounce all the oozie servers to get the change reflected. After this change, the client will cache the value for an hour after which it will issue getServerDefaults call and will refresh the key provider uri. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10799) NameNode should use loginUser(hdfs) to serve iNotify requests
[ https://issues.apache.org/jira/browse/HDFS-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983518#comment-15983518 ] Wei-Chiu Chuang commented on HDFS-10799: I got pinged about this patch a few times recently, so I gave it more thought. I am now convinced that it's not entirely the NameNode's responsibility. Even after applying this patch, the iNotify client still needs to ensure its Kerberos ticket is valid; otherwise the next iNotify request will fail. One of our internal projects bumped into this issue, and after adding code to renew tickets periodically, the issue went away. The only thing the NN should do is distinguish an iNotify request from an edit log flush request; for the former, it shouldn't print that scary message. > NameNode should use loginUser(hdfs) to serve iNotify requests > - > > Key: HDFS-10799 > URL: https://issues.apache.org/jira/browse/HDFS-10799 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.0 > Environment: Kerberized, HA cluster, iNotify client, CDH5.7.0 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Attachments: HDFS-10799.001.patch > > > When a NameNode serves iNotify requests from a client, it verifies the client > has superuser permission and then uses the client's Kerberos principal to > read edits from journal nodes. > However, if the client does not renew its tgt tickets, the connection from > NameNode to journal nodes may fail. In that case, the NameNode thinks the > edits are corrupt, and prints a scary error message: > "During automatic edit log failover, we noticed that all of the remaining > edit log streams are shorter than the current one! The best remaining edit > log ends at transaction 11577603, but we thought we could read up to > transaction 11577606. If you continue, metadata will be lost forever!" > However, the edits are actually good. NameNode _should not freak out when an > iNotify client's tgt ticket expires_. 
> I think that an easy solution to this bug, is that after NameNode verifies > client has superuser permission, call {{SecurityUtil.doAsLoginUser}} and then > read edits. This will make sure the operation does not fail due to an expired > client ticket. > Excerpt of related logs: > {noformat} > 2016-08-18 19:05:13,979 WARN org.apache.hadoop.security.UserGroupInformation: > PriviledgedActionException as:h...@example.com (auth:KERBEROS) > cause:java.io.IOException: We encountered an error reading > http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy, > > http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy. > During automatic edit log failover, we noticed that all of the remaining > edit log streams are shorter than the current one! The best remaining edit > log ends at transaction 11577603, but we thought we could read up to > transaction 11577606. If you continue, metadata will be lost forever! > 2016-08-18 19:05:13,979 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 112 on 8020, call > org.apache.hadoop.hdfs.protocol.ClientProtocol.getEditsFromTxid from [client > IP:port] Call#73 Retry#0 > java.io.IOException: We encountered an error reading > http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy, > > http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy. > During automatic edit log failover, we noticed that all of the remaining > edit log streams are shorter than the current one! The best remaining edit > log ends at transaction 11577603, but we thought we could read up to > transaction 11577606. If you continue, metadata will be lost forever! 
> at > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:213) > at > org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.readOp(NameNodeRpcServer.java:1674) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEditsFromTxid(NameNodeRpcServer.java:1736) > at > org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getEditsFromTxid(AuthorizationProviderProxyClientProtocol.java:1010) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEditsFromTxid(ClientNamenodeProtocolServerSideTranslatorPB.java:1475) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at
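The proposed {{SecurityUtil.doAsLoginUser}} fix follows the standard "run as the server's own identity" idiom. Below is a minimal stand-alone sketch of that idiom using a thread-local identity as a stand-in for the real UGI machinery; none of these names are Hadoop APIs.

```java
import java.util.concurrent.Callable;

/**
 * Sketch of running an action as the server's own login identity instead
 * of the caller's (possibly expired) identity. In Hadoop, this role is
 * played by SecurityUtil.doAsLoginUser; here a thread-local stand-in
 * illustrates the switch-and-restore pattern.
 */
class LoginUserContext {
    static final String LOGIN_USER = "hdfs";                // server's own principal (illustrative)
    static final ThreadLocal<String> CURRENT =
        ThreadLocal.withInitial(() -> "remote-client");     // caller identity by default

    /** Runs the action with the current identity switched to the login user. */
    static <T> T doAsLoginUser(Callable<T> action) throws Exception {
        String prev = CURRENT.get();
        CURRENT.set(LOGIN_USER);
        try {
            return action.call();       // e.g. read edits from the journal nodes
        } finally {
            CURRENT.set(prev);          // always restore the caller's identity
        }
    }
}
```

The key property is the try/finally restore: the edit-log read runs under the server's credentials, but the caller's identity is back in place before the RPC handler returns.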
[jira] [Commented] (HDFS-11627) Block Storage: Cblock cache should register with flusher to upload blocks to containers
[ https://issues.apache.org/jira/browse/HDFS-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983514#comment-15983514 ] Chen Liang commented on HDFS-11627: --- Thanks [~msingh] for the updates and the comments! v002 patch looks good to me, pending Jenkins. > Block Storage: Cblock cache should register with flusher to upload blocks to > containers > --- > > Key: HDFS-11627 > URL: https://issues.apache.org/jira/browse/HDFS-11627 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh > Attachments: HDFS-11627-HDFS-7240.001.patch, > HDFS-11627-HDFS-7240.002.patch > > > Cblock cache should register with flusher to upload blocks to containers. > Currently Container Cache flusher tries to write to the container even when > the CblockLocalCache pipelines are not registered with the flusher. > This will result in the Container writes to fail. > CblockLocalCache should register with the flusher before accepting any blocks > for write -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11580) Ozone: Support asynchronus client API for SCM and containers
[ https://issues.apache.org/jira/browse/HDFS-11580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983505#comment-15983505 ] Chen Liang commented on HDFS-11580: --- Thanks [~linyiqun] for updating the patch!
In {{XceiverClientHandler.java}}:
1. I'm not sure how {{CompletableFuture}} works here, so this is more of a question and everyone's thoughts are welcome: in {{sendCommandAsync}}, will {{supplyAsync}} be called by multiple threads at the same time? How about {{waitForResponse}}? If so, is this method thread-safe? (e.g. do we need protection for {{pendingResponses}}?)
In {{ContainerProtocolCalls.java}}:
2. Actually I think [~msingh] brought up a critical point, which is that we need to be certain that the responses match the requests. This can be somewhat tricky, but can we add a unit test to verify it? I'm thinking of maybe having a test that sends a number of async requests and checks that they all get properly matched responses.
3. Make this one line? {code} ContainerCommandResponseProto response; response = xceiverClient.sendCommand(request); {code}
In {{XceiverClientRatis.java}}:
4. {{// TODO: Implement the async interface.}} Let's either file another JIRA to follow up, or throw an {{UnsupportedOperationException}} if we will not support async calls with Ratis. [~anu]? > Ozone: Support asynchronus client API for SCM and containers > > > Key: HDFS-11580 > URL: https://issues.apache.org/jira/browse/HDFS-11580 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Anu Engineer >Assignee: Yiqun Lin > Attachments: HDFS-11580-HDFS-7240.001.patch, > HDFS-11580-HDFS-7240.002.patch, HDFS-11580-HDFS-7240.003.patch, > HDFS-11580-HDFS-7240.004.patch > > > This is an umbrella JIRA that needs to support a set of APIs in Asynchronous > form. > For containers -- or the datanode API currently supports a call > {{sendCommand}}. we need to build proper programming interface and support an > async interface. 
> There is also a set of SCM API that clients can call, it would be nice to > support Async interface for those too. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
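The counter suggested in the earlier comment, together with the thread-safety question about {{pendingResponses}}, could be sketched roughly like this. This is an illustration only, not the patch; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Sketch: match async responses to requests with a per-client monotonically
 * increasing call id instead of a client-supplied trace id. The pending map
 * is a ConcurrentHashMap because register() and complete() may run on
 * different threads (caller thread vs. response-handler thread).
 */
class AsyncCallMatcher<R> {
    private final AtomicLong callId = new AtomicLong();
    private final Map<Long, CompletableFuture<R>> pending = new ConcurrentHashMap<>();

    /** Registers the future and returns the id, which travels with the request. */
    long register(CompletableFuture<R> future) {
        long id = callId.incrementAndGet();
        pending.put(id, future);
        return id;
    }

    /** Called from the response handler with the id echoed back by the server. */
    void complete(long id, R response) {
        CompletableFuture<R> f = pending.remove(id);
        if (f != null) {
            f.complete(response);   // unknown ids are silently dropped
        }
    }
}
```

Because the id is generated by the client library itself, it is always set and unique per connection, which sidesteps the unset-TraceID problem observed with Cblock.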
[jira] [Updated] (HDFS-10631) Federation State Store ZooKeeper implementation
[ https://issues.apache.org/jira/browse/HDFS-10631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Inigo Goiri updated HDFS-10631: --- Attachment: HDFS-10631-HDFS-10467-003.patch * Updating to {{Logger}} * Switching to {{Query}} > Federation State Store ZooKeeper implementation > --- > > Key: HDFS-10631 > URL: https://issues.apache.org/jira/browse/HDFS-10631 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs >Reporter: Inigo Goiri >Assignee: Jason Kace > Attachments: HDFS-10631-HDFS-10467-001.patch, > HDFS-10631-HDFS-10467-002.patch, HDFS-10631-HDFS-10467-003.patch > > > State Store implementation using ZooKeeper. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983422#comment-15983422 ] Inigo Goiri commented on HDFS-10467: [~fabbri], the tasks in the current JIRA are the basic ones to get the Router-based federation working. There are a bunch of them that we can add: * Web interface * Metrics system * Router heartbeating * Router safe mode * Rebalancing All of these are already implemented and running in our clusters. A version from a couple of months ago is available at: https://github.com/goiri/hadoop/tree/branch-2.6.1-hdfs-router (I can update it with the latest if needed.) At this point it's a matter of reviewing the code in the subtasks. It's hard to give a time frame without reviews, so any reviews on the subtasks are highly appreciated. > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.8.1 >Reporter: Inigo Goiri >Assignee: Inigo Goiri > Attachments: HDFS-10467.PoC.001.patch, HDFS-10467.PoC.patch, HDFS > Router Federation.pdf, HDFS-Router-Federation-Prototype.patch > > > Add a Router to provide a federated view of multiple HDFS clusters. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983390#comment-15983390 ] Aaron Fabbri commented on HDFS-10467: - Thanks for the update [~elgoiri]. I'm trying to get a feel for the overall progress of this. Are there any work items that are not already covered in the subtasks here? Any other details on how much work is left, or when you expect to have basic features completed, is welcomed. > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.8.1 >Reporter: Inigo Goiri >Assignee: Inigo Goiri > Attachments: HDFS-10467.PoC.001.patch, HDFS-10467.PoC.patch, HDFS > Router Federation.pdf, HDFS-Router-Federation-Prototype.patch > > > Add a Router to provide a federated view of multiple HDFS clusters. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-7541) Upgrade Domains in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee reassigned HDFS-7541: Assignee: Kihwal Lee (was: Ming Ma) > Upgrade Domains in HDFS > --- > > Key: HDFS-7541 > URL: https://issues.apache.org/jira/browse/HDFS-7541 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ming Ma >Assignee: Kihwal Lee > Fix For: 2.9.0, 3.0.0-alpha1 > > Attachments: HDFS-7541-2.patch, HDFS-7541.patch, > SupportforfastHDFSdatanoderollingupgrade.pdf, UpgradeDomains_design_v2.pdf, > UpgradeDomains_Design_v3.pdf > > > Current HDFS DN rolling upgrade step requires sequential DN restart to > minimize the impact on data availability and read/write operations. The side > effect is longer upgrade duration for large clusters. This might be > acceptable for DN JVM quick restart to update hadoop code/configuration. > However, for OS upgrade that requires machine reboot, the overall upgrade > duration will be too long if we continue to do sequential DN rolling restart. > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11700) testBackupNodePorts doesn't pass on Windows machine
[ https://issues.apache.org/jira/browse/HDFS-11700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983266#comment-15983266 ] Anbang Hu commented on HDFS-11700: -- More detailed investigation shows that after *Attempt 1* fails, in forcing {{rpcServer}} to stop, {{listener.doStop()}} in {{Server.stop()}} fails to close the socket properly. Tracing the code inside {{doStop}}: {{acceptChannel.socket()}} returns a {{ServerSocketAdaptor}} -> {{ServerSocketAdaptor.close()}} -> {{ServerSocketChannelImpl.close()}} -> {{AbstractInterruptibleChannel.close()}} -> {{AbstractSelectableChannel.implCloseChannel()}} -> {{ServerSocketChannelImpl.implCloseSelectableChannel()}} -> {{SocketDispatcher.preClose()}} -> {{preClose0(FileDescriptor var0)}} The difference between Windows and Ubuntu is that after {{preClose0(FileDescriptor var0)}}, the previously used port becomes available on Ubuntu, but not on Windows. {{preClose0}} is a native method, and I am using Oracle Java 1.8.0_121. > testBackupNodePorts doesn't pass on Windows machine > --- > > Key: HDFS-11700 > URL: https://issues.apache.org/jira/browse/HDFS-11700 > Project: Hadoop HDFS > Issue Type: Bug > Environment: Windows 10 >Reporter: Anbang Hu > > In TestHDFSServerPorts.testBackupNodePorts, there are two attempts at > starting backup node. > *Attempt 1*: > 1) It binds namenode backup address with 0: > {quote} > backup_config.set(DFSConfigKeys.DFS_NAMENODE_BACKUP_ADDRESS_KEY, THIS_HOST); > {quote} > 2) It sets namenode backup address with an available port X. > 3) It fails rightfully due to using the same http address as active namenode. > *Attempt 2*: > 1) It tries to reuse port X as namenode backup address. > 2) It fails to bind to X because Windows does not release port X properly > after *Attempt 1* fails. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
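For reference, a common mitigation for rebind failures in tests is to enable SO_REUSEADDR before binding. Whether this helps with the Windows-specific {{preClose0}} behavior described above is unverified, so treat this only as a sketch of the general technique.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

/**
 * Sketch: bind a listener with SO_REUSEADDR enabled so it can rebind an
 * address that the OS still considers in a closing state. The option must
 * be set before bind() to have any effect.
 */
class ReusableListener {
    static ServerSocket bind(int port) throws IOException {
        ServerSocket ss = new ServerSocket();       // create unbound
        ss.setReuseAddress(true);                   // must precede bind()
        ss.bind(new InetSocketAddress("127.0.0.1", port));
        return ss;
    }
}
```

Passing port 0 lets the OS pick a free port, which is generally the more robust pattern for tests than trying to reuse a previously bound port.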
[jira] [Commented] (HDFS-10788) fsck NullPointerException when it encounters corrupt replicas
[ https://issues.apache.org/jira/browse/HDFS-10788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983264#comment-15983264 ] Wei-Chiu Chuang commented on HDFS-10788: We are receiving multiple reports that users on CDH versions above CDH5.5.2 are still experiencing the same issue. It is not clear at this point if Apache Hadoop also carries the same bug, but I thought I should share this information. > fsck NullPointerException when it encounters corrupt replicas > - > > Key: HDFS-10788 > URL: https://issues.apache.org/jira/browse/HDFS-10788 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.0 > Environment: CDH5.5.2, CentOS 6.7 >Reporter: Jeff Field > > Somehow (I haven't found root cause yet) we ended up with blocks that have > corrupt replicas where the replica count is inconsistent between the blockmap > and the corrupt replicas map. If we try to hdfs fsck any parent directory > that has a child with one of these blocks, fsck will exit with something like > this: > {code} > $ hdfs fsck /path/to/parent/dir/ | egrep -v '^\.+$' > Connecting to namenode via http://mynamenode:50070 > FSCK started by bot-hadoop (auth:KERBEROS_SSL) from /10.97.132.43 for path > /path/to/parent/dir/ at Tue Aug 23 20:34:58 UTC 2016 > .FSCK > ended at Tue Aug 23 20:34:59 UTC 2016 in 1098 milliseconds > null > Fsck on path '/path/to/parent/dir/' FAILED > {code} > So I start at the top, fscking every subdirectory until I find one or more > that fails. Then I do the same thing with those directories (our top level > directories all have subdirectories with date directories in them, which then > contain the files) and once I find a directory with files in it, I run a > checksum of the files in that directory. When I do that, I don't get the name > of the file, rather I get: > checksum: java.lang.NullPointerException > but since the files are in order, I can figure it out by seeing which file > was before the NPE. 
Once I get to this point, I can see the following in the > namenode log when I try to checksum the corrupt file: > 2016-08-23 20:24:59,627 WARN > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Inconsistent > number of corrupt replicas for blk_1335893388_1100036319546 blockMap has 0 > but corrupt replicas map has 1 > 2016-08-23 20:24:59,627 WARN org.apache.hadoop.ipc.Server: IPC Server handler > 23 on 8020, call > org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from > 192.168.1.100:47785 Call#1 Retry#0 > java.lang.NullPointerException > At which point I can delete the file, but it is a very tedious process. > Ideally, shouldn't fsck be able to emit the name of the file that is the > source of the problem - and (if -delete is specified) get rid of the file, > instead of exiting without saying why? -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11701) NPE from Unresolved Host causes permanent DFSInputStream failures
[ https://issues.apache.org/jira/browse/HDFS-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Moore updated HDFS-11701: --- Description: We recently encountered the following NPE due to the DFSInputStream storing old cached block locations from hosts which could no longer resolve. {quote} Caused by: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSClient.isLocalAddress(DFSClient.java:1122) at org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.getPathInfo(DomainSocketFactory.java:148) at org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:474) at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:354) at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662) at org.apache.hadoop.hdfs.DFSInputStream.seekToNewSource(DFSInputStream.java:1613) at org.apache.hadoop.fs.FSDataInputStream.seekToNewSource(FSDataInputStream.java:127) ~HBase related stack frames trimmed~ {quote} After investigating, the DFSInputStream appears to have been open for upwards of 3-4 weeks and had cached block locations from decommissioned nodes that no longer resolve in DNS and had been shutdown and removed from the cluster 2 weeks prior. If the DFSInputStream had refreshed its block locations from the name node, it would have received alternative block locations which would not contain the decommissioned data nodes. As the above NPE leaves the non-resolving data node in the list of block locations the DFSInputStream never refreshes the block locations and all attempts to open a BlockReader for the given blocks will fail. In our case, we resolved the NPE by closing and re-opening every DFSInputStream in the cluster to force a purge of the block locations cache. Ideally, the DFSInputStream would re-fetch all block locations for a host which can't be resolved in DNS or at least the blocks requested. 
was: We recently encountered the following NPE due to the DFSInputStream storing old cached block locations from hosts which could no longer resolve. ```Caused by: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSClient.isLocalAddress(DFSClient.java:1122) at org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.getPathInfo(DomainSocketFactory.java:148) at org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:474) at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:354) at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662) at org.apache.hadoop.hdfs.DFSInputStream.seekToNewSource(DFSInputStream.java:1613) at org.apache.hadoop.fs.FSDataInputStream.seekToNewSource(FSDataInputStream.java:127) ~HBase related stack frames trimmed~``` After investigating, the DFSInputStream appears to have been open for upwards of 3-4 weeks and had cached block locations from decommissioned nodes that no longer resolve in DNS and had been shutdown and removed from the cluster 2 weeks prior. If the DFSInputStream had refreshed its block locations from the name node, it would have received alternative block locations which would not contain the decommissioned data nodes. As the above NPE leaves the non-resolving data node in the list of block locations the DFSInputStream never refreshes the block locations and all attempts to open a BlockReader for the given blocks will fail. In our case, we resolved the NPE by closing and re-opening every DFSInputStream in the cluster to force a purge of the block locations cache. Ideally, the DFSInputStream would re-fetch all block locations for a host which can't be resolved in DNS or at least the blocks requested. 
> NPE from Unresolved Host causes permanent DFSInputStream failures > - > > Key: HDFS-11701 > URL: https://issues.apache.org/jira/browse/HDFS-11701 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.7.0 > Environment: AWS Centos linux running HBase CDH 5.9.0 and HDFS CDH > 5.9.0 >Reporter: James Moore > > We recently encountered the following NPE due to the DFSInputStream storing > old cached block locations from hosts which could no longer resolve. > {quote} > Caused by: java.lang.NullPointerException > at org.apache.hadoop.hdfs.DFSClient.isLocalAddress(DFSClient.java:1122) > at > org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.getPathInfo(DomainSocketFactory.java:148) > at > org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:474) > at > org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:354) > at > org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662) > at >
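The refresh-on-unresolved idea at the end of the description could be sketched as follows. This is a hypothetical illustration; the names are not DFSInputStream internals, and the supplier stands in for a getBlockLocations call to the NameNode.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/**
 * Sketch: if any cached location's host no longer resolves, discard the
 * cached list and re-fetch from the authoritative source instead of
 * retrying the dead hosts forever.
 */
class LocationCache {
    private List<InetSocketAddress> locations = new ArrayList<>();
    private final Supplier<List<InetSocketAddress>> fetchFromNameNode;

    LocationCache(Supplier<List<InetSocketAddress>> fetchFromNameNode) {
        this.fetchFromNameNode = fetchFromNameNode;
    }

    List<InetSocketAddress> get() {
        boolean stale = locations.isEmpty()
            || locations.stream().anyMatch(InetSocketAddress::isUnresolved);
        if (stale) {
            locations = new ArrayList<>(fetchFromNameNode.get());  // drop stale cache
        }
        return locations;
    }
}
```

The staleness check piggybacks on {{InetSocketAddress.isUnresolved()}}, so a decommissioned host that drops out of DNS triggers a refresh on the next read instead of a permanent failure.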
[jira] [Created] (HDFS-11701) NPE from Unresolved Host causes permanent DFSInputStream failures
James Moore created HDFS-11701: -- Summary: NPE from Unresolved Host causes permanent DFSInputStream failures Key: HDFS-11701 URL: https://issues.apache.org/jira/browse/HDFS-11701 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.7.0 Environment: AWS Centos linux running HBase CDH 5.9.0 and HDFS CDH 5.9.0 Reporter: James Moore We recently encountered the following NPE due to the DFSInputStream storing old cached block locations from hosts which could no longer resolve. ```Caused by: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSClient.isLocalAddress(DFSClient.java:1122) at org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.getPathInfo(DomainSocketFactory.java:148) at org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:474) at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:354) at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662) at org.apache.hadoop.hdfs.DFSInputStream.seekToNewSource(DFSInputStream.java:1613) at org.apache.hadoop.fs.FSDataInputStream.seekToNewSource(FSDataInputStream.java:127) ~HBase related stack frames trimmed~``` After investigating, the DFSInputStream appears to have been open for upwards of 3-4 weeks and had cached block locations from decommissioned nodes that no longer resolve in DNS and had been shutdown and removed from the cluster 2 weeks prior. If the DFSInputStream had refreshed its block locations from the name node, it would have received alternative block locations which would not contain the decommissioned data nodes. As the above NPE leaves the non-resolving data node in the list of block locations the DFSInputStream never refreshes the block locations and all attempts to open a BlockReader for the given blocks will fail. In our case, we resolved the NPE by closing and re-opening every DFSInputStream in the cluster to force a purge of the block locations cache. 
Ideally, the DFSInputStream would re-fetch all block locations for a host which can't be resolved in DNS or at least the blocks requested. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11697) Javadoc of erasure coding policy in file status
[ https://issues.apache.org/jira/browse/HDFS-11697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983063#comment-15983063 ] Takanobu Asanuma commented on HDFS-11697: - +1(non-binding), pending Jenkins. Thanks for the patch! > Javadoc of erasure coding policy in file status > --- > > Key: HDFS-11697 > URL: https://issues.apache.org/jira/browse/HDFS-11697 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Sasaki >Assignee: Kai Sasaki > Attachments: HDFS-11697.01.patch, HDFS-11697.02.patch > > > Though {{HdfsFileStatus}} keeps erasure coding policy, it's not shown in > javadoc explicitly as well as storage policy. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11691) Add a proper scheme to the datanode links in NN web UI
[ https://issues.apache.org/jira/browse/HDFS-11691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983034#comment-15983034 ] Jason Lowe commented on HDFS-11691: --- +1 lgtm. I'll commit this later today if there are no objections. > Add a proper scheme to the datanode links in NN web UI > -- > > Key: HDFS-11691 > URL: https://issues.apache.org/jira/browse/HDFS-11691 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-11691.patch > > > On the datanodes page of the namenode web UI, the datanode links may not be > correct if the namenode is serving the page through http but https is also > enabled. This is because {{dfshealth.js}} does not put a proper scheme in > front of the address. It already determines whether the address is > non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what > it is currently setting. > The existing mechanism would work for YARN and MAPRED, since they can only > serve one protocol, HTTP or HTTPS. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11697) Javadoc of erasure coding policy in file status
[ https://issues.apache.org/jira/browse/HDFS-11697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Sasaki updated HDFS-11697: -- Attachment: HDFS-11697.02.patch [~tasanuma0829] Thanks for checking! I updated accordingly. > Javadoc of erasure coding policy in file status > --- > > Key: HDFS-11697 > URL: https://issues.apache.org/jira/browse/HDFS-11697 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Sasaki >Assignee: Kai Sasaki > Attachments: HDFS-11697.01.patch, HDFS-11697.02.patch > > > Though {{HdfsFileStatus}} keeps erasure coding policy, it's not shown in > javadoc explicitly as well as storage policy. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9803) Proactively refresh ShortCircuitCache entries to avoid latency spikes
[ https://issues.apache.org/jira/browse/HDFS-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982987#comment-15982987 ] Parag Darji commented on HDFS-9803: --- I'm facing the same issue and seeing degraded HBase performance. Has anyone else experienced this? For now I'm restarting HDFS every three weeks, which seems to help a bit. > Proactively refresh ShortCircuitCache entries to avoid latency spikes > - > > Key: HDFS-9803 > URL: https://issues.apache.org/jira/browse/HDFS-9803 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Nick Dimiduk > > My region server logs are flooding with messages like > "SecretManager$InvalidToken: access control error while attempting to set up > short-circuit access to ... is expired". These logs > correspond with responseTooSlow WARNings from the region server. > {noformat} > 2016-01-19 22:10:14,432 INFO > [B.defaultRpcServer.handler=4,queue=1,port=16020] > shortcircuit.ShortCircuitCache: ShortCircuitCache(0x71bdc547): could not load > 1074037633_BP-1145309065-XXX-1448053136416 due to InvalidToken exception. > org.apache.hadoop.security.token.SecretManager$InvalidToken: access control > error while attempting to set up short-circuit access to token > with block_token_identifier (expiryDate=1453194430724, keyId=1508822027, > userId=hbase, blockPoolId=BP-1145309065-XXX-1448053136416, > blockId=1074037633, access modes=[READ]) is expired. 
> at > org.apache.hadoop.hdfs.BlockReaderFactory.requestFileDescriptors(BlockReaderFactory.java:591) > at > org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo(BlockReaderFactory.java:490) > at > org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.create(ShortCircuitCache.java:782) > at > org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.fetchOrCreate(ShortCircuitCache.java:716) > at > org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:422) > at > org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:333) > at > org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:618) > at > org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:844) > at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:896) > at java.io.DataInputStream.read(DataInputStream.java:149) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock.readWithExtra(HFileBlock.java:678) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1372) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1591) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1470) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:437) > ... > {noformat} > A potential solution could be to have a background thread that makes a best > effort to proactively refreshes tokens in the cache before they expire, so as > to minimize latency impact on the critical path. > Thanks to [~cnauroth] for providing an explaination and suggesting a solution > over on the [user > list|http://mail-archives.apache.org/mod_mbox/hadoop-user/201601.mbox/%3CCANZa%3DGt%3Dhvuf3fyOJqf-jdpBPL_xDknKBcp7LmaC-YUm0jDUVg%40mail.gmail.com%3E]. 
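The proactive-refresh idea above can be sketched with a plain scheduled executor running off the critical path. Everything below is an illustrative stand-in, not the actual ShortCircuitCache API: the map, the renewal window, and the entry names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ProactiveRefreshSketch {
    // Stand-in for cached short-circuit entries: block name -> token expiry (millis).
    private final Map<String, Long> expiries = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();

    void start(long leadMillis, long periodMillis) {
        // Background best-effort pass: renew any token that is close to expiring,
        // so the read path never hits an expired token. Renewal here is simulated
        // by extending the expiry; a real implementation would re-request the token.
        scheduler.scheduleAtFixedRate(() -> {
            long now = System.currentTimeMillis();
            expiries.replaceAll((block, expiry) ->
                expiry - now < leadMillis ? now + 60_000 : expiry);
        }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        ProactiveRefreshSketch s = new ProactiveRefreshSketch();
        // An entry that would expire in ~100 ms without the refresher.
        s.expiries.put("blk_1074037633", System.currentTimeMillis() + 100);
        s.start(5_000, 20);
        Thread.sleep(100);
        s.scheduler.shutdown();
        // The nearly-expired entry should have been pushed well into the future.
        System.out.println(
            s.expiries.get("blk_1074037633") - System.currentTimeMillis() > 1_000);
    }
}
```

The key design point matches the proposal: latency from token re-negotiation is paid by the background thread, not by the reader that cache-hits on the entry.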
[jira] [Comment Edited] (HDFS-11697) Javadoc of erasure coding policy in file status
[ https://issues.apache.org/jira/browse/HDFS-11697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982864#comment-15982864 ] Takanobu Asanuma edited comment on HDFS-11697 at 4/25/17 1:18 PM: -- Thanks for the patch, [~lewuathe]. * The javadoc of the {{HdfsFileStatus}} constructor is also missing the other arguments, {{symlink}} and {{childrenNum}}. How about adding javadoc for them? * This will raise a checkstyle warning, "First sentence should end with a period." {code} /** * Get the erasure coding policy if it's set * @return the erasure coding policy */ {code} There are six identical warnings elsewhere in this file. Let's fix them. was (Author: tasanuma0829): Thanks for the patch, [~lewuathe]. * The javadoc of the {{HdfsFileStatus}} constructor is also missing the other arguments, {{symlink}} and {{childrenNum}}. How about adding javadoc for them? * This will raise a checkstyle warning, "First sentence should end with a period." {code} /** * Get the erasure coding policy if it's set * @return the erasure coding policy */ {code} There are five identical warnings elsewhere in this file. Let's fix them. > Javadoc of erasure coding policy in file status > --- > > Key: HDFS-11697 > URL: https://issues.apache.org/jira/browse/HDFS-11697 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Sasaki >Assignee: Kai Sasaki > Attachments: HDFS-11697.01.patch > > > Though {{HdfsFileStatus}} keeps the erasure coding policy, it's not shown > explicitly in the javadoc, and neither is the storage policy.
[jira] [Commented] (HDFS-11697) Javadoc of erasure coding policy in file status
[ https://issues.apache.org/jira/browse/HDFS-11697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982864#comment-15982864 ] Takanobu Asanuma commented on HDFS-11697: - Thanks for the patch, [~lewuathe]. * The javadoc of the {{HdfsFileStatus}} constructor is also missing the other arguments, {{symlink}} and {{childrenNum}}. How about adding javadoc for them? * This will raise a checkstyle warning, "First sentence should end with a period." {code} /** * Get the erasure coding policy if it's set * @return the erasure coding policy */ {code} There are five identical warnings elsewhere in this file. Let's fix them. > Javadoc of erasure coding policy in file status > --- > > Key: HDFS-11697 > URL: https://issues.apache.org/jira/browse/HDFS-11697 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Sasaki >Assignee: Kai Sasaki > Attachments: HDFS-11697.01.patch > > > Though {{HdfsFileStatus}} keeps the erasure coding policy, it's not shown > explicitly in the javadoc, and neither is the storage policy.
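For reference, the checkstyle-clean form of the quoted snippet just ends the first sentence with a period and separates it from the tag block. The surrounding class below is an illustrative stand-in, not the real {{HdfsFileStatus}}:

```java
public class JavadocSketch {
    private final String ecPolicy;

    public JavadocSketch(String ecPolicy) {
        this.ecPolicy = ecPolicy;
    }

    /**
     * Get the erasure coding policy if it's set.
     *
     * @return the erasure coding policy
     */
    public String getErasureCodingPolicy() {
        return ecPolicy;
    }

    public static void main(String[] args) {
        // Hypothetical policy name, used only to exercise the getter.
        System.out.println(new JavadocSketch("RS-6-3-1024k").getErasureCodingPolicy());
    }
}
```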
[jira] [Commented] (HDFS-10999) Introduce separate stats for Replicated and Erasure Coded Blocks apart from the current Aggregated stats
[ https://issues.apache.org/jira/browse/HDFS-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982805#comment-15982805 ] Takanobu Asanuma commented on HDFS-10999: - Hi, [~manojg]. Thanks for uploading the patch. I have checked almost all of the changed code. The interface changes accompanying the addition of the new MBeans and stats look good to me (non-binding). Some comments on the implementation details: *BlockManagerSafeMode.java:* * How about using {{LongAccumulator}} for {{numberOfBytesInFutureBlocks}}, too? *CorruptReplicasMap.java:* * Should this {{decrementBlockStat}} be included in the if statement? {code:java} if (datanodes.isEmpty()) { // remove the block if there is no more corrupted replicas corruptReplicasMap.remove(blk); decrementBlockStat(blk); } {code} * It seems package-private is enough for the new methods {{getCorruptReplicatedBlocksStat}} and {{getCorruptStripedBlocksStat}}. *InvalidateBlocks.java and LowRedundancyBlocks.java:* Sorry, but I still need more time to review this code. *For unit tests:* I think it would be good to add more unit tests for these changes, either in this jira or in follow-up jiras. * Add more validations for the new metrics in {{TestComputeInvalidateWork}}, {{TestCorruptReplicaInfo}} and {{TestLowRedundancyBlockQueues}}. * {{TestUnderReplicatedBlocks}} covers only replicated files. If we use {{DFSTestUtil#verifyClientStats}} in {{TestReconstructStripedBlocks}}, we may be able to cover the EC case. 
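On the {{LongAccumulator}} suggestion above: it lets many threads fold updates into one counter without a shared lock, which suits stats updated concurrently during block processing. A minimal sketch; only the counter name mirrors the discussion, the update sizes are made up:

```java
import java.util.concurrent.atomic.LongAccumulator;
import java.util.stream.IntStream;

public class AccumulatorSketch {
    public static void main(String[] args) {
        // Sum-style accumulator with identity 0; accumulate() is lock-free
        // and safe to call from many threads at once.
        LongAccumulator numberOfBytesInFutureBlocks =
            new LongAccumulator(Long::sum, 0L);

        // Simulate 1000 concurrent block reports, each adding 128 bytes.
        IntStream.range(0, 1000).parallel()
                 .forEach(i -> numberOfBytesInFutureBlocks.accumulate(128));

        System.out.println(numberOfBytesInFutureBlocks.get()); // 1000 * 128 = 128000
    }
}
```

Because addition is commutative and associative, the result is deterministic regardless of how the parallel updates interleave, which is exactly the contract `LongAccumulator` requires of its accumulator function.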
> Introduce separate stats for Replicated and Erasure Coded Blocks apart from > the current Aggregated stats > > > Key: HDFS-10999 > URL: https://issues.apache.org/jira/browse/HDFS-10999 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Affects Versions: 3.0.0-alpha1 >Reporter: Wei-Chiu Chuang >Assignee: Manoj Govindassamy > Labels: hdfs-ec-3.0-nice-to-have, supportability > Attachments: HDFS-10999.01.patch, HDFS-10999.02.patch > > > Per HDFS-9857, it seems in the Hadoop 3 world, people prefer the more generic > term "low redundancy" to the old-fashioned "under replicated". But this term > is still being used in messages in several places, such as web ui, dfsadmin > and fsck. We should probably change them to avoid confusion. > File this jira to discuss it. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11580) Ozone: Support asynchronus client API for SCM and containers
[ https://issues.apache.org/jira/browse/HDFS-11580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiqun Lin updated HDFS-11580: - Attachment: HDFS-11580-HDFS-7240.004.patch Thanks [~anu], [~vagarychen] and [~msingh] for the great comments! All the comments make sense to me. The following are my responses: 1. {quote} Do you think we should have 2 functions, like readChunk and readChunkAsync so that it looks more like java 8-ish ? rather than a boolean flag ? {quote} Addressed in the latest patch. 2. On the problem that the current async calls still behave like synchronous calls: yes, I think this is a real problem. As [~vagarychen] mentioned, we should not invoke {{get()}} in the same thread. Maybe we can register a callback or use some other mechanism. I took a look at {{CompletableFuture}}; it already provides many APIs we can use for this case. In my latest patch, I used {{CompletableFuture#thenApply}} to process the future result asynchronously and return a new CompletableFuture object. This should be the right way to return the CompletableFuture to the client and let the client call {{future.get()}}. 3. {quote} Also because of the async nature of the interface responses, need not be in the same order as the requests. We will need a method to match the response to the replies. {quote} This is a good catch. With an async interface, we have to be more careful to match each response to its corresponding request. In my latest patch, I defined a new map to store the pending responses; see {{XceiverClientHandler#waitForResponse}} for the details. 4. {quote} With an async interface, we will always need to keep an eye on the queue depth.. {quote} Good idea, but I'd like to do that work in another JIRA since the current patch is already a little big, :). I have changed many places in the latest patch. Any comments/suggestions are welcome. 
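The non-blocking composition described in point 2 looks roughly like the sketch below. {{readChunkAsync}} and the payload strings are hypothetical stand-ins for the patch's client calls, not the actual Ozone API:

```java
import java.util.concurrent.CompletableFuture;

public class AsyncCallSketch {
    // Returns immediately with a future; post-processing is chained with
    // thenApply instead of blocking on get() inside the client handler.
    static CompletableFuture<String> readChunkAsync(String chunk) {
        return CompletableFuture
            .supplyAsync(() -> "raw:" + chunk)    // simulated network response
            .thenApply(raw -> raw.substring(4));  // decode asynchronously, off the caller
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<String> future = readChunkAsync("chunk-1");
        // Only the end client decides when (or whether) to block.
        System.out.println(future.get()); // chunk-1
    }
}
```

The point of the design is that `thenApply` keeps the whole pipeline asynchronous: the handler thread never calls `get()`, it only returns a new `CompletableFuture`, so blocking (if any) happens solely at the client's discretion.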
> Ozone: Support asynchronus client API for SCM and containers > > > Key: HDFS-11580 > URL: https://issues.apache.org/jira/browse/HDFS-11580 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Anu Engineer >Assignee: Yiqun Lin > Attachments: HDFS-11580-HDFS-7240.001.patch, > HDFS-11580-HDFS-7240.002.patch, HDFS-11580-HDFS-7240.003.patch, > HDFS-11580-HDFS-7240.004.patch > > > This is an umbrella JIRA that needs to support a set of APIs in Asynchronous > form. > For containers -- or the datanode API currently supports a call > {{sendCommand}}. we need to build proper programming interface and support an > async interface. > There is also a set of SCM API that clients can call, it would be nice to > support Async interface for those too. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11192) OOM during Quota Initialization lead to Namenode hang
[ https://issues.apache.org/jira/browse/HDFS-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982768#comment-15982768 ] xupeng commented on HDFS-11192: --- Hi all: Is there any update on this issue? I have also run into the same condition that HDFS-8865 resolved, and I would like to know if I can merge HDFS-8865 into my repo. > OOM during Quota Initialization lead to Namenode hang > - > > Key: HDFS-11192 > URL: https://issues.apache.org/jira/browse/HDFS-11192 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Attachments: namenodeThreadDump.out > > > AFAIK, in RecursiveTask execution, when a ForkJoinPool thread dies or cannot > be created, it will not notify the parent. The parent keeps waiting for the > notify call, and it is not even a timed wait. > *Trace from Namenode log* > {noformat} > Exception in thread "ForkJoinPool-1-worker-2" Exception in thread > "ForkJoinPool-1-worker-3" java.lang.OutOfMemoryError: unable to create new > native thread > at java.lang.Thread.start0(Native Method) > at java.lang.Thread.start(Thread.java:714) > at > java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486) > at > java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517) > at > java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609) > at > java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167) > java.lang.OutOfMemoryError: unable to create new native thread > at java.lang.Thread.start0(Native Method) > at java.lang.Thread.start(Thread.java:714) > at > java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486) > at > java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517) > at > java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609) > at > java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167) > {noformat}
[jira] [Commented] (HDFS-11373) Backport HDFS-11258 and HDFS-11272 to branch-2.7
[ https://issues.apache.org/jira/browse/HDFS-11373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982651#comment-15982651 ] Brahma Reddy Battula commented on HDFS-11373: - [~ajisakaa] Thanks for reporting this. The patch LGTM. Will re-trigger Jenkins. > Backport HDFS-11258 and HDFS-11272 to branch-2.7 > > > Key: HDFS-11373 > URL: https://issues.apache.org/jira/browse/HDFS-11373 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Critical > Attachments: HDFS-11373-branch-2.7.01.patch >
[jira] [Comment Edited] (HDFS-7535) Utilize Snapshot diff report for distcp
[ https://issues.apache.org/jira/browse/HDFS-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982590#comment-15982590 ] Benjamin Huo edited comment on HDFS-7535 at 4/25/17 9:01 AM: - I have one question regarding the following comments: "This snapshot diff report represents the delta that should be applied to the backup cluster. For changes like deletion and rename we can directly apply the same operations (following some specific order based on their dependency) in the backup cluster. For changes like creation, append, and other metadata modification we keep using the functionality of the current distcp." I'm not very clear on what "we keep using the functionality of the current distcp" means. After the HDFS-7535 fix, is the file change list for creation and modification generated based on snapshots s1 and s2 on the source cluster, or is it generated based on the file differences between the source and destination clusters (with the extra cost of transferring the file list between the source and target clusters)? Thanks Ben was (Author: benjaminh): I have one question regarding the following comments: "This snapshot diff report represents the delta that should be applied to the backup cluster. For changes like deletion and rename we can directly apply the same operations (following some specific order based on their dependency) in the backup cluster. For changes like creation, append, and other metadata modification we keep using the functionality of the current distcp." I'm not very clear on what "we keep using the functionality of the current distcp" means. After the HDFS-7535 fix, is the file change list for creation and modification generated based on snapshots s1 and s2 on the source cluster, or is it generated based on the file differences between the source and destination clusters? 
Thanks Ben > Utilize Snapshot diff report for distcp > --- > > Key: HDFS-7535 > URL: https://issues.apache.org/jira/browse/HDFS-7535 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, snapshots >Reporter: Jing Zhao >Assignee: Jing Zhao > Fix For: 2.7.0 > > Attachments: HDFS-7535.000.patch, HDFS-7535.001.patch, > HDFS-7535.002.patch, HDFS-7535.003.patch, HDFS-7535.004.patch > > > Currently HDFS snapshot diff report can identify file/directory creation, > deletion, rename and modification under a snapshottable directory. We can use > the diff report for distcp between the primary cluster and a backup cluster > to avoid unnecessary data copy. This is especially useful when there is a big > directory rename happening in the primary cluster: the current distcp cannot > detect the rename op thus this rename usually leads to large amounts of real > data copy. > More details of the approach will come in the first comment. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7535) Utilize Snapshot diff report for distcp
[ https://issues.apache.org/jira/browse/HDFS-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982590#comment-15982590 ] Benjamin Huo commented on HDFS-7535: I have one question regarding the following comments: "This snapshot diff report represents the delta that should be applied to the backup cluster. For changes like deletion and rename we can directly apply the same operations (following some specific order based on their dependency) in the backup cluster. For changes like creation, append, and other metadata modification we keep using the functionality of the current distcp." I'm not very clear on what "we keep using the functionality of the current distcp" means. After the HDFS-7535 fix, is the file change list for creation and modification generated based on snapshots s1 and s2 on the source cluster, or is it generated based on the file differences between the source and destination clusters? Thanks Ben > Utilize Snapshot diff report for distcp > --- > > Key: HDFS-7535 > URL: https://issues.apache.org/jira/browse/HDFS-7535 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, snapshots >Reporter: Jing Zhao >Assignee: Jing Zhao > Fix For: 2.7.0 > > Attachments: HDFS-7535.000.patch, HDFS-7535.001.patch, > HDFS-7535.002.patch, HDFS-7535.003.patch, HDFS-7535.004.patch > > > Currently the HDFS snapshot diff report can identify file/directory creation, > deletion, rename and modification under a snapshottable directory. We can use > the diff report for distcp between the primary cluster and a backup cluster > to avoid unnecessary data copies. This is especially useful when a big > directory rename happens in the primary cluster: the current distcp cannot > detect the rename op, so the rename usually leads to large amounts of real > data copy. > More details of the approach will come in the first comment. 
[jira] [Commented] (HDFS-6708) StorageType should be encoded in the block token
[ https://issues.apache.org/jira/browse/HDFS-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982524#comment-15982524 ] Chris Douglas commented on HDFS-6708: - [~ehiggs], you're right, that's not going to work with HDFS-9807. Changed it back to {{StorageType[]}} > StorageType should be encoded in the block token > > > Key: HDFS-6708 > URL: https://issues.apache.org/jira/browse/HDFS-6708 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Affects Versions: 2.4.1 >Reporter: Arpit Agarwal >Assignee: Ewan Higgs > Fix For: 3.0.0-alpha3 > > Attachments: HDFS-6708.0001.patch, HDFS-6708.0002.patch, > HDFS-6708.0003.patch, HDFS-6708.0004.patch, HDFS-6708.0005.patch, > HDFS-6708.0006.patch, HDFS-6708.0007.patch, HDFS-6708.0008.patch, > HDFS-6708.0009.patch, HDFS-6708.0010.patch > > > HDFS-6702 is adding support for file creation based on StorageType. > The block token is used as a tamper-proof channel for communicating block > parameters from the NN to the DN during block creation. The StorageType > should be included in this block token. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6708) StorageType should be encoded in the block token
[ https://issues.apache.org/jira/browse/HDFS-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-6708: Attachment: HDFS-6708.0010.patch > StorageType should be encoded in the block token > > > Key: HDFS-6708 > URL: https://issues.apache.org/jira/browse/HDFS-6708 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Affects Versions: 2.4.1 >Reporter: Arpit Agarwal >Assignee: Ewan Higgs > Fix For: 3.0.0-alpha3 > > Attachments: HDFS-6708.0001.patch, HDFS-6708.0002.patch, > HDFS-6708.0003.patch, HDFS-6708.0004.patch, HDFS-6708.0005.patch, > HDFS-6708.0006.patch, HDFS-6708.0007.patch, HDFS-6708.0008.patch, > HDFS-6708.0009.patch, HDFS-6708.0010.patch > > > HDFS-6702 is adding support for file creation based on StorageType. > The block token is used as a tamper-proof channel for communicating block > parameters from the NN to the DN during block creation. The StorageType > should be included in this block token. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11609) Some blocks can be permanently lost if nodes are decommissioned while dead
[ https://issues.apache.org/jira/browse/HDFS-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-11609: -- Target Version/s: 2.7.4, 2.8.1 (was: 2.8.1) > Some blocks can be permanently lost if nodes are decommissioned while dead > -- > > Key: HDFS-11609 > URL: https://issues.apache.org/jira/browse/HDFS-11609 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-11609.branch-2.patch, HDFS-11609.trunk.patch, > HDFS-11609_v2.branch-2.patch, HDFS-11609_v2.trunk.patch > > > When all the nodes containing a replica of a block are decommissioned while > they are dead, they get decommissioned right away even if there are missing > blocks. This behavior was introduced by HDFS-7374. > The problem starts when those decommissioned nodes are brought back online. > The namenode no longer shows missing blocks, which creates a false sense of > cluster health. When the decommissioned nodes are removed and reformatted, > the block data is permanently lost. The namenode will report missing blocks > after the heartbeat recheck interval (e.g. 10 minutes) from the moment the > last node is taken down. > There are multiple issues in the code. As some cause different behaviors in > testing vs. production, it took a while to reproduce it in a unit test. I > will present analysis and proposal soon. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11695) [SPS]: Namenode failed to start while loading SPS xAttrs from the edits log.
[ https://issues.apache.org/jira/browse/HDFS-11695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982436#comment-15982436 ] Uma Maheswara Rao G commented on HDFS-11695: Thanks [~surendrasingh] for reporting it. Can you explain the scenario in which this occurs? I think you are right. We can reproduce it as follows: call satisfyStoragePolicy on one directory first, then try calling it on the parent directory; the second call will be rejected because the subdirectory already has the xattr. But could you explain how you hit this while starting the NN? Do you want to fix it? Let me know if you need any help. > [SPS]: Namenode failed to start while loading SPS xAttrs from the edits log. > > > Key: HDFS-11695 > URL: https://issues.apache.org/jira/browse/HDFS-11695 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: HDFS-10285 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Blocker > Attachments: fsimage.xml > > > {noformat} > 2017-04-23 13:27:51,971 ERROR > org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode. > java.io.IOException: Cannot request to call satisfy storage policy on path > /ssl, as this file/dir was already called for satisfying storage policy. 
> at > org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSatisfyStoragePolicy(FSDirAttrOp.java:511) > at > org.apache.hadoop.hdfs.server.namenode.FSDirXAttrOp.unprotectedSetXAttrs(FSDirXAttrOp.java:284) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:918) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:241) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:150) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11384) Add option for balancer to disperse getBlocks calls to avoid NameNode's rpc.CallQueueLength spike
[ https://issues.apache.org/jira/browse/HDFS-11384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982410#comment-15982410 ] Zhe Zhang commented on HDFS-11384: -- Thanks for the update [~shv]. Now all other tests in {{TestBalancer}} pass except for {{testBalancerRPCDelay}}: {code} java.util.concurrent.TimeoutException: Timed out waiting for /tmp.txt to reach 40 replicas at org.apache.hadoop.hdfs.DFSTestUtil.waitReplication(DFSTestUtil.java:764) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.createFile(TestBalancer.java:306) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:847) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancerRPCDelay(TestBalancer.java:2071) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) {code} > Add option for balancer to disperse getBlocks calls to avoid NameNode's > rpc.CallQueueLength spike > - > > Key: HDFS-11384 > URL: https://issues.apache.org/jira/browse/HDFS-11384 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.7.3 >Reporter: yunjiong zhao >Assignee: Konstantin Shvachko > Attachments: balancer.day.png, balancer.week.png, > HDFS-11384.001.patch, HDFS-11384.002.patch, HDFS-11384.003.patch, > HDFS-11384.004.patch, HDFS-11384.005.patch, 
HDFS-11384.006.patch, > HDFS-11384-007.patch, HDFS-11384.008.patch > > > Running the balancer on a Hadoop cluster with more than 3000 DataNodes > causes NameNode's rpc.CallQueueLength to spike. We observed that this > situation could cause HBase cluster failures due to RegionServer WAL > timeouts.