[jira] [Commented] (HDFS-12913) TestDNFencingWithReplication.testFencingStress fix mini cluster not yet active issue

2018-01-04 Thread Zsolt Venczel (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311308#comment-16311308
 ] 

Zsolt Venczel commented on HDFS-12913:
--

Thanks for the review [~msingh]!

> TestDNFencingWithReplication.testFencingStress fix mini cluster not yet 
> active issue
> 
>
> Key: HDFS-12913
> URL: https://issues.apache.org/jira/browse/HDFS-12913
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>  Labels: flaky-test
> Attachments: HDFS-12913.01.patch, HDFS-12913.02.patch
>
>
> Once in roughly every 5000 test runs, the following issue happens:
> {code}
> 2017-12-11 10:33:09 [INFO] 
> 2017-12-11 10:33:09 [INFO] 
> ---
> 2017-12-11 10:33:09 [INFO]  T E S T S
> 2017-12-11 10:33:09 [INFO] 
> ---
> 2017-12-11 10:33:09 [INFO] Running 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
> 2017-12-11 10:37:32 [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, 
> Time elapsed: 262.641 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
> 2017-12-11 10:37:32 [ERROR] 
> testFencingStress(org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication)
>   Time elapsed: 262.477 s  <<< ERROR!
> 2017-12-11 10:37:32 java.lang.RuntimeException: Deferred
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.test.MultithreadedTestUtil$TestContext.checkException(MultithreadedTestUtil.java:130)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.test.MultithreadedTestUtil$TestContext.stop(MultithreadedTestUtil.java:166)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress(TestDNFencingWithReplication.java:137)
> 2017-12-11 10:37:32   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> 2017-12-11 10:37:32   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2017-12-11 10:37:32   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2017-12-11 10:37:32   at java.lang.reflect.Method.invoke(Method.java:498)
> 2017-12-11 10:37:32   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> 2017-12-11 10:37:32   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2017-12-11 10:37:32   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> 2017-12-11 10:37:32   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> 2017-12-11 10:37:32   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> 2017-12-11 10:37:32   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407)
> 2017-12-11 10:37:32 Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
> 2017-12-11 10:37:32   at 
> 
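The failure above means the test issued client RPCs before the HA mini cluster had an active NameNode, hence the StandbyException. A minimal sketch of the kind of guard that avoids this, assuming the standard MiniDFSCluster test APIs; this is illustrative only, not the actual HDFS-12913 patch:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.hdfs.MiniDFSNNTopology;

public class HAClusterStartupSketch {
  // Bring up an HA mini cluster and make sure nn0 is active before any
  // client RPCs are issued, so reads are not sent to a standby NameNode.
  public static MiniDFSCluster startActiveCluster(Configuration conf)
      throws Exception {
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
        .nnTopology(MiniDFSNNTopology.simpleHATopology())
        .numDataNodes(3)
        .build();
    cluster.waitActive();           // wait until DNs register and NNs are up
    cluster.transitionToActive(0);  // explicitly make nn0 the active NN
    return cluster;
  }
}
{code}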

[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2018-01-04 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311247#comment-16311247
 ] 

Rakesh R commented on HDFS-12911:
-

FYI, as per the discussion with Uma, a few tasks have been re-prioritized to 
make the refactoring easier: (1) push the lock optimization HDFS-12982 patch to 
the branch first and (2) after that, the [HDFS-12911 
patch|https://issues.apache.org/jira/secure/attachment/12903981/HDFS-12911-HDFS-10285-02.patch].
 Please excuse any confusion caused by shifting focus to another jira from 
here.

> [SPS]: Fix review comments from discussions in HDFS-10285
> -
>
> Key: HDFS-12911
> URL: https://issues.apache.org/jira/browse/HDFS-12911
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Rakesh R
> Attachments: HDFS-12911-HDFS-10285-01.patch, 
> HDFS-12911-HDFS-10285-02.patch, HDFS-12911.00.patch
>
>
> This is the JIRA for tracking the possible improvements or issues discussed 
> in the main JIRA.
> Comments to handle so far:
> Daryn:
> # The lock should not be kept while executing the placement policy.
> # While starting up the NN, the SPS Xattr checks happen even if the feature 
> is disabled. This could potentially impact the startup speed.
> UMA:
> # I am adding one more possible improvement to reduce Xattr objects 
> significantly. The SPS Xattr is a constant object, so we can create one 
> deduplicated Xattr object statically and reuse the same object reference 
> whenever an SPS Xattr has to be added to an Inode. The additional bytes 
> required for storing an SPS Xattr then shrink to a single object reference 
> (i.e. 4 bytes on 32-bit), so the Xattr overhead should come down 
> significantly IMO. Let's explore the feasibility of this option. (A rough 
> sketch follows this list.)
> Note that no Xattr list Feature will be created specially for SPS; that list 
> would already have been created by SetStoragePolicy on the same directory, 
> so there is no extra Feature creation because of SPS alone.
> # Currently SPS puts Long id objects in a queue to track the Inodes on which 
> SPS was called. These objects are additionally created, and each costs (obj 
> ref + value) = (8 + 8) bytes [ignoring alignment for the time being]. The 
> possible improvement here is to keep the existing Inode object for tracking 
> instead of creating a new Long. The advantage is that the Inode object is 
> already maintained in the NN, so no new object is created; we just maintain 
> one object reference. The above two points should significantly reduce the 
> memory requirements of SPS: per SPS call, 8 bytes for tracking the called 
> inode + 8 bytes for the Xattr reference.
> # Use LightWeightLinkedSet instead of LinkedList for the queue. This will 
> reduce unnecessary Node creations inside the LinkedList.
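As a rough illustration of point 1 above, a minimal sketch of the 
deduplication idea, assuming the org.apache.hadoop.fs.XAttr builder API; the 
class and xattr name below are illustrative, not the actual branch code:

{code:java}
import org.apache.hadoop.fs.XAttr;

public final class SpsXAttrDedup {
  // One immutable XAttr instance, created once and shared by every inode,
  // instead of allocating a fresh XAttr per SPS call. The per-inode cost
  // then drops to a single object reference.
  public static final XAttr SPS_XATTR = new XAttr.Builder()
      .setNameSpace(XAttr.NameSpace.SYSTEM)
      .setName("hdfs.sps")   // illustrative name, not the real constant
      .build();

  private SpsXAttrDedup() {}
}
{code}

Every inode that needs the SPS marker would then reference SPS_XATTR rather 
than building its own copy, which is where the saving of one object reference 
per call comes from.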






[jira] [Commented] (HDFS-6804) race condition between transferring block and appending block causes "Unexpected checksum mismatch exception"

2018-01-04 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311238#comment-16311238
 ] 

Brahma Reddy Battula commented on HDFS-6804:


[~jojochuang], can you please take a look? I hope it's ready for commit.

> race condition between transferring block and appending block causes 
> "Unexpected checksum mismatch exception" 
> --
>
> Key: HDFS-6804
> URL: https://issues.apache.org/jira/browse/HDFS-6804
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.2.0
>Reporter: Gordon Wang
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-6804-branch-2.8-002.patch, 
> HDFS-6804-branch-2.8-003.patch, HDFS-6804-branch-2.8.patch, 
> Testcase_append_transfer_block.patch
>
>
> We found some error logs in the datanode, like this:
> {noformat}
> 2014-07-22 01:49:51,338 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Ex
> ception for BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248
> java.io.IOException: Terminating due to a checksum error.java.io.IOException: 
> Unexpected checksum mismatch while writing 
> BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248 from 
> /192.168.2.101:39495
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:536)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:703)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:575)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:115)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
> at java.lang.Thread.run(Thread.java:744)
> {noformat}
> Meanwhile, the log on the source datanode says the block was transmitted:
> {noformat}
> 2014-07-22 01:49:50,805 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Da
> taTransfer: Transmitted 
> BP-2072804351-192.168.2.104-1406008383435:blk_1073741997
> _9248 (numBytes=16188152) to /192.168.2.103:50010
> {noformat}
> When the destination datanode gets the checksum mismatch, it reports a bad 
> block to the NameNode, and the NameNode marks the replica on the source 
> datanode as corrupt. But the replica on the source datanode is actually 
> valid, because it can pass checksum verification.
> In short, the replica on the source datanode is wrongly marked as corrupt.






[jira] [Comment Edited] (HDFS-11225) NameNode crashed because deleteSnapshot held FSNamesystem lock too long

2018-01-04 Thread Shashikant Banerjee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311162#comment-16311162
 ] 

Shashikant Banerjee edited comment on HDFS-11225 at 1/4/18 10:49 AM:
-

[~manojg] and [~jingzhao]/others, please have a look at the proposal.


was (Author: shashikant):
[~manojg] and [~jingzhao], please have a look at the proposal.

> NameNode crashed because deleteSnapshot held FSNamesystem lock too long
> ---
>
> Key: HDFS-11225
> URL: https://issues.apache.org/jira/browse/HDFS-11225
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
> Environment: CDH5.8.2, HA
>Reporter: Wei-Chiu Chuang
>Assignee: Manoj Govindassamy
>Priority: Critical
>  Labels: high-availability
> Attachments: Snaphot_Deletion_Design_Proposal.pdf
>
>
> The deleteSnapshot operation is synchronous. In certain situations this 
> operation may hold the FSNamesystem lock for too long, bringing almost every 
> NameNode operation to a halt.
> We have observed one incident where it took so long that the ZKFC believed 
> the NameNode was down. All other IPC threads were waiting to acquire the 
> FSNamesystem lock. This specific deleteSnapshot took ~70 seconds. The ZKFC 
> has a connection timeout of 45 seconds by default; if all IPC threads wait 
> for the FSNamesystem lock and cannot accept new incoming connections, the 
> ZKFC times out and advances the epoch, and the NameNode therefore loses its 
> active NN role and then fails.
> Relevant log:
> {noformat}
> Thread 154 (IPC Server handler 86 on 8020):
>   State: RUNNABLE
>   Blocked count: 2753455
>   Waited count: 89201773
>   Stack:
> 
> org.apache.hadoop.hdfs.server.namenode.INode$BlocksMapUpdateInfo.addDeleteBlock(INode.java:879)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.destroyAndCollectBlocks(INodeFile.java:508)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeReference.destroyAndCollectBlocks(INodeReference.java:339)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.destroyAndCollectBlocks(INodeReference.java:606)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$ChildrenDiff.destroyDeletedList(DirectoryWithSnapshotFeature.java:119)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$ChildrenDiff.access$400(DirectoryWithSnapshotFeature.java:61)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.destroyDiffAndCollectBlocks(DirectoryWithSnapshotFeature.java:319)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.destroyDiffAndCollectBlocks(DirectoryWithSnapshotFeature.java:167)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:83)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:745)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:776)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:747)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:747)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:776)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:747)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:789)
> {noformat}
> After the ZKFC determined the NameNode was down and advanced the epoch, the 
> NN finished deleting the snapshot and sent the edit to the journal nodes, 
> but it was rejected because the epoch had been updated. See the following 
> stacktrace:
> {noformat}
> 10.0.16.21:8485: IPC's epoch 17 is less than the last promised epoch 18
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:429)
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:457)
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:352)
> at 
> 
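One hedged mitigation sketch for the timeout interaction described above: 
raising the ZKFC health-monitor RPC timeout, whose 45-second default matches 
the timeout mentioned in the description. This is an operational workaround 
under an assumed key name (verify it against the core-default.xml of your 
Hadoop version), not the design proposal in the attached PDF:

{code:xml}
<!-- Give a long but legitimate deleteSnapshot more headroom before the
     ZKFC declares the NameNode dead and triggers a failover. -->
<property>
  <name>ha.health-monitor.rpc-timeout.ms</name>
  <value>180000</value>
</property>
{code}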

[jira] [Commented] (HDFS-11225) NameNode crashed because deleteSnapshot held FSNamesystem lock too long

2018-01-04 Thread Shashikant Banerjee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311162#comment-16311162
 ] 

Shashikant Banerjee commented on HDFS-11225:


[~manojg] and [~jingzhao], please have a look at the proposal.

> NameNode crashed because deleteSnapshot held FSNamesystem lock too long
> ---
>
> Key: HDFS-11225
> URL: https://issues.apache.org/jira/browse/HDFS-11225
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
> Environment: CDH5.8.2, HA
>Reporter: Wei-Chiu Chuang
>Assignee: Manoj Govindassamy
>Priority: Critical
>  Labels: high-availability
> Attachments: Snaphot_Deletion_Design_Proposal.pdf
>
>
> The deleteSnapshot operation is synchronous. In certain situations this 
> operation may hold the FSNamesystem lock for too long, bringing almost every 
> NameNode operation to a halt.
> We have observed one incident where it took so long that the ZKFC believed 
> the NameNode was down. All other IPC threads were waiting to acquire the 
> FSNamesystem lock. This specific deleteSnapshot took ~70 seconds. The ZKFC 
> has a connection timeout of 45 seconds by default; if all IPC threads wait 
> for the FSNamesystem lock and cannot accept new incoming connections, the 
> ZKFC times out and advances the epoch, and the NameNode therefore loses its 
> active NN role and then fails.
> Relevant log:
> {noformat}
> Thread 154 (IPC Server handler 86 on 8020):
>   State: RUNNABLE
>   Blocked count: 2753455
>   Waited count: 89201773
>   Stack:
> 
> org.apache.hadoop.hdfs.server.namenode.INode$BlocksMapUpdateInfo.addDeleteBlock(INode.java:879)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.destroyAndCollectBlocks(INodeFile.java:508)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeReference.destroyAndCollectBlocks(INodeReference.java:339)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.destroyAndCollectBlocks(INodeReference.java:606)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$ChildrenDiff.destroyDeletedList(DirectoryWithSnapshotFeature.java:119)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$ChildrenDiff.access$400(DirectoryWithSnapshotFeature.java:61)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.destroyDiffAndCollectBlocks(DirectoryWithSnapshotFeature.java:319)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.destroyDiffAndCollectBlocks(DirectoryWithSnapshotFeature.java:167)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:83)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:745)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:776)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:747)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:747)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:776)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:747)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:789)
> {noformat}
> After the ZKFC determined the NameNode was down and advanced the epoch, the 
> NN finished deleting the snapshot and sent the edit to the journal nodes, 
> but it was rejected because the epoch had been updated. See the following 
> stacktrace:
> {noformat}
> 10.0.16.21:8485: IPC's epoch 17 is less than the last promised epoch 18
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:429)
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:457)
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:352)
> at 
> org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:149)
> at 
> 

[jira] [Commented] (HDFS-12987) Document - Disabling the Lazy persist file scrubber.

2018-01-04 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311142#comment-16311142
 ] 

Surendra Singh Lilhore commented on HDFS-12987:
---

+1, pending Jenkins.

> Document - Disabling the Lazy persist file scrubber.
> 
>
> Key: HDFS-12987
> URL: https://issues.apache.org/jira/browse/HDFS-12987
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, hdfs
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Trivial
>  Labels: documentation
> Attachments: HDFS-12987.patch
>
>
> LazyPersistFileScrubber will be disabled if the scrubber interval is 
> configured to zero. But the documentation was incorrect: "Set it to a 
> negative value to disable this behavior."
> The NameNode will not start if we set it to a negative value; there is 
> another strong check to prevent it.
> {code:java}
>if (this.lazyPersistFileScrubIntervalSec < 0) {
> throw new IllegalArgumentException(
> DFS_NAMENODE_LAZY_PERSIST_FILE_SCRUB_INTERVAL_SEC
> + " must be zero (for disable) or greater than zero.");
>   }
> {code}
> This seems to stem from 
> [HDFS-8276|https://issues.apache.org/jira/browse/HDFS-8276].
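For reference, a minimal hdfs-site.xml sketch of the documented way to disable 
the scrubber; the key name is assumed to be the value of 
DFS_NAMENODE_LAZY_PERSIST_FILE_SCRUB_INTERVAL_SEC, so verify it against your 
hdfs-default.xml:

{code:xml}
<property>
  <!-- 0 disables the scrubber; a negative value aborts NameNode startup. -->
  <name>dfs.namenode.lazypersist.file.scrub.interval.sec</name>
  <value>0</value>
</property>
{code}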






[jira] [Comment Edited] (HDFS-12974) Exception information can not be returned when I create transparent encryption zone.

2018-01-04 Thread fang zhenyi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311105#comment-16311105
 ] 

fang zhenyi edited comment on HDFS-12974 at 1/4/18 10:04 AM:
-

[~xiaochen]: Thank you very much for giving me suggestions, and happy new year.
Hope you can review again, thanks a lot.


was (Author: zhenyi):
[~xiaochen]:Thank you very much for giving me sugesstion  and happy new year..
Hope you can review again, thanks a lot. 

> Exception information can not be returned when I create transparent 
> encryption zone.
> 
>
> Key: HDFS-12974
> URL: https://issues.apache.org/jira/browse/HDFS-12974
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 3.0.0
>Reporter: fang zhenyi
>Assignee: fang zhenyi
>Priority: Minor
> Attachments: HDFS-12974.001.patch, HDFS-12974.002.patch, 
> HDFS-12974.003.patch
>
>
> When I add the following configuration to the kms-acls.xml file and then 
> create an encryption zone, I cannot get any exception information.
> <property>
>   <name>key.acl.key2.GENERATE_EEK</name>
>   <value>mr</value>
> </property>
> root@fangzhenyi01:~# hdfs crypto -createZone -keyName key2 -path /zone
> 2018-01-02 10:41:44,632 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> RemoteException: 
> root@fangzhenyi01:~# 
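As a hedged sketch of how client-side code can surface the hidden message 
(illustrative only; the actual HDFS-12974 patch may fix this elsewhere, e.g. 
in the shell's error reporting):

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.ipc.RemoteException;
import org.apache.hadoop.security.AccessControlException;

public class CreateZoneErrorSketch {
  // Unwrap the RemoteException so the server-side message is printed
  // instead of a bare "RemoteException:".
  static void createZone(DistributedFileSystem dfs) throws IOException {
    try {
      dfs.createEncryptionZone(new Path("/zone"), "key2");
    } catch (RemoteException re) {
      IOException unwrapped =
          re.unwrapRemoteException(AccessControlException.class);
      System.err.println("createZone failed: " + unwrapped.getMessage());
      throw unwrapped;
    }
  }
}
{code}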






[jira] [Commented] (HDFS-12974) Exception information can not be returned when I create transparent encryption zone.

2018-01-04 Thread fang zhenyi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311105#comment-16311105
 ] 

fang zhenyi commented on HDFS-12974:


@xiaochen: Thank you very much for giving me suggestions, and happy new year.
Hope you can review again, thanks a lot.

> Exception information can not be returned when I create transparent 
> encryption zone.
> 
>
> Key: HDFS-12974
> URL: https://issues.apache.org/jira/browse/HDFS-12974
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 3.0.0
>Reporter: fang zhenyi
>Assignee: fang zhenyi
>Priority: Minor
> Attachments: HDFS-12974.001.patch, HDFS-12974.002.patch, 
> HDFS-12974.003.patch
>
>
> When I add the following configuration to the kms-acls.xml file and then 
> create an encryption zone, I cannot get any exception information.
> <property>
>   <name>key.acl.key2.GENERATE_EEK</name>
>   <value>mr</value>
> </property>
> root@fangzhenyi01:~# hdfs crypto -createZone -keyName key2 -path /zone
> 2018-01-02 10:41:44,632 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> RemoteException: 
> root@fangzhenyi01:~# 






[jira] [Comment Edited] (HDFS-12974) Exception information can not be returned when I create transparent encryption zone.

2018-01-04 Thread fang zhenyi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311105#comment-16311105
 ] 

fang zhenyi edited comment on HDFS-12974 at 1/4/18 10:03 AM:
-

[~xiaochen]: Thank you very much for giving me suggestions, and happy new year.
Hope you can review again, thanks a lot.


was (Author: zhenyi):
@xiaochen:Thank you very much for giving me sugesstion  and happy new year..
Hope you can review again, thanks a lot. 

> Exception information can not be returned when I create transparent 
> encryption zone.
> 
>
> Key: HDFS-12974
> URL: https://issues.apache.org/jira/browse/HDFS-12974
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 3.0.0
>Reporter: fang zhenyi
>Assignee: fang zhenyi
>Priority: Minor
> Attachments: HDFS-12974.001.patch, HDFS-12974.002.patch, 
> HDFS-12974.003.patch
>
>
> When I add the following configuration to the kms-acls.xml file and then 
> create an encryption zone, I cannot get any exception information.
> <property>
>   <name>key.acl.key2.GENERATE_EEK</name>
>   <value>mr</value>
> </property>
> root@fangzhenyi01:~# hdfs crypto -createZone -keyName key2 -path /zone
> 2018-01-02 10:41:44,632 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> RemoteException: 
> root@fangzhenyi01:~# 






[jira] [Commented] (HDFS-12987) Document - Disabling the Lazy persist file scrubber.

2018-01-04 Thread Karthik Palanisamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311102#comment-16311102
 ] 

Karthik Palanisamy commented on HDFS-12987:
---

cc: [~surendrasingh]

> Document - Disabling the Lazy persist file scrubber.
> 
>
> Key: HDFS-12987
> URL: https://issues.apache.org/jira/browse/HDFS-12987
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, hdfs
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Trivial
>  Labels: documentation
> Attachments: HDFS-12987.patch
>
>
> LazyPersistFileScrubber will be disabled if the scrubber interval is 
> configured to zero. But the documentation was incorrect: "Set it to a 
> negative value to disable this behavior."
> The NameNode will not start if we set it to a negative value; there is 
> another strong check to prevent it.
> {code:java}
>if (this.lazyPersistFileScrubIntervalSec < 0) {
> throw new IllegalArgumentException(
> DFS_NAMENODE_LAZY_PERSIST_FILE_SCRUB_INTERVAL_SEC
> + " must be zero (for disable) or greater than zero.");
>   }
> {code}
> This seems to stem from 
> [HDFS-8276|https://issues.apache.org/jira/browse/HDFS-8276].






[jira] [Updated] (HDFS-12987) Document - Disabling the Lazy persist file scrubber.

2018-01-04 Thread Karthik Palanisamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Palanisamy updated HDFS-12987:
--
Labels: documentation  (was: )
Status: Patch Available  (was: Open)

> Document - Disabling the Lazy persist file scrubber.
> 
>
> Key: HDFS-12987
> URL: https://issues.apache.org/jira/browse/HDFS-12987
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, hdfs
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Trivial
>  Labels: documentation
> Attachments: HDFS-12987.patch
>
>
> LazyPersistFileScrubber will be disabled if the scrubber interval is 
> configured to zero. But the documentation was incorrect: "Set it to a 
> negative value to disable this behavior."
> The NameNode will not start if we set it to a negative value; there is 
> another strong check to prevent it.
> {code:java}
>if (this.lazyPersistFileScrubIntervalSec < 0) {
> throw new IllegalArgumentException(
> DFS_NAMENODE_LAZY_PERSIST_FILE_SCRUB_INTERVAL_SEC
> + " must be zero (for disable) or greater than zero.");
>   }
> {code}
> This seems to stem from 
> [HDFS-8276|https://issues.apache.org/jira/browse/HDFS-8276].






[jira] [Updated] (HDFS-12974) Exception information can not be returned when I create transparent encryption zone.

2018-01-04 Thread fang zhenyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fang zhenyi updated HDFS-12974:
---
Fix Version/s: (was: 3.1.0)
   Status: Patch Available  (was: In Progress)

> Exception information can not be returned when I create transparent 
> encryption zone.
> 
>
> Key: HDFS-12974
> URL: https://issues.apache.org/jira/browse/HDFS-12974
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 3.0.0
>Reporter: fang zhenyi
>Assignee: fang zhenyi
>Priority: Minor
> Attachments: HDFS-12974.001.patch, HDFS-12974.002.patch, 
> HDFS-12974.003.patch
>
>
> When I add the following configuration to the kms-acls.xml file and then 
> create an encryption zone, I cannot get any exception information.
> <property>
>   <name>key.acl.key2.GENERATE_EEK</name>
>   <value>mr</value>
> </property>
> root@fangzhenyi01:~# hdfs crypto -createZone -keyName key2 -path /zone
> 2018-01-02 10:41:44,632 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> RemoteException: 
> root@fangzhenyi01:~# 






[jira] [Updated] (HDFS-12974) Exception information can not be returned when I create transparent encryption zone.

2018-01-04 Thread fang zhenyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fang zhenyi updated HDFS-12974:
---
Status: In Progress  (was: Patch Available)

> Exception information can not be returned when I create transparent 
> encryption zone.
> 
>
> Key: HDFS-12974
> URL: https://issues.apache.org/jira/browse/HDFS-12974
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 3.0.0
>Reporter: fang zhenyi
>Assignee: fang zhenyi
>Priority: Minor
> Fix For: 3.1.0
>
> Attachments: HDFS-12974.001.patch, HDFS-12974.002.patch, 
> HDFS-12974.003.patch
>
>
> When I add the following configuration to the kms-acls.xml file and then 
> create an encryption zone, I cannot get any exception information.
> <property>
>   <name>key.acl.key2.GENERATE_EEK</name>
>   <value>mr</value>
> </property>
> root@fangzhenyi01:~# hdfs crypto -createZone -keyName key2 -path /zone
> 2018-01-02 10:41:44,632 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> RemoteException: 
> root@fangzhenyi01:~# 






[jira] [Updated] (HDFS-12987) Document - Disabling the Lazy persist file scrubber.

2018-01-04 Thread Karthik Palanisamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Palanisamy updated HDFS-12987:
--
 Flags: Patch
Attachment: HDFS-12987.patch

> Document - Disabling the Lazy persist file scrubber.
> 
>
> Key: HDFS-12987
> URL: https://issues.apache.org/jira/browse/HDFS-12987
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, hdfs
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Trivial
> Attachments: HDFS-12987.patch
>
>
> LazyPersistFileScrubber will be disabled if the scrubber interval is 
> configured to zero. But the documentation was incorrect: "Set it to a 
> negative value to disable this behavior."
> The NameNode will not start if we set it to a negative value; there is 
> another strong check to prevent it.
> {code:java}
>if (this.lazyPersistFileScrubIntervalSec < 0) {
> throw new IllegalArgumentException(
> DFS_NAMENODE_LAZY_PERSIST_FILE_SCRUB_INTERVAL_SEC
> + " must be zero (for disable) or greater than zero.");
>   }
> {code}
> This seems to stem from 
> [HDFS-8276|https://issues.apache.org/jira/browse/HDFS-8276].






[jira] [Updated] (HDFS-12974) Exception information can not be returned when I create transparent encryption zone.

2018-01-04 Thread fang zhenyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fang zhenyi updated HDFS-12974:
---
Attachment: HDFS-12974.003.patch

> Exception information can not be returned when I create transparent 
> encryption zone.
> 
>
> Key: HDFS-12974
> URL: https://issues.apache.org/jira/browse/HDFS-12974
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 3.0.0
>Reporter: fang zhenyi
>Assignee: fang zhenyi
>Priority: Minor
> Fix For: 3.1.0
>
> Attachments: HDFS-12974.001.patch, HDFS-12974.002.patch, 
> HDFS-12974.003.patch
>
>
> When I add the following configuration to the kms-acls.xml file and then 
> create an encryption zone, I cannot get any exception information.
> <property>
>   <name>key.acl.key2.GENERATE_EEK</name>
>   <value>mr</value>
> </property>
> root@fangzhenyi01:~# hdfs crypto -createZone -keyName key2 -path /zone
> 2018-01-02 10:41:44,632 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> RemoteException: 
> root@fangzhenyi01:~# 






[jira] [Created] (HDFS-12987) Document - Disabling the Lazy persist file scrubber.

2018-01-04 Thread Karthik Palanisamy (JIRA)
Karthik Palanisamy created HDFS-12987:
-

 Summary: Document - Disabling the Lazy persist file scrubber.
 Key: HDFS-12987
 URL: https://issues.apache.org/jira/browse/HDFS-12987
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation, hdfs
Reporter: Karthik Palanisamy
Assignee: Karthik Palanisamy
Priority: Trivial


LazyPersistFileScrubber will be disabled if the scrubber interval is 
configured to zero. But the documentation was incorrect: "Set it to a negative 
value to disable this behavior."

The NameNode will not start if we set it to a negative value; there is another 
strong check to prevent it.

{code:java}
   if (this.lazyPersistFileScrubIntervalSec < 0) {
throw new IllegalArgumentException(
DFS_NAMENODE_LAZY_PERSIST_FILE_SCRUB_INTERVAL_SEC
+ " must be zero (for disable) or greater than zero.");
  }
{code}

This seems to stem from 
[HDFS-8276|https://issues.apache.org/jira/browse/HDFS-8276].






[jira] [Commented] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up

2018-01-04 Thread Jianfei Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311087#comment-16311087
 ] 

Jianfei Jiang commented on HDFS-12935:
--

Thanks to [~brahmareddy] for reviewing, and happy new year. I am sorry for 
misunderstanding your point about HDFS-8277 the first time. I have uploaded a 
new patch 004 that fixes the following:

1. Removed the changes related to safemode.
2. Updated the -setBalancerBandwidth command to send the setting to only one 
namenode.

The three failing test cases are unrelated, and I have re-run them 
successfully.
Hope you can review again, thanks a lot.

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode 
> is up
> -
>
> Key: HDFS-12935
> URL: https://issues.apache.org/jira/browse/HDFS-12935
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 3.0.0-beta1, 3.0.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
> Attachments: HDFS-12935.002.patch, HDFS-12935.003.patch, 
> HDFS-12935.004.patch, HDFS_12935.001.patch
>
>
> In HA mode, if one namenode is down, most functions still work. Consider the 
> following two cases:
>  (1) nn1 up and nn2 down
>  (2) nn1 down and nn2 up
> These two cases should be equivalent. However, some of the DFSAdmin commands 
> give ambiguous results. The commands can be sent successfully to the 
> namenode that is up, but they are functionally useful only when nn1 is up, 
> apart from the reported exception (an IOException when connecting to the 
> down namenode nn2). If only nn2 is up, the commands have no effect at all, 
> and only the exception from connecting to nn1 can be seen.
> Take the command "hdfs dfsadmin -setBalancerBandwidth", which aims to set 
> the balancer bandwidth value for datanodes, as an example. It works, and all 
> the datanodes receive the setting, only when nn1 is up. If only nn2 is up, 
> the command throws an exception directly and no datanode receives the 
> bandwidth setting. Approximately ten DFSAdmin commands use similar logic and 
> may be ambiguous.
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn1
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 12345
> *Balancer bandwidth is set to 12345 for jiangjianfei01/172.17.0.14:9820*
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to 
> jiangjianfei02:9820 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn2
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 1234
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to 
> jiangjianfei01:9820 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# 
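A rough sketch of the direction the discussion points at, trying every 
namenode in the nameservice rather than aborting on the first connect failure. 
It assumes the HAUtil helper available in this era of the code and is 
illustrative only, not the 004 patch:

{code:java}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HAUtil;
import org.apache.hadoop.hdfs.protocol.ClientProtocol;

public class BalancerBandwidthSketch {
  // Send the setting to whichever NN accepts it instead of failing on
  // the first unreachable one.
  static void setBandwidth(Configuration conf, String nsId, long bandwidth)
      throws IOException {
    List<ClientProtocol> namenodes =
        HAUtil.getProxiesForAllNameNodesInNameservice(conf, nsId);
    IOException lastFailure = null;
    for (ClientProtocol nn : namenodes) {
      try {
        nn.setBalancerBandwidth(bandwidth);  // succeeds on the active NN
        return;
      } catch (IOException e) {
        lastFailure = e;                     // down/standby NN: keep trying
      }
    }
    if (lastFailure != null) {
      throw lastFailure;                     // no NN accepted the command
    }
  }
}
{code}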






[jira] [Assigned] (HDFS-12986) Ozone: Update ozone to latest ratis snapshot build (0.1.1-alpha-00f80b4-SNAPSHOT)

2018-01-04 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh reassigned HDFS-12986:


Assignee: Lokesh Jain  (was: Mukul Kumar Singh)

> Ozone: Update ozone to latest ratis snapshot build 
> (0.1.1-alpha-00f80b4-SNAPSHOT)
> -
>
> Key: HDFS-12986
> URL: https://issues.apache.org/jira/browse/HDFS-12986
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Lokesh Jain
> Fix For: HDFS-7240
>
>
> This jira will update ozone to the latest ratis snapshot build, 
> release-0.1.1-alpha-00f80b4-SNAPSHOT.






[jira] [Commented] (HDFS-12860) StripedBlockUtil#getRangesInternalBlocks throws exception for the block group size larger than 2GB

2018-01-04 Thread SammiChen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311064#comment-16311064
 ] 

SammiChen commented on HDFS-12860:
--

Just came back from a long vacation. Sorry for the late response.

Thanks [~eddyxu] for refining the test case. It's clearer and more readable 
now. As for end-to-end tests, I worry whether this is the only 2GB-related bug 
in the EC code. Anyway, within the scope of the current title, I'm good with 
the current test coverage.

My +1 for the patch.

> StripedBlockUtil#getRangesInternalBlocks throws exception for the block group 
> size larger than 2GB
> --
>
> Key: HDFS-12860
> URL: https://issues.apache.org/jira/browse/HDFS-12860
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-12860.00.patch, HDFS-12860.01.patch
>
>
> Running terasort on a cluster with 8 datanodes, 256GB data, using 
> RS-3-2-1024k.
> The test data was generated by {{teragen}} with 32 mappers.
> The terasort benchmark fails with the following stack trace:
> {code}
> 17/11/27 14:44:31 INFO mapreduce.Job:  map 45% reduce 0%
> 17/11/27 14:44:33 INFO mapreduce.Job: Task Id : 
> attempt_1510080297865_0160_m_08_0, Status : FAILED
> Error: java.lang.IllegalArgumentException
>   at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:72)
>   at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil$VerticalRange.(StripedBlockUtil.java:701)
>   at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.getRangesForInternalBlocks(StripedBlockUtil.java:442)
>   at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.divideOneStripe(StripedBlockUtil.java:311)
>   at 
> org.apache.hadoop.hdfs.DFSStripedInputStream.readOneStripe(DFSStripedInputStream.java:308)
>   at 
> org.apache.hadoop.hdfs.DFSStripedInputStream.readWithStrategy(DFSStripedInputStream.java:391)
>   at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:813)
>   at java.io.DataInputStream.read(DataInputStream.java:149)
>   at 
> org.apache.hadoop.examples.terasort.TeraInputFormat$TeraRecordReader.nextKeyValue(TeraInputFormat.java:257)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:562)
>   at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>   at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> {code}
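For clarity, a small self-contained demonstration of the 2GB failure mode 
named in the title: arithmetic narrowed to int wraps negative past 
Integer.MAX_VALUE, which is the kind of negative span that trips 
Preconditions.checkArgument in VerticalRange. This illustrates the class of 
bug under that assumption; the actual patch keeps the math in long:

{code:java}
public class StripedOffsetOverflowDemo {
  public static void main(String[] args) {
    long blockGroupSize = 3L * 1024 * 1024 * 1024;  // 3 GiB > Integer.MAX_VALUE

    int narrowed = (int) blockGroupSize;  // wraps to a negative value
    long widened = blockGroupSize;        // correct when kept in long

    System.out.println(narrowed);         // -1073741824: fails a >= 0 check
    System.out.println(widened);          // 3221225472
  }
}
{code}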






[jira] [Commented] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up

2018-01-04 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311055#comment-16311055
 ] 

genericqa commented on HDFS-12935:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 19m 18s | trunk passed |
| +1 | compile | 1m 12s | trunk passed |
| +1 | checkstyle | 0m 54s | trunk passed |
| +1 | mvnsite | 1m 21s | trunk passed |
| +1 | shadedclient | 13m 2s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 6s | trunk passed |
| +1 | javadoc | 0m 56s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 11s | the patch passed |
| +1 | compile | 1m 0s | the patch passed |
| +1 | javac | 1m 0s | the patch passed |
| +1 | checkstyle | 0m 43s | the patch passed |
| +1 | mvnsite | 1m 12s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 20s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 15s | the patch passed |
| +1 | javadoc | 1m 0s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 122m 36s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 22s | The patch does not generate ASF License warnings. |
| | | 181m 27s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
| | hadoop.hdfs.server.namenode.TestReencryptionWithKMS |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12935 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12904522/HDFS-12935.004.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux c0f55cea747d 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2a48b35 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/22550/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/22550/testReport/ |
| Max. process+thread count | 2964 (vs. ulimit of 5000) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | 

[jira] [Created] (HDFS-12986) Ozone: Update ozone to latest ratis snapshot build (0.1.1-alpha-00f80b4-SNAPSHOT)

2018-01-04 Thread Mukul Kumar Singh (JIRA)
Mukul Kumar Singh created HDFS-12986:


 Summary: Ozone: Update ozone to latest ratis snapshot build 
(0.1.1-alpha-00f80b4-SNAPSHOT)
 Key: HDFS-12986
 URL: https://issues.apache.org/jira/browse/HDFS-12986
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ozone
Affects Versions: HDFS-7240
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh
 Fix For: HDFS-7240


This jira will update ozone to the latest ratis snapshot build, 
release-0.1.1-alpha-00f80b4-SNAPSHOT.





