[jira] [Reopened] (HDFS-7346) Erasure Coding: perform stripping erasure encoding work given block reader and writer
[ https://issues.apache.org/jira/browse/HDFS-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Bo reopened HDFS-7346: - > Erasure Coding: perform stripping erasure encoding work given block reader > and writer > - > > Key: HDFS-7346 > URL: https://issues.apache.org/jira/browse/HDFS-7346 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Kai Zheng >Assignee: Li Bo > > This assumes the facilities like block reader and writer are ready, > implements and performs erasure encoding work in *stripping* case utilizing > erasure codec and coder provided by the codec framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8430) Erasure coding: compute file checksum for stripe files
[ https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095978#comment-15095978 ] Kai Zheng commented on HDFS-8430: -
Well, I may wait a few more days for comments. In any case, I will proceed next week and provide a formal patch along the lines I summarized above. In the initial version, very probably:
* the new API {{getFileChecksum}} may try distributing the computing task to DataNodes to avoid network congestion in the client, as Nicholas said;
* it will use the current MD5MD5CRC32 approach rather than CRC64, leaving that for a subsequent revision or follow-on task according to review comments.
> Erasure coding: compute file checksum for stripe files > -- > > Key: HDFS-8430 > URL: https://issues.apache.org/jira/browse/HDFS-8430 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Walter Su >Assignee: Kai Zheng > Attachments: HDFS-8430-poc1.patch > > > HADOOP-3981 introduces a distributed file checksum algorithm. It's designed > for replicated block. > {{DFSClient.getFileChecksum()}} need some updates, so it can work for striped > block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
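For readers unfamiliar with the MD5MD5CRC32 approach mentioned in the comment above, the following is a minimal, self-contained sketch of the general idea: a CRC32 per chunk, an MD5 over each block's CRCs, and a final MD5 over the block MD5s. The class name, method name, and the chunk and block sizes are hypothetical and chosen only for illustration. This is not the DFSClient code, it ignores striping entirely, and the exact bytes it produces are not meant to match what HDFS computes.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.CRC32;

public class Md5Md5Crc32Sketch {

    /**
     * Simplified composite checksum (illustrative only):
     * MD5 over the per-block MD5s, where each block MD5 is taken
     * over that block's per-chunk CRC32 values.
     */
    public static byte[] checksum(byte[] data, int bytesPerCrc, int bytesPerBlock) {
        MessageDigest fileMd5 = newMd5();
        for (int blockStart = 0; blockStart < data.length; blockStart += bytesPerBlock) {
            int blockEnd = Math.min(data.length, blockStart + bytesPerBlock);
            MessageDigest blockMd5 = newMd5();
            for (int off = blockStart; off < blockEnd; off += bytesPerCrc) {
                int len = Math.min(bytesPerCrc, blockEnd - off);
                CRC32 crc = new CRC32();
                crc.update(data, off, len);
                // Feed the 4-byte CRC of this chunk into the block-level MD5.
                blockMd5.update(ByteBuffer.allocate(4).putInt((int) crc.getValue()).array());
            }
            // Feed the block-level MD5 into the file-level MD5.
            fileMd5.update(blockMd5.digest());
        }
        return fileMd5.digest();
    }

    private static MessageDigest newMd5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is expected to be available on every JVM", e);
        }
    }
}
```

In this sketch the result depends on the chunk and block boundaries as well as on the data, so the same bytes laid out with a different block size give a different digest. That sensitivity to layout is presumably part of why a scheme designed for replicated blocks needs updating before it can work for striped block groups.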
[jira] [Updated] (HDFS-7346) Erasure Coding: perform stripping erasure encoding work given block reader and writer
[ https://issues.apache.org/jira/browse/HDFS-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Bo updated HDFS-7346: Release Note: (was: The jira is very old and close it because we'll not handle it in the near future.) > Erasure Coding: perform stripping erasure encoding work given block reader > and writer > - > > Key: HDFS-7346 > URL: https://issues.apache.org/jira/browse/HDFS-7346 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Kai Zheng >Assignee: Li Bo > > This assumes the facilities like block reader and writer are ready, > implements and performs erasure encoding work in *stripping* case utilizing > erasure codec and coder provided by the codec framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9415) Document dfs.cluster.administrators and dfs.permissions.superusergroup
[ https://issues.apache.org/jira/browse/HDFS-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096192#comment-15096192 ] Daniel Templeton commented on HDFS-9415: ' * ' is valid? I thought only one space was allowed, or at least specified to be allowed. > Document dfs.cluster.administrators and dfs.permissions.superusergroup > -- > > Key: HDFS-9415 > URL: https://issues.apache.org/jira/browse/HDFS-9415 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Reporter: Arpit Agarwal >Assignee: Xiaobing Zhou > Attachments: HDFS-9415.001.patch, HDFS-9415.002.patch, > HDFS-9415.003.patch > > > dfs.cluster.administrators and dfs.permissions.superusergroup documentation > is not clear enough. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9643) libhdfs++: Support async cancellation of read operations
[ https://issues.apache.org/jira/browse/HDFS-9643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-9643: -- Attachment: HDFS-9643.HDFS-8707.000.patch
Initial patch. I've manually tested it, but I need to sort out the failures hidden by HDFS-9610 before I can write decent unit tests.
Open questions:
- Right now the cancel logic is added directly to each continuation in the remote block reader. On one hand this is simple and works; on the other it's boilerplate code. Is this worth pushing into the continuation pipeline code at the moment? I think it's worth keeping it simple until NN operations become cancelable.
- In this implementation FileHandle::CancelOperations is irreversible and prevents the FileHandle from being used again. Can anyone think of a reason not to have it also close the file or at least clear vector?
- Should the FileHandle have a callback when it knows that there are no pending operations? It should be possible to just check the reference count on the CancelHandle to verify.
> libhdfs++: Support async cancellation of read operations > > > Key: HDFS-9643 > URL: https://issues.apache.org/jira/browse/HDFS-9643 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-9643.HDFS-8707.000.patch > > > It should be possible for any thread to cancel operations in progress on a > FileHandle. Any ephemeral objects created by the FileHandle should free > resources as quickly as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-1312) Re-balance disks within a Datanode
[ https://issues.apache.org/jira/browse/HDFS-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096827#comment-15096827 ] Chris Trezzo commented on HDFS-1312: I will dial into the call as well. Thanks for posting. > Re-balance disks within a Datanode > -- > > Key: HDFS-1312 > URL: https://issues.apache.org/jira/browse/HDFS-1312 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode >Reporter: Travis Crawford >Assignee: Anu Engineer > Attachments: Architecture_and_testplan.pdf, disk-balancer-proposal.pdf > > > Filing this issue in response to ``full disk woes`` on hdfs-user. > Datanodes fill their storage directories unevenly, leading to situations > where certain disks are full while others are significantly less used. Users > at many different sites have experienced this issue, and HDFS administrators > are taking steps like: > - Manually rebalancing blocks in storage directories > - Decomissioning nodes & later readding them > There's a tradeoff between making use of all available spindles, and filling > disks at the sameish rate. Possible solutions include: > - Weighting less-used disks heavier when placing new blocks on the datanode. > In write-heavy environments this will still make use of all spindles, > equalizing disk use over time. > - Rebalancing blocks locally. This would help equalize disk use as disks are > added/replaced in older cluster nodes. > Datanodes should actively manage their local disk so operator intervention is > not needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9624) DataNode start slowly due to the initial DU command operations
[ https://issues.apache.org/jira/browse/HDFS-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096845#comment-15096845 ] Andrew Wang commented on HDFS-9624: --- Hi [~linyiqun] thanks for revving the patch, First two fixes look good, but the test needs a little more work. What I meant by timer injection is something like the org.apache.hadoop.util.Timer class, it lets you explicitly advance the time rather than having to wait for the system clock to advance. This means the test will run in milliseconds instead of seconds, which is a lot faster. There are some examples for how to mock Timer in other unit tests, let me know if it's still unclear though. > DataNode start slowly due to the initial DU command operations > -- > > Key: HDFS-9624 > URL: https://issues.apache.org/jira/browse/HDFS-9624 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: HDFS-9624.001.patch, HDFS-9624.002.patch, > HDFS-9624.003.patch, HDFS-9624.004.patch, HDFS-9624.005.patch > > > It seems starting datanode so slowly when I am finishing migration of > datanodes and restart them.I look the dn logs: > {code} > 2016-01-06 16:05:08,118 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added > new volume: DS-70097061-42f8-4c33-ac27-2a6ca21e60d4 > 2016-01-06 16:05:08,118 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added > volume - /home/data/data/hadoop/dfs/data/data12/current, StorageType: DISK > 2016-01-06 16:05:08,176 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: > Registered FSDatasetState MBean > 2016-01-06 16:05:08,177 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Adding > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 > 2016-01-06 16:05:08,178 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool 
BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data2/current... > 2016-01-06 16:05:08,179 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data3/current... > 2016-01-06 16:05:08,179 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data4/current... > 2016-01-06 16:05:08,179 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data5/current... > 2016-01-06 16:05:08,180 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data6/current... > 2016-01-06 16:05:08,180 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data7/current... > 2016-01-06 16:05:08,180 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data8/current... > 2016-01-06 16:05:08,180 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data9/current... > 2016-01-06 16:05:08,181 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data10/current... 
> 2016-01-06 16:05:08,181 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data11/current... > 2016-01-06 16:05:08,181 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data12/current... > 2016-01-06 16:09:49,646 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time > taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on > /home/data/data/hadoop/dfs/data/data7/current: 281466ms > 2016-01-06 16:09:54,235 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time > taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on >
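The timer injection suggested in the review comment on HDFS-9624 above can be sketched roughly as follows. This is a self-contained illustration, not the patch under review: {{Clock}}, {{FakeClock}} and {{CachedUsage}} are hypothetical names, and {{Clock}} is only a stand-in for an injectable timer such as org.apache.hadoop.util.Timer, whose actual API is not reproduced here.

```java
import java.util.function.LongSupplier;

public class TimerInjectionSketch {

    /** Hypothetical injectable time source; a stand-in, not Hadoop's Timer class. */
    public interface Clock {
        long monotonicNowMs();
    }

    /** Test double whose time only moves when the test says so. */
    public static final class FakeClock implements Clock {
        private long nowMs;

        @Override
        public long monotonicNowMs() {
            return nowMs;
        }

        public void advance(long ms) {
            nowMs += ms;
        }
    }

    /** Caches an expensive measurement (think: a du scan) for a fixed interval. */
    public static final class CachedUsage {
        private final Clock clock;
        private final long refreshIntervalMs;
        private final LongSupplier expensiveScan;
        private boolean loaded;
        private long lastRefreshMs;
        private long cached;

        public CachedUsage(Clock clock, long refreshIntervalMs, LongSupplier expensiveScan) {
            this.clock = clock;
            this.refreshIntervalMs = refreshIntervalMs;
            this.expensiveScan = expensiveScan;
        }

        public long get() {
            long now = clock.monotonicNowMs();
            // Re-run the scan only on first use or once the interval has elapsed.
            if (!loaded || now - lastRefreshMs >= refreshIntervalMs) {
                cached = expensiveScan.getAsLong();
                lastRefreshMs = now;
                loaded = true;
            }
            return cached;
        }
    }
}
```

A test can then construct {{CachedUsage}} with a {{FakeClock}}, call {{advance(...)}} to step past the refresh interval, and check whether the scan ran again, all without sleeping on the system clock.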
[jira] [Updated] (HDFS-9643) libhdfs++: Support async cancellation of read operations
[ https://issues.apache.org/jira/browse/HDFS-9643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-9643: -- Status: Patch Available (was: Open) > libhdfs++: Support async cancellation of read operations > > > Key: HDFS-9643 > URL: https://issues.apache.org/jira/browse/HDFS-9643 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-9643.HDFS-8707.000.patch > > > It should be possible for any thread to cancel operations in progress on a > FileHandle. Any ephemeral objects created by the FileHandle should free > resources as quickly as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-6221) Webhdfs should recover from dead DNs
[ https://issues.apache.org/jira/browse/HDFS-6221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee resolved HDFS-6221. -- Resolution: Not A Problem > Webhdfs should recover from dead DNs > > > Key: HDFS-6221 > URL: https://issues.apache.org/jira/browse/HDFS-6221 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, webhdfs >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp > > We've repeatedly observed the jetty acceptor thread silently dying in the > DNs. The webhdfs servlet may also "disappear" and jetty returns non-json > 404s. > One approach to make webhdfs more resilient to bad DNs is dfsclient-like > fetching of block locations to directly access the DNs instead of relying on > a NN redirect that may repeatedly send the client to the same faulty DN(s). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9643) libhdfs++: Support async cancellation of read operations
[ https://issues.apache.org/jira/browse/HDFS-9643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096969#comment-15096969 ] Hadoop QA commented on HDFS-9643: -
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 7m 35s {color} | {color:red} Docker failed to build yetus/hadoop:0cf5e66. {color} |
\\ \\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12782121/HDFS-9643.HDFS-8707.000.patch |
| JIRA Issue | HDFS-9643 |
| Powered by | Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/14115/console |
This message was automatically generated.
> libhdfs++: Support async cancellation of read operations > > > Key: HDFS-9643 > URL: https://issues.apache.org/jira/browse/HDFS-9643 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-9643.HDFS-8707.000.patch > > > It should be possible for any thread to cancel operations in progress on a > FileHandle. Any ephemeral objects created by the FileHandle should free > resources as quickly as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9613) Avoid checking file checksums after copy when possible
[ https://issues.apache.org/jira/browse/HDFS-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096784#comment-15096784 ] Yongjun Zhang commented on HDFS-9613: -
Hi [~drankye],
Thanks for clarifying, and sorry for my delayed reply; I was stuck with a critical issue. I did not have time to do a very thorough review, but here are some comments.
# Good idea to separate the clean-up code (including most of the import statement changes) into a different jira, HDFS-9630. I suggest pruning the patch for this jira so that it only addresses the checksum checking change.
# The conditions that determine whether checksum comparison is needed seem to be:
## cond0: whether the file system supports checksums
## cond1: skipCrc
## cond2: fileAttributes.contains(FileAttribute.CHECKSUMTYPE)
## cond3: fileAttributes.contains(FileAttribute.BLOCKSIZE)
## cond4: (sourceFileStatus.getBlockSize() == targetFS.getDefaultBlockSize(targetPath))
# Some derived logic:
## !cond0 ==> cond1 will be ignored
## If cond2 is true, we still compare checksums even if cond1 is true, which is not intuitive. We should issue a warning message at the parameter checking stage.
## If cond2 implies cond3, we probably need to enforce that cond3 is true when cond2 is true at parameter checking time, but this enforcement may not be backward compatible. At the least we need to issue a warning message.
## cond3 ==> cond4
The combined logic may be:
* boolean needToCompareChecksum = cond0 && ((!cond1) || cond2) && (cond3 || cond4);
I may be wrong here, but I wonder if this makes sense.
Hi [~jingzhao], thanks for your earlier comment; you are welcome to discuss further.
Thanks.
> Avoid checking file checksums after copy when possible > -- > > Key: HDFS-9613 > URL: https://issues.apache.org/jira/browse/HDFS-9613 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Kai Zheng >Assignee: Kai Zheng >Priority: Minor > Attachments: HDFS-9613-v1.patch, HDFS-9613-v2.patch > > > While working on related issue, it was noticed there are some places in > {{distcp}} that's better to be improved and cleaned up. Particularly, after a > file is coped to target cluster, it will check the copied file is fine or > not. For replicated files, when checking, if the source block size and > checksum option are not preserved while copying, we can avoid comparing the > file checksums, which may save some time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
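The combined condition proposed in the review comment on HDFS-9613 can be written out as a small, testable predicate. The sketch below only restates the commenter's tentative formula (he notes himself that it may be wrong); the class, method, and parameter names are hypothetical and this is not distcp's actual code. The parameters correspond to cond0 through cond4 in the comment.

```java
public class ChecksumDecisionSketch {

    /**
     * Restates the proposed formula:
     * cond0 && ((!cond1) || cond2) && (cond3 || cond4)
     *
     * @param fsSupportsChecksum   cond0: the file system supports checksums
     * @param skipCrc              cond1: the skipCrc option is set
     * @param preserveChecksumType cond2: CHECKSUMTYPE is among the preserved attributes
     * @param preserveBlockSize    cond3: BLOCKSIZE is among the preserved attributes
     * @param blockSizesMatch      cond4: source block size equals the target default
     */
    public static boolean needToCompareChecksum(boolean fsSupportsChecksum,
                                                boolean skipCrc,
                                                boolean preserveChecksumType,
                                                boolean preserveBlockSize,
                                                boolean blockSizesMatch) {
        return fsSupportsChecksum
            && (!skipCrc || preserveChecksumType)
            && (preserveBlockSize || blockSizesMatch);
    }
}
```

Writing the formula this way makes the non-intuitive case called out in the comment easy to see: with skipCrc set and the checksum type preserved, the predicate still returns true whenever the block size condition holds.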
[jira] [Updated] (HDFS-9094) Add command line option to ask NameNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HDFS-9094: Attachment: HDFS-9094-HDFS-9000.004.patch > Add command line option to ask NameNode reload configuration. > - > > Key: HDFS-9094 > URL: https://issues.apache.org/jira/browse/HDFS-9094 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-9094-HDFS-9000.002.patch, > HDFS-9094-HDFS-9000.003.patch, HDFS-9094-HDFS-9000.004.patch, > HDFS-9094.001.patch > > > This work is going to add DFS admin command that allows reloading NameNode > configuration. This is sibling work related to HDFS-6808. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9415) Document dfs.cluster.administrators and dfs.permissions.superusergroup
[ https://issues.apache.org/jira/browse/HDFS-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096950#comment-15096950 ] Xiaobing Zhou commented on HDFS-9415: - Yes, that's valid. There is one test case in TestAccessControlList#testWildCardAccessControlList. > Document dfs.cluster.administrators and dfs.permissions.superusergroup > -- > > Key: HDFS-9415 > URL: https://issues.apache.org/jira/browse/HDFS-9415 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Reporter: Arpit Agarwal >Assignee: Xiaobing Zhou > Attachments: HDFS-9415.001.patch, HDFS-9415.002.patch, > HDFS-9415.003.patch > > > dfs.cluster.administrators and dfs.permissions.superusergroup documentation > is not clear enough. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6221) Webhdfs should recover from dead DNs
[ https://issues.apache.org/jira/browse/HDFS-6221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096830#comment-15096830 ] Kihwal Lee commented on HDFS-6221: -- bq. We've repeatedly observed the jetty acceptor thread silently dying in the DNs After converting DN to use netty, this is no longer a problem. jetty is still there, but only handles infrequent non-webhdfs http requests. > Webhdfs should recover from dead DNs > > > Key: HDFS-6221 > URL: https://issues.apache.org/jira/browse/HDFS-6221 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, webhdfs >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp > > We've repeatedly observed the jetty acceptor thread silently dying in the > DNs. The webhdfs servlet may also "disappear" and jetty returns non-json > 404s. > One approach to make webhdfs more resilient to bad DNs is dfsclient-like > fetching of block locations to directly access the DNs instead of relying on > a NN redirect that may repeatedly send the client to the same faulty DN(s). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9635) Add one more volume choosing policy with considering volume IO load
[ https://issues.apache.org/jira/browse/HDFS-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096871#comment-15096871 ] Andrew Wang commented on HDFS-9635: --- Hmm, so to clarify, do you plan to extend AvailableSpaceVolumeChoosingPolicy with IO load information, or write a new policy? I'd like to see it in ASVCP if possible, and if you're already using this policy by default, it sounds like this would work for you too. As you mention, IO wait is a great way of measuring load on a disk. We can try to collect it in HDFS, but the OS also exposes IO wait information (e.g. iostat). IMO the OS info is better since it's more complete. The OS is aware of the actual writes to disk, whereas HDFS is getting buffered by page cache. Also, HDFS's IO wait info will only be as up-to-date as the last time it wrote, which is an issue when HDFS shares disks with other apps like MR (common). In any case, I'm sure there'll be some experimentation to find the right signals and thresholds. Looking forward to your findings! > Add one more volume choosing policy with considering volume IO load > --- > > Key: HDFS-9635 > URL: https://issues.apache.org/jira/browse/HDFS-9635 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Yong Zhang >Assignee: Yong Zhang > > We have RoundRobinVolumeChoosingPolicy and > AvailableSpaceVolumeChoosingPolicy, but both not consider volume IO load. > This jira will add a Add one more volume choosing policy base on how many > xceiver count on volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9415) Document dfs.cluster.administrators and dfs.permissions.superusergroup
[ https://issues.apache.org/jira/browse/HDFS-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096997#comment-15096997 ] Arpit Agarwal commented on HDFS-9415: - The test case does make it explicit that we accept {{"*"}} as a valid wildcard. I see no harm in documenting it if we are going to document the wildcard behavior. > Document dfs.cluster.administrators and dfs.permissions.superusergroup > -- > > Key: HDFS-9415 > URL: https://issues.apache.org/jira/browse/HDFS-9415 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Reporter: Arpit Agarwal >Assignee: Xiaobing Zhou > Attachments: HDFS-9415.001.patch, HDFS-9415.002.patch, > HDFS-9415.003.patch > > > dfs.cluster.administrators and dfs.permissions.superusergroup documentation > is not clear enough. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9094) Add command line option to ask NameNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096858#comment-15096858 ] Xiaobing Zhou commented on HDFS-9094: - Thanks [~arpitagarwal] for review, patch V004 fixed the issues. > Add command line option to ask NameNode reload configuration. > - > > Key: HDFS-9094 > URL: https://issues.apache.org/jira/browse/HDFS-9094 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-9094-HDFS-9000.002.patch, > HDFS-9094-HDFS-9000.003.patch, HDFS-9094-HDFS-9000.004.patch, > HDFS-9094.001.patch > > > This work is going to add DFS admin command that allows reloading NameNode > configuration. This is sibling work related to HDFS-6808. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9635) Add one more volume choosing policy with considering volume IO load
[ https://issues.apache.org/jira/browse/HDFS-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096928#comment-15096928 ] Anu Engineer commented on HDFS-9635: [~java8964] Thanks for the comments. I am looking at this from the perspective of HDFS-1312. Both HDFS-8538 and HDFS-1312 try to minimize the internal disk usage imbalance, that is, disks holding different amounts of data. However, if we use *only* volume IO as the criterion for selecting where a block is placed, we might rapidly run into a situation where a set of small writes all go to one disk and some large writes go to another. To avoid that scenario, would it not make sense to combine this with HDFS-1804 and solve HDFS-8538? Just as placing a block without considering I/O is not very efficient (as this JIRA illustrates), I think a block placement policy that does not consider how much space is left on a volume can create other forms of inefficiency. In fact, I think we might end up with two incomplete solutions instead of one whole working solution. As for the OS-level I/O, it is an aspirational goal. It is more than enough if we consider only the HDFS I/O happening on a volume. Would you please let me know if I am missing any use case for your clusters if we just solve HDFS-8538? > Add one more volume choosing policy with considering volume IO load > --- > > Key: HDFS-9635 > URL: https://issues.apache.org/jira/browse/HDFS-9635 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Yong Zhang >Assignee: Yong Zhang > > We have RoundRobinVolumeChoosingPolicy and > AvailableSpaceVolumeChoosingPolicy, but both not consider volume IO load. > This jira will add a Add one more volume choosing policy base on how many > xceiver count on volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
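One way to picture the combination argued for in the comment above (available space and IO load considered together rather than separately) is the sketch below. Everything in it is an assumption made for illustration: the {{Volume}} class, the normalised score, and the {{ioWeight}} parameter are hypothetical, and it does not implement Hadoop's VolumeChoosingPolicy interface or any of the policies named in this thread.

```java
import java.util.List;
import java.util.Optional;

public class VolumeChoiceSketch {

    /** Hypothetical view of a DataNode volume. */
    public static final class Volume {
        public final String name;
        public final long availableBytes;
        public final int activeXceivers; // stand-in for "IO load"

        public Volume(String name, long availableBytes, int activeXceivers) {
            this.name = name;
            this.availableBytes = availableBytes;
            this.activeXceivers = activeXceivers;
        }
    }

    /**
     * Picks the volume with the best combined score among those that can
     * hold the block: relative free space minus ioWeight times relative load.
     */
    public static Optional<Volume> choose(List<Volume> volumes, long blockSize, double ioWeight) {
        long maxAvailable = 0;
        int maxLoad = 0;
        for (Volume v : volumes) {
            maxAvailable = Math.max(maxAvailable, v.availableBytes);
            maxLoad = Math.max(maxLoad, v.activeXceivers);
        }
        Volume best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Volume v : volumes) {
            if (v.availableBytes < blockSize) {
                continue; // cannot hold the block at all
            }
            double space = maxAvailable == 0 ? 0.0 : (double) v.availableBytes / maxAvailable;
            double load = maxLoad == 0 ? 0.0 : (double) v.activeXceivers / maxLoad;
            double score = space - ioWeight * load;
            if (score > bestScore) {
                best = v;
                bestScore = score;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

With {{ioWeight}} set to 0 this degenerates into a pure "most free space" choice, and with a large weight it approaches a pure "least loaded" choice. Finding a sensible value in between is the kind of experimentation the earlier comment in this thread anticipates.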
[jira] [Commented] (HDFS-9415) Document dfs.cluster.administrators and dfs.permissions.superusergroup
[ https://issues.apache.org/jira/browse/HDFS-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096968#comment-15096968 ] Daniel Templeton commented on HDFS-9415: Just because there's a test case, it doesn't mean it's a valid configuration. In the case of ' * ', the string is split on the first space, giving a user of '' and a group of '* '. The group is then trimmed before splitting on comma, giving groups of \['*'\]. That's a long way to say that ' * ' is a poorly formatted version of ' *' and hence should not be mentioned in the docs. > Document dfs.cluster.administrators and dfs.permissions.superusergroup > -- > > Key: HDFS-9415 > URL: https://issues.apache.org/jira/browse/HDFS-9415 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Reporter: Arpit Agarwal >Assignee: Xiaobing Zhou > Attachments: HDFS-9415.001.patch, HDFS-9415.002.patch, > HDFS-9415.003.patch > > > dfs.cluster.administrators and dfs.permissions.superusergroup documentation > is not clear enough. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
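The parsing steps described in the comment above (split on the first space, trim the group part, then split on commas) can be traced with a short, self-contained sketch. It follows the steps as the comment states them and is not a copy of Hadoop's AccessControlList; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AclStringSketch {

    /** Returns a two-element list: the parsed users, then the parsed groups. */
    public static List<List<String>> parse(String acl) {
        // Split on the first space only, so " * " yields "" and "* ".
        String[] parts = acl.split(" ", 2);
        String users = parts[0];
        // The group part is trimmed before it is split on commas.
        String groups = parts.length > 1 ? parts[1].trim() : "";
        return Arrays.asList(splitOnComma(users), splitOnComma(groups));
    }

    private static List<String> splitOnComma(String s) {
        List<String> out = new ArrayList<>();
        for (String item : s.split(",")) {
            String trimmed = item.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }
}
```

Under these steps, {{parse(" * ")}} and {{parse(" *")}} both produce no users and a single group entry of {{*}}, which is the point the comment makes about the trailing space.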
[jira] [Created] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing
Wei-Chiu Chuang created HDFS-9648: - Summary: Test TestStartup.testImageChecksum keeps failing Key: HDFS-9648 URL: https://issues.apache.org/jira/browse/HDFS-9648 Project: Hadoop HDFS Issue Type: Bug Environment: Jenkins Reporter: Wei-Chiu Chuang I saw the Jenkins log shows TestStartup.testImageChecksum has been failing consecutively 5 times. https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing
[ https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-9648: -- Description: I saw the Jenkins log shows TestStartup.testImageChecksum has been failing consecutively 5 times. https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/ Seems like HDFS-9569 by Yongjun changed exception message, and this test was looking for the exact message. was: I saw the Jenkins log shows TestStartup.testImageChecksum has been failing consecutively 5 times. https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/ > Test TestStartup.testImageChecksum keeps failing > - > > Key: HDFS-9648 > URL: https://issues.apache.org/jira/browse/HDFS-9648 > Project: Hadoop HDFS > Issue Type: Bug > Environment: Jenkins >Reporter: Wei-Chiu Chuang > > I saw the Jenkins log shows TestStartup.testImageChecksum has been failing > consecutively 5 times. > https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/ > Seems like HDFS-9569 by Yongjun changed exception message, and this test was > looking for the exact message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9624) DataNode start slowly due to the initial DU command operations
[ https://issues.apache.org/jira/browse/HDFS-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097695#comment-15097695 ] Hadoop QA commented on HDFS-9624: -
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 52s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 22s {color} | {color:red} Patch generated 4 new checkstyle issues in hadoop-hdfs-project/hadoop-hdfs (total was 543, now 546). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 51m 55s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 50m 7s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 127m 56s {color} | {color:black} {color} |
\\ \\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.server.namenode.TestNNThroughputBenchmark |
| | hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion |
| | hadoop.hdfs.server.namenode.TestStartup |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
| JDK v1.7.0_91 Failed junit tests | hadoop.hdfs.server.namenode.TestStartup |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL |
[jira] [Commented] (HDFS-9415) Document dfs.cluster.administrators and dfs.permissions.superusergroup
[ https://issues.apache.org/jira/browse/HDFS-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097033#comment-15097033 ] Daniel Templeton commented on HDFS-9415: My concern is that we clearly state that the correct format is 'users groups', but then say that ' * ' is also valid, which doesn't follow that format. I don't see how that can improve clarity. If what you want to say is that trailing spaces are allowed, then say that instead. (It's probably not a bad thing to add in any case.) I will yield on this one. It's not worth arguing over 7 characters. :) > Document dfs.cluster.administrators and dfs.permissions.superusergroup > -- > > Key: HDFS-9415 > URL: https://issues.apache.org/jira/browse/HDFS-9415 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Reporter: Arpit Agarwal >Assignee: Xiaobing Zhou > Attachments: HDFS-9415.001.patch, HDFS-9415.002.patch, > HDFS-9415.003.patch > > > dfs.cluster.administrators and dfs.permissions.superusergroup documentation > is not clear enough. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9094) Add command line option to ask NameNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097143#comment-15097143 ] Hadoop QA commented on HDFS-9094: -
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 15s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 46s | trunk passed |
| +1 | compile | 1m 47s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 1m 35s | trunk passed with JDK v1.7.0_91 |
| +1 | checkstyle | 0m 25s | trunk passed |
| +1 | mvnsite | 1m 31s | trunk passed |
| +1 | mvneclipse | 0m 25s | trunk passed |
| +1 | findbugs | 3m 53s | trunk passed |
| +1 | javadoc | 1m 39s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 2m 19s | trunk passed with JDK v1.7.0_91 |
| 0 | mvndep | 0m 16s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 20s | the patch passed |
| +1 | compile | 1m 48s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 1m 48s | the patch passed |
| +1 | compile | 1m 34s | the patch passed with JDK v1.7.0_91 |
| +1 | javac | 1m 34s | the patch passed |
| -1 | checkstyle | 0m 25s | Patch generated 16 new checkstyle issues in hadoop-hdfs-project (total was 307, now 315). |
| +1 | mvnsite | 1m 25s | the patch passed |
| +1 | mvneclipse | 0m 21s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 4m 5s | the patch passed |
| +1 | javadoc | 1m 33s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 2m 15s | the patch passed with JDK v1.7.0_91 |
| +1 | unit | 0m 59s | hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. |
| -1 | unit | 69m 43s | hadoop-hdfs in the patch failed with JDK v1.8.0_66. |
| +1 | unit | 0m 57s | hadoop-hdfs-client in the patch passed with JDK v1.7.0_91. |
| -1 | unit | 56m 7s | hadoop-hdfs in the patch failed with JDK v1.7.0_91. |
| +1 | asflicense | 0m 21s | Patch does not generate ASF License warnings. |
| | | 167m 43s | |

||
[jira] [Updated] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block
[ https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-9646: Attachment: test-reconstruct-stripe-file.patch Upload the unit test from [~tfukudom] that can reproduce the issue. > ErasureCodingWorker may fail when recovering data blocks with length less > than the first internal block > --- > > Key: HDFS-9646 > URL: https://issues.apache.org/jira/browse/HDFS-9646 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Affects Versions: 3.0.0 >Reporter: Takuya Fukudome >Assignee: Jing Zhao >Priority: Critical > Attachments: test-reconstruct-stripe-file.patch > > > This is reported by [~tfukudom]: ErasureCodingWorker may fail with the > following exception when recovering a non-full internal block. > {code} > 2016-01-06 11:14:44,740 WARN datanode.DataNode > (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: > BP-987302662-172.29.4.13-1450757377698:blk_-9223372036854322288_29751 > java.io.IOException: Transfer failed for all targets. > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9415) Document dfs.cluster.administrators and dfs.permissions.superusergroup
[ https://issues.apache.org/jira/browse/HDFS-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097195#comment-15097195 ] Arpit Agarwal commented on HDFS-9415: - bq. If what you want to say is that trailing spaces are allowed, then say that instead. (It's probably not a bad thing to add in any case.) I never talked about trailing spaces. :-) But I see what you are saying now. The Jira font made it easy to miss the surrounding spaces in your comment. I agree it's fair to omit that one. [~xiaobingo], do you want to post an updated patch that removes the {{" * "}} wildcard option? > Document dfs.cluster.administrators and dfs.permissions.superusergroup > -- > > Key: HDFS-9415 > URL: https://issues.apache.org/jira/browse/HDFS-9415 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Reporter: Arpit Agarwal >Assignee: Xiaobing Zhou > Attachments: HDFS-9415.001.patch, HDFS-9415.002.patch, > HDFS-9415.003.patch > > > dfs.cluster.administrators and dfs.permissions.superusergroup documentation > is not clear enough. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing
[ https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-9648: -- Assignee: (was: Wei-Chiu Chuang) > Test TestStartup.testImageChecksum keeps failing > - > > Key: HDFS-9648 > URL: https://issues.apache.org/jira/browse/HDFS-9648 > Project: Hadoop HDFS > Issue Type: Bug > Environment: Jenkins >Reporter: Wei-Chiu Chuang > > I saw the Jenkins log shows TestStartup.testImageChecksum has been failing > consecutively 5 times. > https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/ > Seems like HDFS-9569 by Yongjun changed exception message, and this test was > looking for the exact message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing
[ https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang reassigned HDFS-9648: - Assignee: Wei-Chiu Chuang > Test TestStartup.testImageChecksum keeps failing > - > > Key: HDFS-9648 > URL: https://issues.apache.org/jira/browse/HDFS-9648 > Project: Hadoop HDFS > Issue Type: Bug > Environment: Jenkins >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > > I saw the Jenkins log shows TestStartup.testImageChecksum has been failing > consecutively 5 times. > https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/ > Seems like HDFS-9569 by Yongjun changed exception message, and this test was > looking for the exact message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing
[ https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-9648: -- Description: I saw the Jenkins log shows TestStartup.testImageChecksum has been failing consecutively 5 times. https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/ Seems like HDFS-9569 by Yongjun changed exception message, and this test was looking for the exact message. Expected to find 'Failed to load an FSImage file!' but got unexpected exception:java.io.IOException: Failed to load FSImage file, see error(s) above for more info. was: I saw the Jenkins log shows TestStartup.testImageChecksum has been failing consecutively 5 times. https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/ Seems like HDFS-9569 by Yongjun changed exception message, and this test was looking for the exact message. > Test TestStartup.testImageChecksum keeps failing > - > > Key: HDFS-9648 > URL: https://issues.apache.org/jira/browse/HDFS-9648 > Project: Hadoop HDFS > Issue Type: Bug > Environment: Jenkins >Reporter: Wei-Chiu Chuang > > I saw the Jenkins log shows TestStartup.testImageChecksum has been failing > consecutively 5 times. > https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/ > Seems like HDFS-9569 by Yongjun changed exception message, and this test was > looking for the exact message. > Expected to find 'Failed to load an FSImage file!' but got unexpected > exception:java.io.IOException: Failed to load FSImage file, see error(s) > above for more info. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing
[ https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-9648: -- Attachment: HDFS-9648.001.patch Rev01: match the new exception message. > Test TestStartup.testImageChecksum keeps failing > - > > Key: HDFS-9648 > URL: https://issues.apache.org/jira/browse/HDFS-9648 > Project: Hadoop HDFS > Issue Type: Bug > Environment: Jenkins >Reporter: Wei-Chiu Chuang > Attachments: HDFS-9648.001.patch > > > I saw the Jenkins log shows TestStartup.testImageChecksum has been failing > consecutively 5 times. > https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/ > Seems like HDFS-9569 by Yongjun changed exception message, and this test was > looking for the exact message. > Expected to find 'Failed to load an FSImage file!' but got unexpected > exception:java.io.IOException: Failed to load FSImage file, see error(s) > above for more info. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
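The failure described in this issue comes from a test that looks for specific exception text. The following is a minimal, self-contained sketch (not the actual TestStartup code or the HDFS-9648 patch) showing why the old expectation no longer matches the message quoted above. The class and the `messageContains` helper are hypothetical names introduced only for illustration.

```java
import java.io.IOException;

public class ExceptionMessageCheck {
    // Hypothetical helper: true when the throwable's message contains the fragment.
    static boolean messageContains(Throwable t, String fragment) {
        String message = t.getMessage();
        return message != null && message.contains(fragment);
    }

    public static void main(String[] args) {
        // The message text is the one quoted in the issue description.
        IOException e = new IOException(
            "Failed to load FSImage file, see error(s) above for more info.");

        // Old expectation from the test: no longer present in the message.
        System.out.println(messageContains(e, "Failed to load an FSImage file!")); // prints false

        // A fragment taken from the new message matches.
        System.out.println(messageContains(e, "Failed to load FSImage file"));     // prints true
    }
}
```

Matching on a short, stable fragment rather than a full sentence may make such a test less sensitive to wording changes, at the cost of a weaker check.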
[jira] [Commented] (HDFS-8430) Erasure coding: compute file checksum for stripe files
[ https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097645#comment-15097645 ] Tsz Wo Nicholas Sze commented on HDFS-8430: --- [~drankye], sorry for the late reply. Your suggestion sounds good in general. Some minor comments: > First, add a new API like getFileChecksum(int cell) using the New Algorithm > 2. ... It is better to add the new API as getFileChecksum(String algorithm) since it is more general and more in sync with Java APIs such as MessageDigest. We don't want to change/modify the FileSystem API further if we want to support different algorithms in the future. We may need another FileSystem API supportFileChecksum(String algorithm) for distcp or other tools to check if a particular algorithm is supported; see below. > distcp will be updated to favor the new APIs and use the two APIs > appropriately. ... distcp probably needs to first check if the same algorithm is supported in both the source and the destination clusters. If they don't support the same algorithm, it may fall back to using file length. Thanks a lot! > Erasure coding: compute file checksum for stripe files > -- > > Key: HDFS-8430 > URL: https://issues.apache.org/jira/browse/HDFS-8430 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Walter Su >Assignee: Kai Zheng > Attachments: HDFS-8430-poc1.patch > > > HADOOP-3981 introduces a distributed file checksum algorithm. It's designed > for replicated block. > {{DFSClient.getFileChecksum()}} need some updates, so it can work for striped > block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
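The check-then-fall-back flow suggested in this comment can be sketched as follows. This is only an illustration under stated assumptions: `supportFileChecksum(String)` is the API proposed in the comment, not an existing FileSystem method, and the `ChecksumCapable` interface, the `chooseAlgorithm` helper, and the algorithm name strings are hypothetical placeholders.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ChecksumNegotiation {
    /** Hypothetical stand-in for a file system that can report supported checksum algorithms. */
    interface ChecksumCapable {
        boolean supportFileChecksum(String algorithm);
    }

    /**
     * Returns the first algorithm, in preference order, that both sides support,
     * or null when there is none (the caller would then fall back to file length).
     */
    static String chooseAlgorithm(ChecksumCapable source, ChecksumCapable target,
                                  List<String> preferred) {
        for (String algorithm : preferred) {
            if (source.supportFileChecksum(algorithm) && target.supportFileChecksum(algorithm)) {
                return algorithm;
            }
        }
        return null;
    }

    /** Builds a toy cluster that supports exactly the given algorithm names. */
    static ChecksumCapable supporting(String... algorithms) {
        Set<String> supported = new HashSet<>(Arrays.asList(algorithms));
        return supported::contains;
    }

    public static void main(String[] args) {
        // Placeholder algorithm names, listed in preference order.
        List<String> preferred = Arrays.asList("CRC64", "MD5MD5CRC32");

        ChecksumCapable source = supporting("MD5MD5CRC32", "CRC64");
        ChecksumCapable olderTarget = supporting("MD5MD5CRC32");
        ChecksumCapable bareTarget = supporting();

        System.out.println(chooseAlgorithm(source, olderTarget, preferred)); // prints MD5MD5CRC32

        String chosen = chooseAlgorithm(source, bareTarget, preferred);
        System.out.println(chosen == null ? "compare file lengths only" : chosen);
    }
}
```

Taking the algorithm as a string keeps the interface open to future algorithms without further API changes, which is the motivation given in the comment.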
[jira] [Commented] (HDFS-8999) Namenode need not wait for {{blockReceived}} for the last block before completing a file.
[ https://issues.apache.org/jira/browse/HDFS-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097717#comment-15097717 ] Hadoop QA commented on HDFS-8999: -
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
| 0 | mvndep | 0m 30s | Maven dependency ordering for branch |
| +1 | mvninstall | 9m 21s | trunk passed |
| +1 | compile | 2m 4s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 2m 2s | trunk passed with JDK v1.7.0_91 |
| +1 | checkstyle | 0m 39s | trunk passed |
| +1 | mvnsite | 1m 41s | trunk passed |
| +1 | mvneclipse | 0m 32s | trunk passed |
| +1 | findbugs | 4m 52s | trunk passed |
| +1 | javadoc | 1m 52s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 2m 38s | trunk passed with JDK v1.7.0_91 |
| 0 | mvndep | 0m 20s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 36s | the patch passed |
| +1 | compile | 1m 53s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 1m 53s | the patch passed |
| +1 | compile | 1m 48s | the patch passed with JDK v1.7.0_91 |
| +1 | javac | 1m 48s | the patch passed |
| -1 | checkstyle | 0m 40s | Patch generated 7 new checkstyle issues in hadoop-hdfs-project (total was 1043, now 1044). |
| +1 | mvnsite | 1m 30s | the patch passed |
| +1 | mvneclipse | 0m 22s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 3m 58s | the patch passed |
| +1 | javadoc | 1m 27s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 2m 9s | the patch passed with JDK v1.7.0_91 |
| +1 | unit | 0m 50s | hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. |
| -1 | unit | 67m 16s | hadoop-hdfs in the patch failed with JDK v1.8.0_66. |
| +1 | unit | 0m 55s | hadoop-hdfs-client in the patch passed with JDK v1.7.0_91. |
| -1 | unit | 66m 18s | hadoop-hdfs in the patch failed with JDK v1.7.0_91. |
| +1 | asflicense | 0m 21s | Patch does not generate ASF License warnings. |
| | | 180m 40s | |

||
[jira] [Assigned] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing
[ https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang reassigned HDFS-9648: - Assignee: Wei-Chiu Chuang > Test TestStartup.testImageChecksum keeps failing > - > > Key: HDFS-9648 > URL: https://issues.apache.org/jira/browse/HDFS-9648 > Project: Hadoop HDFS > Issue Type: Bug > Environment: Jenkins >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Attachments: HDFS-9648.001.patch > > > I saw the Jenkins log shows TestStartup.testImageChecksum has been failing > consecutively 5 times. > https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/ > Seems like HDFS-9569 by Yongjun changed exception message, and this test was > looking for the exact message. > Expected to find 'Failed to load an FSImage file!' but got unexpected > exception:java.io.IOException: Failed to load FSImage file, see error(s) > above for more info. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing
[ https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-9648: -- Labels: test (was: ) > Test TestStartup.testImageChecksum keeps failing > - > > Key: HDFS-9648 > URL: https://issues.apache.org/jira/browse/HDFS-9648 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.0.0 > Environment: Jenkins >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Trivial > Labels: test > Attachments: HDFS-9648.001.patch > > > I saw the Jenkins log shows TestStartup.testImageChecksum has been failing > consecutively 5 times. > https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/ > Seems like HDFS-9569 by Yongjun changed exception message, and this test was > looking for the exact message. > Expected to find 'Failed to load an FSImage file!' but got unexpected > exception:java.io.IOException: Failed to load FSImage file, see error(s) > above for more info. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing
[ https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-9648: -- Priority: Trivial (was: Major) > Test TestStartup.testImageChecksum keeps failing > - > > Key: HDFS-9648 > URL: https://issues.apache.org/jira/browse/HDFS-9648 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 > Environment: Jenkins >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Trivial > Attachments: HDFS-9648.001.patch > > > I saw the Jenkins log shows TestStartup.testImageChecksum has been failing > consecutively 5 times. > https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/ > Seems like HDFS-9569 by Yongjun changed exception message, and this test was > looking for the exact message. > Expected to find 'Failed to load an FSImage file!' but got unexpected > exception:java.io.IOException: Failed to load FSImage file, see error(s) > above for more info. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing
[ https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-9648: -- Affects Version/s: 3.0.0 > Test TestStartup.testImageChecksum keeps failing > - > > Key: HDFS-9648 > URL: https://issues.apache.org/jira/browse/HDFS-9648 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 > Environment: Jenkins >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Attachments: HDFS-9648.001.patch > > > I saw the Jenkins log shows TestStartup.testImageChecksum has been failing > consecutively 5 times. > https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/ > Seems like HDFS-9569 by Yongjun changed exception message, and this test was > looking for the exact message. > Expected to find 'Failed to load an FSImage file!' but got unexpected > exception:java.io.IOException: Failed to load FSImage file, see error(s) > above for more info. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing
[ https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-9648: -- Status: Patch Available (was: Open) > Test TestStartup.testImageChecksum keeps failing > - > > Key: HDFS-9648 > URL: https://issues.apache.org/jira/browse/HDFS-9648 > Project: Hadoop HDFS > Issue Type: Bug > Environment: Jenkins >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Attachments: HDFS-9648.001.patch > > > I saw the Jenkins log shows TestStartup.testImageChecksum has been failing > consecutively 5 times. > https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/ > Seems like HDFS-9569 by Yongjun changed exception message, and this test was > looking for the exact message. > Expected to find 'Failed to load an FSImage file!' but got unexpected > exception:java.io.IOException: Failed to load FSImage file, see error(s) > above for more info. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing
[ https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-9648: -- Component/s: namenode > Test TestStartup.testImageChecksum keeps failing > - > > Key: HDFS-9648 > URL: https://issues.apache.org/jira/browse/HDFS-9648 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.0.0 > Environment: Jenkins >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Trivial > Labels: test > Attachments: HDFS-9648.001.patch > > > I saw the Jenkins log shows TestStartup.testImageChecksum has been failing > consecutively 5 times. > https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/ > Seems like HDFS-9569 by Yongjun changed exception message, and this test was > looking for the exact message. > Expected to find 'Failed to load an FSImage file!' but got unexpected > exception:java.io.IOException: Failed to load FSImage file, see error(s) > above for more info. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8430) Erasure coding: compute file checksum for stripe files
[ https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097692#comment-15097692 ] Kai Zheng commented on HDFS-8430: - Thanks Nicholas for the detailed elaboration and confirmation. I think the comments have resolved all my concerns and questions so far. I will proceed soon, and wish you a nice day! > Erasure coding: compute file checksum for stripe files > -- > > Key: HDFS-8430 > URL: https://issues.apache.org/jira/browse/HDFS-8430 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Walter Su >Assignee: Kai Zheng > Attachments: HDFS-8430-poc1.patch > > > HADOOP-3981 introduces a distributed file checksum algorithm. It's designed > for replicated block. > {{DFSClient.getFileChecksum()}} need some updates, so it can work for striped > block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9612) DistCp worker threads are not terminated after jobs are done.
[ https://issues.apache.org/jira/browse/HDFS-9612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-9612: -- Attachment: HDFS-9612.007.patch Rev07: Thanks [~yzhangal] for comments. I uploaded a patch that addresses most issues. The slf4j change was used in conjunction with GenericTestUtils.setLogLevel to set the log level to DEBUG. GenericTestUtils.setLogLevel is a useful tool, but unfortunately requires slf4j. It is not a necessary part of the fix, so I removed it. About the tests, they use GenericTestUtils.waitForThreadTermination(), which periodically checks whether there are any threads whose name matches the pattern "pool-.*thread.*" (a regular expression). These are the threads created by ExecutorService. If the fix works, those threads should terminate right away after ProducerConsumer.shutdown() is called. > DistCp worker threads are not terminated after jobs are done. > - > > Key: HDFS-9612 > URL: https://issues.apache.org/jira/browse/HDFS-9612 > Project: Hadoop HDFS > Issue Type: Bug > Components: distcp >Affects Versions: 2.8.0 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Attachments: HDFS-9612.001.patch, HDFS-9612.002.patch, > HDFS-9612.003.patch, HDFS-9612.004.patch, HDFS-9612.005.patch, > HDFS-9612.006.patch, HDFS-9612.007.patch > > > In HADOOP-11827, a producer-consumer style thread pool was introduced to > parallelize the task of listing files/directories. > We have a use case where a distcp job is run during the commit phase of a MR2 > job. However, it was found that distcp does not terminate ProducerConsumer thread > pools properly. Because threads are not terminated, those MR2 jobs never > finish. > In a more typical use case where distcp is run as a standalone job, those > threads are terminated forcefully when the java process is terminated. So > these leaked threads did not become a problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
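The thread-name polling described in this comment can be illustrated with a self-contained sketch. It does not reproduce GenericTestUtils.waitForThreadTermination() or the patch; the class and the helpers `countLiveThreadsMatching` and `waitForNoThreads` are hypothetical names, and the sketch relies only on the fact that the default ExecutorService thread factory names its threads "pool-N-thread-M".

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.regex.Pattern;

public class PoolThreadCheck {
    /** Counts live threads whose full name matches the regular expression. */
    static int countLiveThreadsMatching(String regex) {
        Pattern pattern = Pattern.compile(regex);
        int count = 0;
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            if (thread.isAlive() && pattern.matcher(thread.getName()).matches()) {
                count++;
            }
        }
        return count;
    }

    /** Polls until no matching thread is alive; returns false on timeout or interruption. */
    static boolean waitForNoThreads(String regex, long pollMillis, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (countLiveThreadsMatching(regex) > 0) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String regex = "pool-.*thread.*";
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> { });   // forces creation of one worker thread

        System.out.println("live before shutdown: " + countLiveThreadsMatching(regex));
        pool.shutdown();           // without this call the idle worker would stay alive
        System.out.println("terminated: " + waitForNoThreads(regex, 50, 5000));
    }
}
```

Removing the `pool.shutdown()` call in the sketch leaves the non-daemon worker thread alive and keeps the JVM from exiting, which resembles the leak the issue describes.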
[jira] [Commented] (HDFS-9415) Document dfs.cluster.administrators and dfs.permissions.superusergroup
[ https://issues.apache.org/jira/browse/HDFS-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097024#comment-15097024 ] Arpit Agarwal commented on HDFS-9415: - Also IME customers routinely skip the trailing space. {code} dfs.cluster.administrators hdfs {code} I plan to commit this patch later today. > Document dfs.cluster.administrators and dfs.permissions.superusergroup > -- > > Key: HDFS-9415 > URL: https://issues.apache.org/jira/browse/HDFS-9415 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Reporter: Arpit Agarwal >Assignee: Xiaobing Zhou > Attachments: HDFS-9415.001.patch, HDFS-9415.002.patch, > HDFS-9415.003.patch > > > dfs.cluster.administrators and dfs.permissions.superusergroup documentation > is not clear enough. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9466) TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky
[ https://issues.apache.org/jira/browse/HDFS-9466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097201#comment-15097201 ] Wei-Chiu Chuang commented on HDFS-9466: --- [~cmccabe] Xiao is right about what I thought. It does appear there is a race. From your perspective, do you think that's by design, or some unintended bugs in the code? Thanks for the reviews! > TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky > > > Key: HDFS-9466 > URL: https://issues.apache.org/jira/browse/HDFS-9466 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs, hdfs-client >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Attachments: HDFS-9466.001.patch, HDFS-9466.002.patch > > > This test is flaky and fails quite frequently in trunk. > Error Message > expected:<1> but was:<2> > Stacktrace > {noformat} > java.lang.AssertionError: expected:<1> but was:<2> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache$17.accept(TestShortCircuitCache.java:636) > at > org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.visit(ShortCircuitRegistry.java:395) > at > org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.checkNumberOfSegmentsAndSlots(TestShortCircuitCache.java:631) > at > org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.testDataXceiverCleansUpSlotsOnFailure(TestShortCircuitCache.java:684) > {noformat} > Thanks to [~xiaochen] for identifying the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block
[ https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097200#comment-15097200 ]

Jing Zhao commented on HDFS-9646:
-

The failure can also be reproduced with the following change on {{TestRecoverStripedFile}}:
{code}
--- a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRecoverStripedFile.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRecoverStripedFile.java
@@ -212,7 +212,7 @@ private void assertFileBlocksRecovery(String fileName, int fileLen,
     int[] toDead = new int[toRecoverBlockNum];
     int n = 0;
-    for (int i = 0; i < indices.length; i++) {
+    for (int i = indices.length - 1; i >= 0; i--) {
       if (n < toRecoverBlockNum) {
         if (recovery == 0) {
           if (indices[i] >= dataBlkNum) {
{code}

> ErasureCodingWorker may fail when recovering data blocks with length less > than the first internal block > --- > > Key: HDFS-9646 > URL: https://issues.apache.org/jira/browse/HDFS-9646 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Affects Versions: 3.0.0 >Reporter: Takuya Fukudome >Assignee: Jing Zhao >Priority: Critical > Attachments: test-reconstruct-stripe-file.patch > > > This is reported by [~tfukudom]: ErasureCodingWorker may fail with the > following exception when recovering a non-full internal block. > {code} > 2016-01-06 11:14:44,740 WARN datanode.DataNode > (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: > BP-987302662-172.29.4.13-1450757377698:blk_-92233720368 > 54322288_29751 > java.io.IOException: Transfer failed for all targets. > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block
[ https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097214#comment-15097214 ]

Jing Zhao edited comment on HDFS-9646 at 1/13/16 10:59 PM:
---

{{ErasureCodingWorker#ReconstructAndTransferBlock}} uses the length of the first internal block to decide whether to continue the recovery work:
{code}
long firstStripedBlockLength = getBlockLen(blockGroup, 0);
while (positionInBlock < firstStripedBlockLength) {
{code}
However, if we are recovering a block whose length is less than the first one (e.g., the last stripe like the following), we will run into an unnecessary iteration which generates decoded result filled with 0.

| b0 | b1 | b2 | b3 | b4 | b5 | p0 | p1 | p2 |
| 64k | 64k | 64k | 64k | | | 64k | 64k | 64k |

Then at the end of {{recoverTargets}}, we set the limit of the decoding output buffer based on the length of the block-to-be-recovered:
{code}
long blockLen = getBlockLen(blockGroup, targetIndices[i]);
long remaining = blockLen - positionInBlock;
if (remaining < 0) {
  targetBuffers[i].limit(0);
} else if (remaining < toRecoverLen) {
  targetBuffers[i].limit((int)remaining);
}
{code}
This will set the buffer limit to 0, and cause {{transferData2Targets}} to return 0.

was (Author: jingzhao):
{{ErasureCodingWorker#ReconstructAndTransferBlock}} uses the length of the first internal block to decide whether to continue the recovery work:
{code}
long firstStripedBlockLength = getBlockLen(blockGroup, 0);
while (positionInBlock < firstStripedBlockLength) {
{code}
However, if we are recovering a block whose length is less than the first one, we will run into an unnecessary iteration which generates decoded result filled with 0.
Then at the end of {{recoverTargets}}, we set the limit of the decoding output buffer based on the length of the block-to-be-recovered: {code} long blockLen = getBlockLen(blockGroup, targetIndices[i]); long remaining = blockLen - positionInBlock; if (remaining < 0) { targetBuffers[i].limit(0); } else if (remaining < toRecoverLen) { targetBuffers[i].limit((int)remaining); } {code} This will set the buffer limit to 0, and cause {{transferData2Targets}} to return 0. > ErasureCodingWorker may fail when recovering data blocks with length less > than the first internal block > --- > > Key: HDFS-9646 > URL: https://issues.apache.org/jira/browse/HDFS-9646 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Affects Versions: 3.0.0 >Reporter: Takuya Fukudome >Assignee: Jing Zhao >Priority: Critical > Attachments: test-reconstruct-stripe-file.patch > > > This is reported by [~tfukudom]: ErasureCodingWorker may fail with the > following exception when recovering a non-full internal block. > {code} > 2016-01-06 11:14:44,740 WARN datanode.DataNode > (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: > BP-987302662-172.29.4.13-1450757377698:blk_-92233720368 > 54322288_29751 > java.io.IOException: Transfer failed for all targets. > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
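The buffer-limit arithmetic described in this comment can be reproduced in isolation. The sketch below is not the ErasureCodingWorker code and not the attached patch; RecoverLimitSketch, limitFor and iterations are hypothetical names, and the 64k buffer size is taken from the example stripe in the comment. It copies the quoted limit logic onto a plain java.nio.ByteBuffer and counts loop iterations for a given loop bound, so the effect of bounding the loop by the first block's length rather than by the target block's length can be checked by hand.

```java
import java.nio.ByteBuffer;

// Hypothetical, simplified model of the logic quoted in the comment above.
public class RecoverLimitSketch {

  /**
   * Applies the quoted limit logic to a fresh buffer of toRecoverLen bytes
   * and returns the resulting limit, i.e. how many bytes would be transferred.
   */
  static int limitFor(long blockLen, long positionInBlock, int toRecoverLen) {
    ByteBuffer target = ByteBuffer.allocate(toRecoverLen);
    long remaining = blockLen - positionInBlock;
    if (remaining < 0) {
      target.limit(0);
    } else if (remaining < toRecoverLen) {
      target.limit((int) remaining);
    }
    return target.limit();
  }

  /** Counts how many times a loop of the form while (pos < loopBound) runs. */
  static int iterations(long loopBound, int bufferSize) {
    int count = 0;
    for (long pos = 0; pos < loopBound; pos += bufferSize) {
      count++;
    }
    return count;
  }
}
```

With the example stripe, the first internal block is 64k long while b4 and b5 are empty. Bounding the loop by the first block gives iterations(65536, 65536) == 1, and in that single iteration limitFor(0, 0, 65536) is 0 for an empty target block, which matches the zero-byte transfer described above. Bounding the loop by the target block's own length (an assumption for illustration, not necessarily what the patch does) would give iterations(0, 65536) == 0.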
[jira] [Updated] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block
[ https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-9646: Attachment: HDFS-9646.000.patch Upload a patch to fix the recover length calculation in ErasureCodingWorker. > ErasureCodingWorker may fail when recovering data blocks with length less > than the first internal block > --- > > Key: HDFS-9646 > URL: https://issues.apache.org/jira/browse/HDFS-9646 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Affects Versions: 3.0.0 >Reporter: Takuya Fukudome >Assignee: Jing Zhao >Priority: Critical > Attachments: HDFS-9646.000.patch, test-reconstruct-stripe-file.patch > > > This is reported by [~tfukudom]: ErasureCodingWorker may fail with the > following exception when recovering a non-full internal block. > {code} > 2016-01-06 11:14:44,740 WARN datanode.DataNode > (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: > BP-987302662-172.29.4.13-1450757377698:blk_-92233720368 > 54322288_29751 > java.io.IOException: Transfer failed for all targets. > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block
Jing Zhao created HDFS-9646: --- Summary: ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block Key: HDFS-9646 URL: https://issues.apache.org/jira/browse/HDFS-9646 Project: Hadoop HDFS Issue Type: Sub-task Components: erasure-coding Affects Versions: 3.0.0 Reporter: Takuya Fukudome Assignee: Jing Zhao Priority: Critical This is reported by [~tfukudom]: ErasureCodingWorker may fail with the following exception when recovering a non-full internal block. {code} 2016-01-06 11:14:44,740 WARN datanode.DataNode (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: BP-987302662-172.29.4.13-1450757377698:blk_-92233720368 54322288_29751 java.io.IOException: Transfer failed for all targets. at org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9612) DistCp worker threads are not terminated after jobs are done.
[ https://issues.apache.org/jira/browse/HDFS-9612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097207#comment-15097207 ]

Yongjun Zhang commented on HDFS-9612:
-

Thanks for [~jojochuang]'s work here and [~3opan] for the review. Overall the patch looks good. I have the following comments:
# About the following code
{code}
executor.shutdown();
executor.shutdownNow();
{code}
it looks a bit weird to me. Instead of calling two methods, why not just call {{executor.shutdownNow();}}?
# Agree with Zoran that separating the log4j change to a different jira would be better.
# About
{code}
try {
  work = inputQueue.take();
} catch (InterruptedException e) {
  LOG.debug("Interrupted while waiting for request from inputQueue.");
  // if interrupt is triggered by shutdown(), terminate the thread
  // otherwise, attempt to take again
  Thread.currentThread().interrupt();
  return;
}
boolean isDone = false;
while (!isDone) {
  try {
    // assume processor.processItem() is stateless
    WorkReport result = processor.processItem(work);
    outputQueue.put(result);
    isDone = true;
  } catch (InterruptedException ie) {
    LOG.debug("Could not put report into outputQueue. Retrying...");
  }
}
{code}
## The call to {{Thread.currentThread().interrupt();}} can be dropped.
## If I understand it correctly, the comment "if interrupt is triggered by shutdown(), terminate the thread; otherwise, attempt to take again" can be improved, such as "If interrupt happens when taking work out from queue, then the interrupt is likely triggered by the shutdown() call, exit the thread; if the interrupt happens while the work is being processed, go back to process the same work again."
## The message "Could not put report into outputQueue" is not accurate, since the interrupt can be triggered from within either processItem or the put operation.
## Add javadoc to this method and probably even the class itself to say that it assumes " processor.processItem() is stateless" # About the test, would you please put some comment to indicate how the test would fail and with the fix it won't fail? Thanks. > DistCp worker threads are not terminated after jobs are done. > - > > Key: HDFS-9612 > URL: https://issues.apache.org/jira/browse/HDFS-9612 > Project: Hadoop HDFS > Issue Type: Bug > Components: distcp >Affects Versions: 2.8.0 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Attachments: HDFS-9612.001.patch, HDFS-9612.002.patch, > HDFS-9612.003.patch, HDFS-9612.004.patch, HDFS-9612.005.patch, > HDFS-9612.006.patch > > > In HADOOP-11827, a producer-consumer style thread pool was introduced to > parallelize the task of listing files/directories. > We have a use case where a distcp job is run during the commit phase of a MR2 > job. However, it was found distcp does not terminate ProducerConsumer thread > pools properly. Because threads are not terminated, those MR2 jobs never > finish. > In a more typical use case where distcp is run as a standalone job, those > threads are terminated forcefully when the java process is terminated. So > these leaked threads did not become a problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
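The worker loop discussed in this review can be sketched as a small runnable example. This is not the ProducerConsumer class from the patch; WorkerSketch, Worker and demo are hypothetical names, and doubling an integer stands in for the stateless processor.processItem() call. The sketch follows the behaviour the review describes: an interrupt while waiting on the input queue ends the thread, an interrupt while an item is in flight retries the same item, and a single shutdownNow() call is used to stop the pool.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the worker pattern under review, not the patch itself.
public class WorkerSketch {

  static class Worker implements Runnable {
    private final BlockingQueue<Integer> inputQueue;
    private final BlockingQueue<Integer> outputQueue;

    Worker(BlockingQueue<Integer> inputQueue, BlockingQueue<Integer> outputQueue) {
      this.inputQueue = inputQueue;
      this.outputQueue = outputQueue;
    }

    @Override
    public void run() {
      while (true) {
        Integer work;
        try {
          work = inputQueue.take();
        } catch (InterruptedException e) {
          // Interrupted while idle: most likely shutdownNow(), so exit the thread.
          return;
        }
        boolean isDone = false;
        while (!isDone) {
          try {
            // Stand-in for a stateless processItem(); safe to repeat on retry.
            Integer result = work * 2;
            outputQueue.put(result);
            isDone = true;
          } catch (InterruptedException ie) {
            // Interrupted while the item was in flight: process the same item again.
          }
        }
      }
    }
  }

  /** Runs one item through a single worker, then stops the pool with shutdownNow(). */
  static boolean demo() throws InterruptedException {
    BlockingQueue<Integer> in = new LinkedBlockingQueue<>();
    BlockingQueue<Integer> out = new LinkedBlockingQueue<>();
    ExecutorService executor = Executors.newFixedThreadPool(1);
    executor.submit(new Worker(in, out));
    in.put(21);
    Integer result = out.poll(5, TimeUnit.SECONDS);
    // shutdownNow() interrupts the worker, which is blocked in (or about to call) take().
    executor.shutdownNow();
    boolean terminated = executor.awaitTermination(5, TimeUnit.SECONDS);
    return terminated && Integer.valueOf(42).equals(result);
  }
}
```

One consequence of this design is worth noting: an interrupt that arrives while an item is being processed is consumed by the retry, so the worker returns to take() and only exits on a later interrupt. That is why the retry is only safe when the processing step is stateless, as the review asks to be documented.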
[jira] [Commented] (HDFS-9628) libhdfs++: Implement builder apis from C bindings
[ https://issues.apache.org/jira/browse/HDFS-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097238#comment-15097238 ] James Clampffer commented on HDFS-9628: --- Looks good to me, +1. > libhdfs++: Implement builder apis from C bindings > - > > Key: HDFS-9628 > URL: https://issues.apache.org/jira/browse/HDFS-9628 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Bob Hansen >Assignee: Bob Hansen > Attachments: HDFS-9628.HDFS-8707.000.patch, > HDFS-9628.HDFS-8707.001.patch, HDFS-9628.HDFS-8707.002.patch, > HDFS-9628.HDFS-8707.003.patch, HDFS-9628.HDFS-8707.003.patch, > HDFS-9628.HDFS-8707.004.patch, HDFS-9628.HDFS-8707.005.patch, > HDFS-9628.HDFS-8707.005.patch, HDFS-9628.HDFS-8707.006.patch, > HDFS-9628.HDFS-8707.006.patch, HDFS-9628.HDFS-8707.007.patch, > HDFS-9628.HDFS-8707.008.patch, HDFS-9628.HDFS-8707.009.patch, > HDFS-9628.HDFS-8707.010.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9289) Make DataStreamer#block thread safe and verify genStamp in commitBlock
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated HDFS-9289: -- Fix Version/s: (was: 2.7.3) (was: 3.0.0) 2.7.2 > Make DataStreamer#block thread safe and verify genStamp in commitBlock > -- > > Key: HDFS-9289 > URL: https://issues.apache.org/jira/browse/HDFS-9289 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Chang Li >Assignee: Chang Li >Priority: Critical > Fix For: 2.7.2, 2.6.3 > > Attachments: HDFS-9289-branch-2.6.patch, HDFS-9289.1.patch, > HDFS-9289.2.patch, HDFS-9289.3.patch, HDFS-9289.4.patch, HDFS-9289.5.patch, > HDFS-9289.6.patch, HDFS-9289.7.patch, HDFS-9289.branch-2.7.patch, > HDFS-9289.branch-2.patch > > > We have seen a case of a corrupt block caused by a file being completed after a > pipelineUpdate, but with the old block genStamp. This > caused the replicas on two datanodes in the updated pipeline to be viewed as > corrupt. Propose to check the genStamp when committing the block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9517) Make TestDistCpUtils.testUnpackAttributes testable
[ https://issues.apache.org/jira/browse/HDFS-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097346#comment-15097346 ] Colin Patrick McCabe commented on HDFS-9517: +1. Thanks, [~jojochuang]. > Make TestDistCpUtils.testUnpackAttributes testable > -- > > Key: HDFS-9517 > URL: https://issues.apache.org/jira/browse/HDFS-9517 > Project: Hadoop HDFS > Issue Type: Bug > Components: distcp >Affects Versions: 3.0.0 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Trivial > Attachments: HDFS-9517.001.patch > > > testUnpackAttributes() test method in TestDistCpUtils does not have @Test > annotation and is not testable. > I searched around and saw no discussion it was omitted, so I assume it was > just unintentional. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9517) Fix missing @Test annotation on TestDistCpUtils.testUnpackAttributes
[ https://issues.apache.org/jira/browse/HDFS-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-9517: --- Summary: Fix missing @Test annotation on TestDistCpUtils.testUnpackAttributes (was: Make TestDistCpUtils.testUnpackAttributes testable) > Fix missing @Test annotation on TestDistCpUtils.testUnpackAttributes > > > Key: HDFS-9517 > URL: https://issues.apache.org/jira/browse/HDFS-9517 > Project: Hadoop HDFS > Issue Type: Bug > Components: distcp >Affects Versions: 3.0.0 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Trivial > Attachments: HDFS-9517.001.patch > > > testUnpackAttributes() test method in TestDistCpUtils does not have @Test > annotation and is not testable. > I searched around and saw no discussion it was omitted, so I assume it was > just unintentional. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8615) Correct HTTP method in WebHDFS document
[ https://issues.apache.org/jira/browse/HDFS-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated HDFS-8615: -- Fix Version/s: (was: 2.7.3) (was: 2.8.0) 2.7.2 Pulled this into 2.7.2 to keep the release up-to-date with 2.6.3. Changing fix-versions to reflect the same. > Correct HTTP method in WebHDFS document > --- > > Key: HDFS-8615 > URL: https://issues.apache.org/jira/browse/HDFS-8615 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 2.4.1 >Reporter: Akira AJISAKA >Assignee: Brahma Reddy Battula > Labels: newbie > Fix For: 2.7.2, 2.6.3 > > Attachments: HDFS-8615.branch-2.6.patch, HDFS-8615.patch > > > For example, {{-X PUT}} should be removed from the following curl command. > {code:title=WebHDFS.md} > ### Get ACL Status > * Submit a HTTP GET request. > curl -i -X PUT > "http://:/webhdfs/v1/?op=GETACLSTATUS" > {code} > Other than this example, there are several commands which {{-X PUT}} should > be removed from. We should fix them all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9597) BaseReplicationPolicyTest should update data node stats after adding a data node
[ https://issues.apache.org/jira/browse/HDFS-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097351#comment-15097351 ] Vinod Kumar Vavilapalli commented on HDFS-9597: --- [~benoyantony], there is a branch-2.8 where you need to land this patch for it to be in 2.8.0. > BaseReplicationPolicyTest should update data node stats after adding a data > node > > > Key: HDFS-9597 > URL: https://issues.apache.org/jira/browse/HDFS-9597 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 3.0.0, 2.8.0 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Blocker > Fix For: 2.8.0 > > Attachments: HDFS-9597.001.patch > > > Looks like HDFS-9034 broke > TestReplicationPolicyConsiderLoad#testChooseTargetWithDecomNodes. > This test has been failing since yesterday. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9466) TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky
[ https://issues.apache.org/jira/browse/HDFS-9466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097350#comment-15097350 ] Colin Patrick McCabe commented on HDFS-9466: Hmm. Can you be clearer on what the race condition is here? > TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky > > > Key: HDFS-9466 > URL: https://issues.apache.org/jira/browse/HDFS-9466 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs, hdfs-client >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Attachments: HDFS-9466.001.patch, HDFS-9466.002.patch > > > This test is flaky and fails quite frequently in trunk. > Error Message > expected:<1> but was:<2> > Stacktrace > {noformat} > java.lang.AssertionError: expected:<1> but was:<2> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache$17.accept(TestShortCircuitCache.java:636) > at > org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.visit(ShortCircuitRegistry.java:395) > at > org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.checkNumberOfSegmentsAndSlots(TestShortCircuitCache.java:631) > at > org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.testDataXceiverCleansUpSlotsOnFailure(TestShortCircuitCache.java:684) > {noformat} > Thanks to [~xiaochen] for identifying the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9493) Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk
[ https://issues.apache.org/jira/browse/HDFS-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097348#comment-15097348 ] Vinod Kumar Vavilapalli commented on HDFS-9493: --- [~eddyxu], there is a branch-2.8 where you need to land this patch for it to make it into 2.8.0. > Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk > --- > > Key: HDFS-9493 > URL: https://issues.apache.org/jira/browse/HDFS-9493 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Mingliang Liu >Assignee: Tony Wu > Fix For: 2.8.0 > > Attachments: HDFS-9493.001.patch, HDFS-9493.002.patch, > HDFS-9493.003.patch > > > Tested in both Gentoo Linux and Mac. > {quote} > --- > T E S T S > --- > Running org.apache.hadoop.hdfs.server.namenode.TestMetaSave > Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 34.159 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestMetaSave > testMetasaveAfterDelete(org.apache.hadoop.hdfs.server.namenode.TestMetaSave) > Time elapsed: 15.318 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hdfs.server.namenode.TestMetaSave.testMetasaveAfterDelete(TestMetaSave.java:154) > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9431) DistributedFileSystem#concat fails if the target path is relative.
[ https://issues.apache.org/jira/browse/HDFS-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated HDFS-9431: -- Fix Version/s: (was: 2.7.3) (was: 2.8.0) 2.7.2 Pulled this into 2.7.2 to keep the release up-to-date with 2.6.3. Changing fix-versions to reflect the same. > DistributedFileSystem#concat fails if the target path is relative. > -- > > Key: HDFS-9431 > URL: https://issues.apache.org/jira/browse/HDFS-9431 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Reporter: Kazuho Fujii >Assignee: Kazuho Fujii > Fix For: 2.7.2, 2.6.3 > > Attachments: HDFS-9431.001.patch, HDFS-9431.002.patch > > > {{DistributedFileSystem#concat}} fails if the target path is relative. > The method tries to send a relative path to DFSClient at the first call. > bq. dfs.concat(getPathName(trg), srcsStr); > But, {{getPathName}} failed. It seems that {{trg}} should be {{absF}} like > the second call. > bq. dfs.concat(getPathName(absF), srcsStr); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9647) DiskBalancer : Add getRuntimeSettings
[ https://issues.apache.org/jira/browse/HDFS-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-9647: --- Attachment: HDFS-9647-HDFS-1312.001.patch This patch depends on HDFS-9645. Attaching patch for code review purpose. > DiskBalancer : Add getRuntimeSettings > - > > Key: HDFS-9647 > URL: https://issues.apache.org/jira/browse/HDFS-9647 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Affects Versions: HDFS-1312 >Reporter: Anu Engineer >Assignee: Anu Engineer > Fix For: HDFS-1312 > > Attachments: HDFS-9647-HDFS-1312.001.patch > > > Adds an RPC to read the runtime values of disk balancer like disk bandwidth. > This is similar to getdiskbandwidth used by balancer -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block
[ https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-9646: Status: Patch Available (was: Open) > ErasureCodingWorker may fail when recovering data blocks with length less > than the first internal block > --- > > Key: HDFS-9646 > URL: https://issues.apache.org/jira/browse/HDFS-9646 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Affects Versions: 3.0.0 >Reporter: Takuya Fukudome >Assignee: Jing Zhao >Priority: Critical > Attachments: HDFS-9646.000.patch, test-reconstruct-stripe-file.patch > > > This is reported by [~tfukudom]: ErasureCodingWorker may fail with the > following exception when recovering a non-full internal block. > {code} > 2016-01-06 11:14:44,740 WARN datanode.DataNode > (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: > BP-987302662-172.29.4.13-1450757377698:blk_-92233720368 > 54322288_29751 > java.io.IOException: Transfer failed for all targets. > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9289) Make DataStreamer#block thread safe and verify genStamp in commitBlock
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097181#comment-15097181 ] Vinod Kumar Vavilapalli commented on HDFS-9289: --- Pulled this into 2.7.2 to keep the release up-to-date with 2.6.3. Changing fix-versions to reflect the same. > Make DataStreamer#block thread safe and verify genStamp in commitBlock > -- > > Key: HDFS-9289 > URL: https://issues.apache.org/jira/browse/HDFS-9289 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Chang Li >Assignee: Chang Li >Priority: Critical > Fix For: 2.7.2, 2.6.3 > > Attachments: HDFS-9289-branch-2.6.patch, HDFS-9289.1.patch, > HDFS-9289.2.patch, HDFS-9289.3.patch, HDFS-9289.4.patch, HDFS-9289.5.patch, > HDFS-9289.6.patch, HDFS-9289.7.patch, HDFS-9289.branch-2.7.patch, > HDFS-9289.branch-2.patch > > > We have seen a case of a corrupt block caused by a file being completed after a > pipelineUpdate, but with the old block genStamp. This > caused the replicas on two datanodes in the updated pipeline to be viewed as > corrupt. Propose to check the genStamp when committing the block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9631) Restarting namenode after deleting a directory with snapshot will fail
[ https://issues.apache.org/jira/browse/HDFS-9631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097203#comment-15097203 ] Wei-Chiu Chuang commented on HDFS-9631: --- I know [~yzhangal] has hit a similar issue in production. Maybe this test failure will be fixed after Yongjun finds a solution. > Restarting namenode after deleting a directory with snapshot will fail > -- > > Key: HDFS-9631 > URL: https://issues.apache.org/jira/browse/HDFS-9631 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > > I found a number of {{TestOpenFilesWithSnapshot}} tests failed quite > frequently. > {noformat} > FAILED: > org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testParentDirWithUCFileDeleteWithSnapShot > Error Message: > Timed out waiting for Mini HDFS Cluster to start > Stack Trace: > java.io.IOException: Timed out waiting for Mini HDFS Cluster to start > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1345) > at > org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2024) > at > org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1985) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testParentDirWithUCFileDeleteWithSnapShot(TestOpenFilesWithSnapshot.java:82) > {noformat} > These tests ({{testParentDirWithUCFileDeleteWithSnapshot}}, > {{testOpenFilesWithRename}}, {{testWithCheckpoint}}) are unable to reconnect > to the namenode after restart. It looks like the reconnection failed due to > an EOFException when BPServiceActor sends a heartbeat. > {noformat} > 2016-01-07 23:25:43,678 [main] WARN hdfs.MiniDFSCluster > (MiniDFSCluster.java:waitClusterUp(1338)) - Waiting for the Mini HDFS Cluster > to start... > 2016-01-07 23:25:44,679 [main] WARN hdfs.MiniDFSCluster > (MiniDFSCluster.java:waitClusterUp(1338)) - Waiting for the Mini HDFS Cluster > to start... 
> 2016-01-07 23:25:44,720 [DataNode: > [[[DISK]file:/home/weichiu/hadoop2/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/, > [DISK]file: > /home/weichiu/hadoop2/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data2/]] > heartbeating to localhost/127.0.0.1:60472] WARN datanode > .DataNode (BPServiceActor.java:offerService(752)) - IOException in > offerService > java.io.EOFException: End of File Exception between local host is: > "weichiu.vpc.cloudera.com/172.28.211.219"; destination host is: > "localhost":6047 > 2; :; For more details see: http://wiki.apache.org/hadoop/EOFException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:793) > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:766) > at org.apache.hadoop.ipc.Client.call(Client.java:1452) > at org.apache.hadoop.ipc.Client.call(Client.java:1385) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) > at com.sun.proxy.$Proxy18.sendHeartbeat(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:154) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:557) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:660) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:851) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.EOFException > at java.io.DataInputStream.readInt(DataInputStream.java:392) > at > 
org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1110) > at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1005) > {noformat} > It appears that these three tests all call {{doWriteAndAbort()}}, which > creates files and then abort, and then set the parent directory with a > snapshot, and then delete the parent directory. > Interestingly, if the parent directory does not have a snapshot, the tests > will not fail. Additionally, if the parent directory is not deleted, the > tests will not fail. > The following test will fail intermittently: > {code:java} > public void testDeleteParentDirWithSnapShot() throws
[jira] [Created] (HDFS-9647) DiskBalancer : Add getRuntimeSettings
Anu Engineer created HDFS-9647: -- Summary: DiskBalancer : Add getRuntimeSettings Key: HDFS-9647 URL: https://issues.apache.org/jira/browse/HDFS-9647 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer & mover Affects Versions: HDFS-1312 Reporter: Anu Engineer Assignee: Anu Engineer Fix For: HDFS-1312 Adds an RPC to read the runtime values of the disk balancer, such as disk bandwidth. This is similar to getdiskbandwidth used by the balancer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block
[ https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097214#comment-15097214 ] Jing Zhao commented on HDFS-9646: - {{ErasureCodingWorker#ReconstructAndTransferBlock}} uses the length of the first internal block to decide whether to continue the recovery work: {code} long firstStripedBlockLength = getBlockLen(blockGroup, 0); while (positionInBlock < firstStripedBlockLength) { {code} However, if we are recovering a block whose length is less than the first one, we will run into an unnecessary iteration which generates decoded result filled with 0. Then at the end of {{recoverTargets}}, we set the limit of the decoding output buffer based on the length of the block-to-be-recovered: {code} long blockLen = getBlockLen(blockGroup, targetIndices[i]); long remaining = blockLen - positionInBlock; if (remaining < 0) { targetBuffers[i].limit(0); } else if (remaining < toRecoverLen) { targetBuffers[i].limit((int)remaining); } {code} This will set the buffer limit to 0, and cause {{transferData2Targets}} to return 0. > ErasureCodingWorker may fail when recovering data blocks with length less > than the first internal block > --- > > Key: HDFS-9646 > URL: https://issues.apache.org/jira/browse/HDFS-9646 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Affects Versions: 3.0.0 >Reporter: Takuya Fukudome >Assignee: Jing Zhao >Priority: Critical > Attachments: test-reconstruct-stripe-file.patch > > > This is reported by [~tfukudom]: ErasureCodingWorker may fail with the > following exception when recovering a non-full internal block. > {code} > 2016-01-06 11:14:44,740 WARN datanode.DataNode > (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: > BP-987302662-172.29.4.13-1450757377698:blk_-92233720368 > 54322288_29751 > java.io.IOException: Transfer failed for all targets. 
> at > org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
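The effect described in the comment above can be reproduced in isolation. The following is only a sketch under stated assumptions: the class `ReconstructionLoopSketch` and its methods `applyLimit` and `run` are hypothetical names, not part of {{ErasureCodingWorker}}; the limit logic is copied from the snippet quoted in the comment, and a fresh buffer of size `bufferSize` stands in for the decoding output buffer. Bounding the loop by the target block length instead of the first block length is shown only as one possible way to avoid the empty iteration, not as the content of the actual patch.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone model of the recovery loop; not Hadoop code.
class ReconstructionLoopSketch {

  // Applies the limit logic quoted in the comment to one target buffer and
  // returns how many bytes would be left to transfer.
  static int applyLimit(ByteBuffer target, long blockLen,
                        long positionInBlock, int toRecoverLen) {
    long remaining = blockLen - positionInBlock;
    if (remaining < 0) {
      target.limit(0);
    } else if (remaining < toRecoverLen) {
      target.limit((int) remaining);
    }
    return target.remaining();
  }

  // Runs the loop up to loopBound and records the transferable bytes
  // of a single target block in each iteration.
  static List<Integer> run(long loopBound, long targetBlockLen, int bufferSize) {
    List<Integer> transferred = new ArrayList<>();
    long positionInBlock = 0;
    while (positionInBlock < loopBound) {
      // Fresh buffer: position 0, limit == capacity.
      ByteBuffer target = ByteBuffer.allocate(bufferSize);
      transferred.add(applyLimit(target, targetBlockLen, positionInBlock, bufferSize));
      positionInBlock += bufferSize;
    }
    return transferred;
  }

  public static void main(String[] args) {
    int buf = 64 * 1024;
    long firstBlockLen = 3L * buf;   // first internal block
    long targetBlockLen = 2L * buf;  // shorter block being recovered

    // Loop bounded by the first block: the last iteration transfers 0 bytes.
    System.out.println(run(firstBlockLen, targetBlockLen, buf));  // [65536, 65536, 0]

    // Loop bounded by the target block: no empty iteration.
    System.out.println(run(targetBlockLen, targetBlockLen, buf)); // [65536, 65536]
  }
}
```

In this model the third iteration is the one that would make a transfer step report 0 bytes for the only target, which matches the "Transfer failed for all targets" symptom in the report.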
[jira] [Commented] (HDFS-8999) Namenode need not wait for {{blockReceived}} for the last block before completing a file.
[ https://issues.apache.org/jira/browse/HDFS-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097320#comment-15097320 ] Hadoop QA commented on HDFS-8999: -
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 56s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 5s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 49s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 14s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 27s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 36s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 36s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 33s {color} | {color:red} Patch generated 7 new checkstyle issues in hadoop-hdfs-project (total was 1043, now 1044). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 2s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 7s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 169m 26s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 57s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 168m 56s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 30s {color} | {color:red} Patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 381m 34s {color} |
[jira] [Commented] (HDFS-9493) Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk
[ https://issues.apache.org/jira/browse/HDFS-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097368#comment-15097368 ] Lei (Eddy) Xu commented on HDFS-9493: - [~vinodkv] Thanks for reminding me! Cherry picked it into {{branch-2.8}} now. > Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk > --- > > Key: HDFS-9493 > URL: https://issues.apache.org/jira/browse/HDFS-9493 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Mingliang Liu >Assignee: Tony Wu > Fix For: 2.8.0 > > Attachments: HDFS-9493.001.patch, HDFS-9493.002.patch, > HDFS-9493.003.patch > > > Tested in both Gentoo Linux and Mac. > {quote} > --- > T E S T S > --- > Running org.apache.hadoop.hdfs.server.namenode.TestMetaSave > Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 34.159 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestMetaSave > testMetasaveAfterDelete(org.apache.hadoop.hdfs.server.namenode.TestMetaSave) > Time elapsed: 15.318 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hdfs.server.namenode.TestMetaSave.testMetasaveAfterDelete(TestMetaSave.java:154) > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block
[ https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097443#comment-15097443 ] Kai Zheng commented on HDFS-9646: - Hi [~jingzhao], The patch looks great! I'm reading it and the related code. So far I have one question: the current code probably assumes that {{maxTargetLength}}, in your sense, is exactly the length of the first block in the group, i.e. {{firstStripedBlockLength = getBlockLen(blockGroup, 0)}}. If so, I thought that assumption would be correct. Maybe {{getBlockLen}} doesn't return the exact length of the first block, as one might expect it to? > ErasureCodingWorker may fail when recovering data blocks with length less > than the first internal block > --- > > Key: HDFS-9646 > URL: https://issues.apache.org/jira/browse/HDFS-9646 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Affects Versions: 3.0.0 >Reporter: Takuya Fukudome >Assignee: Jing Zhao >Priority: Critical > Attachments: HDFS-9646.000.patch, test-reconstruct-stripe-file.patch > > > This is reported by [~tfukudom]: ErasureCodingWorker may fail with the > following exception when recovering a non-full internal block. > {code} > 2016-01-06 11:14:44,740 WARN datanode.DataNode > (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: > BP-987302662-172.29.4.13-1450757377698:blk_-92233720368 > 54322288_29751 > java.io.IOException: Transfer failed for all targets. > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block
[ https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097523#comment-15097523 ] Kai Zheng commented on HDFS-9646: - The patch is a great fix along with good refactorings. Some comments:
1. It's good to refactor and avoid the duplicated code and computation around {{getReadLength}}. Minor points: 1) it would be good to explicitly initialize {{positionInBlock}} to 0 at the beginning of the {{run}} method; 2) {{toRecover}} would be better kept under its original name {{toRecoverLen}}; 3) {{success}} could be {{successList}}.
2. In the test, introducing {{RecoveryType}} is nice. Suggestion: change {{Any}} to {{Both}}, with the logic for it being to generate dead blocks among both data blocks and parity blocks, so that the test would be much more thorough. A minor point: {{toDead}} could be {{toDie}}.
3. Question: do we need new test code to expose the issue and ensure that it is fixed? I'm not sure about this, because the existing tests already cover all sorts of file lengths but may lack the right one for the reported case as you described above (the max length of the targeted blocks should be smaller than the first block).
> ErasureCodingWorker may fail when recovering data blocks with length less > than the first internal block > --- > > Key: HDFS-9646 > URL: https://issues.apache.org/jira/browse/HDFS-9646 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Affects Versions: 3.0.0 >Reporter: Takuya Fukudome >Assignee: Jing Zhao >Priority: Critical > Attachments: HDFS-9646.000.patch, test-reconstruct-stripe-file.patch > > > This is reported by [~tfukudom]: ErasureCodingWorker may fail with the > following exception when recovering a non-full internal block. > {code} > 2016-01-06 11:14:44,740 WARN datanode.DataNode > (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: > BP-987302662-172.29.4.13-1450757377698:blk_-92233720368 > 54322288_29751 > java.io.IOException: Transfer failed for all targets. 
> at > org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block
[ https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097538#comment-15097538 ] Hadoop QA commented on HDFS-9646: -
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 48s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 50s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 50m 52s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 49m 27s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 126m 32s {color} | {color:black} {color} |
\\ \\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure210 |
| | hadoop.hdfs.server.namenode.TestNNThroughputBenchmark |
| | hadoop.hdfs.TestReadStripedFileWithDecoding |
| | hadoop.hdfs.server.namenode.TestStartup |
| | hadoop.hdfs.server.datanode.fsdataset.impl.TestScrLazyPersistFiles |
| JDK v1.7.0_91 Failed junit tests | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
| | hadoop.hdfs.TestFSInputChecker |
| | hadoop.hdfs.TestReadStripedFileWithDecoding |
| | hadoop.hdfs.server.namenode.TestStartup |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL |
[jira] [Updated] (HDFS-9517) Fix missing @Test annotation on TestDistCpUtils.testUnpackAttributes
[ https://issues.apache.org/jira/browse/HDFS-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-9517: --- Resolution: Fixed Fix Version/s: 2.9.0 Target Version/s: 2.9.0 Status: Resolved (was: Patch Available) > Fix missing @Test annotation on TestDistCpUtils.testUnpackAttributes > > > Key: HDFS-9517 > URL: https://issues.apache.org/jira/browse/HDFS-9517 > Project: Hadoop HDFS > Issue Type: Bug > Components: distcp >Affects Versions: 3.0.0 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Trivial > Fix For: 2.9.0 > > Attachments: HDFS-9517.001.patch > > > testUnpackAttributes() test method in TestDistCpUtils does not have @Test > annotation and is not testable. > I searched around and saw no discussion it was omitted, so I assume it was > just unintentional. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block
[ https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097447#comment-15097447 ] Kai Zheng commented on HDFS-9646: - Oh, I got your point. You want to read and recover only up to the max length of the {{target}} blocks being recovered. This sounds like a good optimization in addition to the fix. > ErasureCodingWorker may fail when recovering data blocks with length less > than the first internal block > --- > > Key: HDFS-9646 > URL: https://issues.apache.org/jira/browse/HDFS-9646 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Affects Versions: 3.0.0 >Reporter: Takuya Fukudome >Assignee: Jing Zhao >Priority: Critical > Attachments: HDFS-9646.000.patch, test-reconstruct-stripe-file.patch > > > This is reported by [~tfukudom]: ErasureCodingWorker may fail with the > following exception when recovering a non-full internal block. > {code} > 2016-01-06 11:14:44,740 WARN datanode.DataNode > (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: > BP-987302662-172.29.4.13-1450757377698:blk_-92233720368 > 54322288_29751 > java.io.IOException: Transfer failed for all targets. > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9595) DiskBalancer : Add cancelPlan RPC
[ https://issues.apache.org/jira/browse/HDFS-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097454#comment-15097454 ] Arpit Agarwal commented on HDFS-9595: - The patch looks great. Nitpick typo - _Cancels and executing disk balancer plan_ should be _Cancels an executing disk balancer plan_. +1 otherwise. > DiskBalancer : Add cancelPlan RPC > - > > Key: HDFS-9595 > URL: https://issues.apache.org/jira/browse/HDFS-9595 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Affects Versions: HDFS-1312 >Reporter: Anu Engineer >Assignee: Anu Engineer > Fix For: HDFS-1312 > > Attachments: HDFS-9595-HDFS-1312.001.patch > > > Add an RPC that allows users to cancel a running disk balancer plan -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8999) Namenode need not wait for {{blockReceived}} for the last block before completing a file.
[ https://issues.apache.org/jira/browse/HDFS-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8999: -- Attachment: h8999_20160114.patch Oops, the index calculated was incorrect in the last patch. h8999_20160114.patch: fixes the bug. > Namenode need not wait for {{blockReceived}} for the last block before > completing a file. > - > > Key: HDFS-8999 > URL: https://issues.apache.org/jira/browse/HDFS-8999 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Jitendra Nath Pandey >Assignee: Tsz Wo Nicholas Sze > Attachments: h8999_20151228.patch, h8999_20160106.patch, > h8999_20160106b.patch, h8999_20160106c.patch, h8999_20160111.patch, > h8999_20160113.patch, h8999_20160114.patch > > > This comes out of a discussion in HDFS-8763. Pasting [~jingzhao]'s comment > from the jira: > {quote} > ...whether we need to let NameNode wait for all the block_received msgs to > announce the replica is safe. Looking into the code, now we have ># NameNode knows the DataNodes involved when initially setting up the > writing pipeline ># If any DataNode fails during the writing, client bumps the GS and > finally reports all the DataNodes included in the new pipeline to NameNode > through the updatePipeline RPC. ># When the client received the ack for the last packet of the block (and > before the client tries to close the file on NameNode), the replica has been > finalized in all the DataNodes. > Then in this case, when NameNode receives the close request from the client, > the NameNode already knows the latest replicas for the block. Currently the > checkReplication call only counts in all the replicas that NN has already > received the block_received msg, but based on the above #2 and #3, it may be > safe to also count in all the replicas in the > BlockUnderConstructionFeature#replicas? > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block
[ https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097511#comment-15097511 ] Tsz Wo Nicholas Sze commented on HDFS-9646: --- Thanks [~tfukudom] and [~jingzhao]. Patch looks good. Just a minor comment. - getReadLength can safely return int since it returns the min of remaining and recoverLength, where recoverLength is an int. > ErasureCodingWorker may fail when recovering data blocks with length less > than the first internal block > --- > > Key: HDFS-9646 > URL: https://issues.apache.org/jira/browse/HDFS-9646 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Affects Versions: 3.0.0 >Reporter: Takuya Fukudome >Assignee: Jing Zhao >Priority: Critical > Attachments: HDFS-9646.000.patch, test-reconstruct-stripe-file.patch > > > This is reported by [~tfukudom]: ErasureCodingWorker may fail with the > following exception when recovering a non-full internal block. > {code} > 2016-01-06 11:14:44,740 WARN datanode.DataNode > (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: > BP-987302662-172.29.4.13-1450757377698:blk_-92233720368 > 54322288_29751 > java.io.IOException: Transfer failed for all targets. > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
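The review comment above argues that the result of {{getReadLength}} always fits in an int. A minimal sketch of that reasoning follows; the class name, the parameter names and the clamp to zero are assumptions made here for illustration, since the actual method body in the patch is not shown in this thread.

```java
// Hypothetical illustration of the int-return argument; not the patch's code.
class ReadLengthSketch {

  // Returns min(remaining, recoverLength) as an int.
  // The cast is safe because the minimum can never exceed recoverLength,
  // which is itself an int. Clamping negative values to 0 is an extra
  // assumption of this sketch, not something stated in the review.
  static int getReadLength(long blockLen, long positionInBlock, int recoverLength) {
    long remaining = Math.max(0L, blockLen - positionInBlock);
    return (int) Math.min(remaining, (long) recoverLength);
  }

  public static void main(String[] args) {
    // Fewer bytes remain than one recovery unit.
    System.out.println(getReadLength(100L, 90L, 64));             // 10
    // A block longer than Integer.MAX_VALUE still yields an int-sized result.
    System.out.println(getReadLength(5_000_000_000L, 0L, 65536)); // 65536
  }
}
```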
[jira] [Updated] (HDFS-9624) DataNode start slowly due to the initial DU command operations
[ https://issues.apache.org/jira/browse/HDFS-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Yiqun updated HDFS-9624: Attachment: HDFS-9624.006.patch Sorry, Wang, I misunderstood what you meant by timer injection. I now use the fakeTimer in the test case, and the test finishes in milliseconds instead of waiting several seconds. Updated the latest patch. > DataNode start slowly due to the initial DU command operations > -- > > Key: HDFS-9624 > URL: https://issues.apache.org/jira/browse/HDFS-9624 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: HDFS-9624.001.patch, HDFS-9624.002.patch, > HDFS-9624.003.patch, HDFS-9624.004.patch, HDFS-9624.005.patch, > HDFS-9624.006.patch > > > The datanodes seem to start very slowly after I finished migrating the > datanodes and restarted them. I looked at the dn logs: > {code} > 2016-01-06 16:05:08,118 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added > new volume: DS-70097061-42f8-4c33-ac27-2a6ca21e60d4 > 2016-01-06 16:05:08,118 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added > volume - /home/data/data/hadoop/dfs/data/data12/current, StorageType: DISK > 2016-01-06 16:05:08,176 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: > Registered FSDatasetState MBean > 2016-01-06 16:05:08,177 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Adding > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 > 2016-01-06 16:05:08,178 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data2/current... 
> 2016-01-06 16:05:08,179 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data3/current... > 2016-01-06 16:05:08,179 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data4/current... > 2016-01-06 16:05:08,179 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data5/current... > 2016-01-06 16:05:08,180 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data6/current... > 2016-01-06 16:05:08,180 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data7/current... > 2016-01-06 16:05:08,180 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data8/current... > 2016-01-06 16:05:08,180 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data9/current... > 2016-01-06 16:05:08,181 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data10/current... 
> 2016-01-06 16:05:08,181 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data11/current... > 2016-01-06 16:05:08,181 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data12/current... > 2016-01-06 16:09:49,646 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time > taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on > /home/data/data/hadoop/dfs/data/data7/current: 281466ms > 2016-01-06 16:09:54,235 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time > taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on > /home/data/data/hadoop/dfs/data/data9/current: 286054ms > 2016-01-06 16:09:57,859 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time > taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on > /home/data/data/hadoop/dfs/data/data2/current: 289680ms > 2016-01-06
[jira] [Commented] (HDFS-9517) Fix missing @Test annotation on TestDistCpUtils.testUnpackAttributes
[ https://issues.apache.org/jira/browse/HDFS-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097406#comment-15097406 ] Hudson commented on HDFS-9517: -- FAILURE: Integrated in Hadoop-trunk-Commit #9104 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9104/]) HDFS-9517. Fix missing @Test annotation on (cmccabe: rev 8315582c4ff2951144b096c23a64e753f397572d) * hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/util/TestDistCpUtils.java * hadoop-common-project/hadoop-common/CHANGES.txt > Fix missing @Test annotation on TestDistCpUtils.testUnpackAttributes > > > Key: HDFS-9517 > URL: https://issues.apache.org/jira/browse/HDFS-9517 > Project: Hadoop HDFS > Issue Type: Bug > Components: distcp >Affects Versions: 3.0.0 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Trivial > Fix For: 2.9.0 > > Attachments: HDFS-9517.001.patch > > > testUnpackAttributes() test method in TestDistCpUtils does not have @Test > annotation and is not testable. > I searched around and saw no discussion it was omitted, so I assume it was > just unintentional. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block
[ https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097502#comment-15097502 ] Kai Sasaki commented on HDFS-9646: -- Hello, [~jingzhao]. BTW, the patch seems to include the fix for HDFS-9585. Do you think we can close HDFS-9585 after fixing this JIRA? https://issues.apache.org/jira/browse/HDFS-9585 > ErasureCodingWorker may fail when recovering data blocks with length less > than the first internal block > --- > > Key: HDFS-9646 > URL: https://issues.apache.org/jira/browse/HDFS-9646 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Affects Versions: 3.0.0 >Reporter: Takuya Fukudome >Assignee: Jing Zhao >Priority: Critical > Attachments: HDFS-9646.000.patch, test-reconstruct-stripe-file.patch > > > This is reported by [~tfukudom]: ErasureCodingWorker may fail with the > following exception when recovering a non-full internal block. > {code} > 2016-01-06 11:14:44,740 WARN datanode.DataNode > (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: > BP-987302662-172.29.4.13-1450757377698:blk_-9223372036854322288_29751 > java.io.IOException: Transfer failed for all targets. > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
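For readers unfamiliar with why internal blocks in one block group can differ in length, here is a minimal sketch. It assumes the usual striped layout, in which cells of a fixed size are written round-robin across the data blocks and each parity block is as long as the first data block. The class and method names are hypothetical and the arithmetic is an illustration under that assumption, not the code touched by the patch.

```java
class StripedLengthSketch {
  /**
   * Length of internal block {@code index} in a block group holding
   * {@code groupSize} bytes of data (assumed round-robin cell layout).
   * Indexes >= numDataBlocks are treated as parity blocks.
   */
  static long internalBlockLength(long groupSize, int cellSize,
                                  int numDataBlocks, int index) {
    long stripeSize = (long) cellSize * numDataBlocks;
    long fullStripes = groupSize / stripeSize;   // every block gets one cell per full stripe
    long lastStripeLen = groupSize % stripeSize; // bytes in the trailing partial stripe

    // Assumption: a parity block is as long as the first data block.
    int i = index < numDataBlocks ? index : 0;

    // Bytes of the partial stripe that fall into block i's cell, clamped to [0, cellSize].
    long remaining = lastStripeLen - (long) i * cellSize;
    long lastCell = Math.max(0, Math.min(cellSize, remaining));
    return fullStripes * cellSize + lastCell;
  }

  public static void main(String[] args) {
    // 14 bytes, 4-byte cells, 3 data blocks: one full stripe plus 2 trailing bytes.
    for (int i = 0; i < 4; i++) {
      System.out.println("block " + i + ": " + internalBlockLength(14, 4, 3, i));
    }
  }
}
```

With these toy numbers the first data block holds 6 bytes while the second and third hold 4, so a reconstruction that sizes its reads and writes after the first internal block would overshoot when the block being recovered is one of the shorter ones. That is the kind of case the issue title describes.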
[jira] [Commented] (HDFS-9635) Add one more volume choosing policy with considering volume IO load
[ https://issues.apache.org/jira/browse/HDFS-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097541#comment-15097541 ] Kai Zheng commented on HDFS-9635: - Just in case it helps: regarding volume choosing policies with multiple storage types, there are some optimizations in HDFS-9608. I wonder if we could consolidate all these inputs, thoughts and efforts to come up with a comprehensive policy that allows various kinds of configuration and tuning. After all, with several such policies in the list, users may find it hard to choose. > Add one more volume choosing policy with considering volume IO load > --- > > Key: HDFS-9635 > URL: https://issues.apache.org/jira/browse/HDFS-9635 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Yong Zhang >Assignee: Yong Zhang > > We have RoundRobinVolumeChoosingPolicy and > AvailableSpaceVolumeChoosingPolicy, but neither considers volume IO load. > This jira will add one more volume choosing policy based on the xceiver > count on each volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9624) DataNode start slowly due to the initial DU command operations
[ https://issues.apache.org/jira/browse/HDFS-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Yiqun updated HDFS-9624: Attachment: HDFS-9624.005.patch Thanks [~andrew.wang] for the comments. I found that timer injection is not very convenient in this test, so I create block files to consume time in the test instead, and have updated the patch. > DataNode start slowly due to the initial DU command operations > -- > > Key: HDFS-9624 > URL: https://issues.apache.org/jira/browse/HDFS-9624 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: HDFS-9624.001.patch, HDFS-9624.002.patch, > HDFS-9624.003.patch, HDFS-9624.004.patch, HDFS-9624.005.patch > > > The datanode seems to start very slowly when I finish migrating > datanodes and restart them. I looked at the dn logs: > {code} > 2016-01-06 16:05:08,118 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added > new volume: DS-70097061-42f8-4c33-ac27-2a6ca21e60d4 > 2016-01-06 16:05:08,118 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added > volume - /home/data/data/hadoop/dfs/data/data12/current, StorageType: DISK > 2016-01-06 16:05:08,176 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: > Registered FSDatasetState MBean > 2016-01-06 16:05:08,177 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Adding > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 > 2016-01-06 16:05:08,178 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data2/current... > 2016-01-06 16:05:08,179 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data3/current... 
> 2016-01-06 16:05:08,179 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data4/current... > 2016-01-06 16:05:08,179 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data5/current... > 2016-01-06 16:05:08,180 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data6/current... > 2016-01-06 16:05:08,180 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data7/current... > 2016-01-06 16:05:08,180 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data8/current... > 2016-01-06 16:05:08,180 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data9/current... > 2016-01-06 16:05:08,181 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data10/current... > 2016-01-06 16:05:08,181 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data11/current... 
> 2016-01-06 16:05:08,181 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data12/current... > 2016-01-06 16:09:49,646 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time > taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on > /home/data/data/hadoop/dfs/data/data7/current: 281466ms > 2016-01-06 16:09:54,235 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time > taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on > /home/data/data/hadoop/dfs/data/data9/current: 286054ms > 2016-01-06 16:09:57,859 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time > taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on > /home/data/data/hadoop/dfs/data/data2/current: 289680ms > 2016-01-06 16:10:00,333 INFO >
[jira] [Updated] (HDFS-9628) libhdfs++: Implement builder apis from C bindings
[ https://issues.apache.org/jira/browse/HDFS-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bob Hansen updated HDFS-9628: - Attachment: HDFS-9628.HDFS-8707.009.patch New patch: rebased on latest HDFS-8707 > libhdfs++: Implement builder apis from C bindings > - > > Key: HDFS-9628 > URL: https://issues.apache.org/jira/browse/HDFS-9628 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Bob Hansen >Assignee: Bob Hansen > Attachments: HDFS-9628.HDFS-8707.000.patch, > HDFS-9628.HDFS-8707.001.patch, HDFS-9628.HDFS-8707.002.patch, > HDFS-9628.HDFS-8707.003.patch, HDFS-9628.HDFS-8707.003.patch, > HDFS-9628.HDFS-8707.004.patch, HDFS-9628.HDFS-8707.005.patch, > HDFS-9628.HDFS-8707.005.patch, HDFS-9628.HDFS-8707.006.patch, > HDFS-9628.HDFS-8707.006.patch, HDFS-9628.HDFS-8707.007.patch, > HDFS-9628.HDFS-8707.008.patch, HDFS-9628.HDFS-8707.009.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9636) libhdfs++: for consistency, include files should be in hdfspp
[ https://issues.apache.org/jira/browse/HDFS-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-9636: -- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to HDFS-8707. Thanks for the patch Bob! > libhdfs++: for consistency, include files should be in hdfspp > - > > Key: HDFS-9636 > URL: https://issues.apache.org/jira/browse/HDFS-9636 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Bob Hansen >Assignee: Bob Hansen > Attachments: HDFS-9636.HDFS-8707.000.patch, > HDFS-9636.HDFS-8707.001.patch, HDFS-9636.HDFS-8707.001.patch > > > The existing hdfs library resides in hdfs/hdfs.h. To maintain Least > Astonishment, we should move the libhdfspp files into hdfspp/hdfspp.h > (they're currently in the libhdfspp/ directory). > Likewise, the install step in the root directory should put the include files > in /include/hdfspp and include/hdfs (it currently erroneously puts the hdfs > file into libhdfs/) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9624) DataNode start slowly due to the initial DU command operations
[ https://issues.apache.org/jira/browse/HDFS-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096432#comment-15096432 ] Hadoop QA commented on HDFS-9624: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s {color} | {color:red} HDFS-9624 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12782076/HDFS-9624.005.patch | | JIRA Issue | HDFS-9624 | | Powered by | Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/14110/console | This message was automatically generated. > DataNode start slowly due to the initial DU command operations > -- > > Key: HDFS-9624 > URL: https://issues.apache.org/jira/browse/HDFS-9624 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: HDFS-9624.001.patch, HDFS-9624.002.patch, > HDFS-9624.003.patch, HDFS-9624.004.patch, HDFS-9624.005.patch > > > The datanode seems to start very slowly when I finish migrating > datanodes and restart them. I looked at the dn logs: > {code} > 2016-01-06 16:05:08,118 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added > new volume: DS-70097061-42f8-4c33-ac27-2a6ca21e60d4 > 2016-01-06 16:05:08,118 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added > volume - /home/data/data/hadoop/dfs/data/data12/current, StorageType: DISK > 2016-01-06 16:05:08,176 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: > Registered FSDatasetState MBean > 2016-01-06 16:05:08,177 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Adding > block pool 
BP-1942012336-xx.xx.xx.xx-1406726500544 > 2016-01-06 16:05:08,178 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data2/current... > 2016-01-06 16:05:08,179 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data3/current... > 2016-01-06 16:05:08,179 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data4/current... > 2016-01-06 16:05:08,179 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data5/current... > 2016-01-06 16:05:08,180 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data6/current... > 2016-01-06 16:05:08,180 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data7/current... > 2016-01-06 16:05:08,180 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data8/current... > 2016-01-06 16:05:08,180 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data9/current... 
> 2016-01-06 16:05:08,181 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data10/current... > 2016-01-06 16:05:08,181 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data11/current... > 2016-01-06 16:05:08,181 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning > block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume > /home/data/data/hadoop/dfs/data/data12/current... > 2016-01-06 16:09:49,646 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time > taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on > /home/data/data/hadoop/dfs/data/data7/current: 281466ms > 2016-01-06
[jira] [Updated] (HDFS-9628) libhdfs++: Implement builder apis from C bindings
[ https://issues.apache.org/jira/browse/HDFS-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bob Hansen updated HDFS-9628: - Attachment: HDFS-9628.HDFS-8707.008.patch New patch: fixed hdfs_builder_test main function > libhdfs++: Implement builder apis from C bindings > - > > Key: HDFS-9628 > URL: https://issues.apache.org/jira/browse/HDFS-9628 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Bob Hansen >Assignee: Bob Hansen > Attachments: HDFS-9628.HDFS-8707.000.patch, > HDFS-9628.HDFS-8707.001.patch, HDFS-9628.HDFS-8707.002.patch, > HDFS-9628.HDFS-8707.003.patch, HDFS-9628.HDFS-8707.003.patch, > HDFS-9628.HDFS-8707.004.patch, HDFS-9628.HDFS-8707.005.patch, > HDFS-9628.HDFS-8707.005.patch, HDFS-9628.HDFS-8707.006.patch, > HDFS-9628.HDFS-8707.006.patch, HDFS-9628.HDFS-8707.007.patch, > HDFS-9628.HDFS-8707.008.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9628) libhdfs++: Implement builder apis from C bindings
[ https://issues.apache.org/jira/browse/HDFS-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bob Hansen updated HDFS-9628: - Attachment: HDFS-9628.HDFS-8707.010.patch New patch: catch up new code with rebase > libhdfs++: Implement builder apis from C bindings > - > > Key: HDFS-9628 > URL: https://issues.apache.org/jira/browse/HDFS-9628 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Bob Hansen >Assignee: Bob Hansen > Attachments: HDFS-9628.HDFS-8707.000.patch, > HDFS-9628.HDFS-8707.001.patch, HDFS-9628.HDFS-8707.002.patch, > HDFS-9628.HDFS-8707.003.patch, HDFS-9628.HDFS-8707.003.patch, > HDFS-9628.HDFS-8707.004.patch, HDFS-9628.HDFS-8707.005.patch, > HDFS-9628.HDFS-8707.005.patch, HDFS-9628.HDFS-8707.006.patch, > HDFS-9628.HDFS-8707.006.patch, HDFS-9628.HDFS-8707.007.patch, > HDFS-9628.HDFS-8707.008.patch, HDFS-9628.HDFS-8707.009.patch, > HDFS-9628.HDFS-8707.010.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9635) Add one more volume choosing policy with considering volume IO load
[ https://issues.apache.org/jira/browse/HDFS-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096240#comment-15096240 ] Yong Zhang commented on HDFS-9635: -- Hi [~andrew.wang], thanks for your comment. BlockPlacementPolicy considers DataNode write load, but the current VolumeChoosingPolicy does not consider write load on disk. Some of our customers have seen some disks busy with IO while others in the same DataNode are idle, so we want to balance the writing threads across disks of the same storage type. A DataXceiver monitor and some metrics will be added, and the new VolumeChoosingPolicy will choose the idle disk. > Add one more volume choosing policy with considering volume IO load > --- > > Key: HDFS-9635 > URL: https://issues.apache.org/jira/browse/HDFS-9635 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Yong Zhang >Assignee: Yong Zhang > > We have RoundRobinVolumeChoosingPolicy and > AvailableSpaceVolumeChoosingPolicy, but neither considers volume IO load. > This jira will add one more volume choosing policy based on the xceiver > count on each volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
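The idea in the comment above can be sketched as follows: among the volumes that still have room for the block, pick the one with the fewest active writers. This is only an illustration under stated assumptions. The `Volume` class, its fields and `chooseLeastLoaded` are hypothetical names, the per-volume xceiver count is assumed to be supplied by the monitor the comment proposes, and none of this is the actual `VolumeChoosingPolicy` interface or the patch.

```java
import java.util.Arrays;
import java.util.List;

class VolumeLoadSketch {
  // Hypothetical snapshot of one volume: name, free bytes, active xceiver (writer) count.
  static final class Volume {
    final String name;
    final long availableBytes;
    final int activeXceivers;

    Volume(String name, long availableBytes, int activeXceivers) {
      this.name = name;
      this.availableBytes = availableBytes;
      this.activeXceivers = activeXceivers;
    }
  }

  // Picks the volume with the fewest active xceivers among those that can hold the block.
  // Ties go to the volume that appears first in the list.
  static Volume chooseLeastLoaded(List<Volume> volumes, long blockSize) {
    Volume best = null;
    for (Volume v : volumes) {
      if (v.availableBytes < blockSize) {
        continue; // not enough space, regardless of load
      }
      if (best == null || v.activeXceivers < best.activeXceivers) {
        best = v;
      }
    }
    if (best == null) {
      throw new IllegalStateException("no volume has " + blockSize + " bytes available");
    }
    return best;
  }

  public static void main(String[] args) {
    List<Volume> volumes = Arrays.asList(
        new Volume("data1", 100, 5),
        new Volume("data2", 100, 1),
        new Volume("data3", 10, 0)); // idle but too small for the block
    System.out.println(chooseLeastLoaded(volumes, 50).name);
  }
}
```

A real policy would also have to decide how fresh the load counts need to be and how to combine load with free space, which is part of what the thread is debating.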
[jira] [Commented] (HDFS-9635) Add one more volume choosing policy with considering volume IO load
[ https://issues.apache.org/jira/browse/HDFS-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096274#comment-15096274 ] Yong Zhang commented on HDFS-9635: -- As people discussed in HDFS-8538, and especially as you mentioned in https://issues.apache.org/jira/browse/HDFS-8538?focusedCommentId=14574914=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14574914, IMO AvailableSpaceVolumeChoosingPolicy is fine for the scenario of disks with different free space or capacity. We also use AvailableSpaceVolumeChoosingPolicy by default. By the way, I don't think collecting OS-level IO statistics is a good idea. On heterogeneous machines, write performance depends not only on disk IO but also on CPU, network bandwidth, hardware age and so on. So I think we can collect data write delay metrics for both BlockPlacementPolicy and VolumeChoosingPolicy in further work. This would be useful for multi-tenant clusters. > Add one more volume choosing policy with considering volume IO load > --- > > Key: HDFS-9635 > URL: https://issues.apache.org/jira/browse/HDFS-9635 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Yong Zhang >Assignee: Yong Zhang > > We have RoundRobinVolumeChoosingPolicy and > AvailableSpaceVolumeChoosingPolicy, but neither considers volume IO load. > This jira will add one more volume choosing policy based on the xceiver > count on each volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9628) libhdfs++: Implement builder apis from C bindings
[ https://issues.apache.org/jira/browse/HDFS-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096553#comment-15096553 ] Hadoop QA commented on HDFS-9628: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 41s {color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 3s {color} | {color:green} HDFS-8707 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 0s {color} | {color:green} HDFS-8707 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 13s {color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 9s {color} | {color:green} HDFS-8707 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 9s {color} | {color:green} HDFS-8707 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 9s {color} | {color:green} the patch passed {color} | | 
{color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 3s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 3s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 3s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 5s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 5s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 5s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s {color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 7s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 7s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 48s {color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK v1.8.0_66. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 46s {color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 37m 38s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0cf5e66 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12782084/HDFS-9628.HDFS-8707.010.patch | | JIRA Issue | HDFS-9628 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit xml cc | | uname | Linux d520cc98d10e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-8707 / 5276e19 | | Default Java | 1.7.0_91 | | Multi-JDK versions |
[jira] [Commented] (HDFS-9047) Retire libwebhdfs
[ https://issues.apache.org/jira/browse/HDFS-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096566#comment-15096566 ] Kihwal Lee commented on HDFS-9047: -- Also fixed the BUILDING.txt in trunk, branch-2 and branch-2.8 that was missed in the original commit. > Retire libwebhdfs > - > > Key: HDFS-9047 > URL: https://issues.apache.org/jira/browse/HDFS-9047 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Allen Wittenauer >Assignee: Haohui Mai > Fix For: 2.8.0 > > Attachments: HDFS-9047-branch-2.7.patch, HDFS-9047.000.patch > > > This library is basically a mess: > * It's not part of the mvn package > * It's missing functionality and barely maintained > * It's not in the precommit runs so doesn't get exercised regularly > * It's not part of the unit tests (at least, that I can see) > * It isn't documented in any official documentation > But most importantly: > * It fails at its primary mission of being pure C (HDFS-3917 is STILL open) > Let's cut our losses and just remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-3917) Remove JNI code from libwebhdfs (C client library)
[ https://issues.apache.org/jira/browse/HDFS-3917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee resolved HDFS-3917. -- Resolution: Not A Problem libwebhdfs has been removed. > Remove JNI code from libwebhdfs (C client library) > -- > > Key: HDFS-3917 > URL: https://issues.apache.org/jira/browse/HDFS-3917 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jing Zhao >Assignee: Jing Zhao > > The current implementation of libwebhdfs (C client library) uses JNI for > loading NameNode configuration and implementing hdfsCopy/hdfsMove. We need to > implement the same functionalities in libwebhdfs without using JNI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9628) libhdfs++: Implement builder apis from C bindings
[ https://issues.apache.org/jira/browse/HDFS-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096512#comment-15096512 ] Hadoop QA commented on HDFS-9628: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 41s {color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 58s {color} | {color:green} HDFS-8707 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 55s {color} | {color:green} HDFS-8707 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 16s {color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s {color} | {color:green} HDFS-8707 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s {color} | {color:green} HDFS-8707 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | 
{color:red}-1{color} | {color:red} compile {color} | {color:red} 2m 14s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 2m 14s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 2m 14s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 2m 11s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_91. {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 2m 11s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_91. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 2m 11s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 7s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 9s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 15s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 19s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 29m 10s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0cf5e66 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12782083/HDFS-9628.HDFS-8707.009.patch | | JIRA Issue | HDFS-9628 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit xml cc | | uname | Linux 9a46d6b2d7d0 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64
[jira] [Commented] (HDFS-9047) Retire libwebhdfs
[ https://issues.apache.org/jira/browse/HDFS-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096557#comment-15096557 ] Kihwal Lee commented on HDFS-9047: -- Removed from branch-2.7. > Retire libwebhdfs > - > > Key: HDFS-9047 > URL: https://issues.apache.org/jira/browse/HDFS-9047 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Allen Wittenauer >Assignee: Haohui Mai > Fix For: 2.8.0 > > Attachments: HDFS-9047-branch-2.7.patch, HDFS-9047.000.patch > > > This library is basically a mess: > * It's not part of the mvn package > * It's missing functionality and barely maintained > * It's not in the precommit runs so doesn't get exercised regularly > * It's not part of the unit tests (at least, that I can see) > * It isn't documented in any official documentation > But most importantly: > * It fails at its primary mission of being pure C (HDFS-3917 is STILL open) > Let's cut our losses and just remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9047) Retire libwebhdfs
[ https://issues.apache.org/jira/browse/HDFS-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096592#comment-15096592 ]

Hudson commented on HDFS-9047:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #9100 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9100/])
Supplement to HDFS-9047. (kihwal: rev c722b62908984f8fb6ab2e0bfd40c090e8c830c7)
* BUILDING.txt
[jira] [Updated] (HDFS-8999) Namenode need not wait for {{blockReceived}} for the last block before completing a file.
[ https://issues.apache.org/jira/browse/HDFS-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo Nicholas Sze updated HDFS-8999:
--------------------------------------
    Attachment: h8999_20160113.patch

h8999_20160113.patch: adds a test and fixes some bugs.

> Namenode need not wait for {{blockReceived}} for the last block before completing a file.
> -----------------------------------------------------------------------------------------
>
> Key: HDFS-8999
> URL: https://issues.apache.org/jira/browse/HDFS-8999
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Reporter: Jitendra Nath Pandey
> Assignee: Tsz Wo Nicholas Sze
> Attachments: h8999_20151228.patch, h8999_20160106.patch, h8999_20160106b.patch, h8999_20160106c.patch, h8999_20160111.patch, h8999_20160113.patch
>
>
> This comes out of a discussion in HDFS-8763. Pasting [~jingzhao]'s comment from the jira:
> {quote}
> ...whether we need to let NameNode wait for all the block_received msgs to announce the replica is safe. Looking into the code, now we have
> # NameNode knows the DataNodes involved when initially setting up the writing pipeline
> # If any DataNode fails during the writing, client bumps the GS and finally reports all the DataNodes included in the new pipeline to NameNode through the updatePipeline RPC.
> # When the client received the ack for the last packet of the block (and before the client tries to close the file on NameNode), the replica has been finalized in all the DataNodes.
> Then in this case, when NameNode receives the close request from the client, the NameNode already knows the latest replicas for the block. Currently the checkReplication call only counts in all the replicas that NN has already received the block_received msg, but based on the above #2 and #3, it may be safe to also count in all the replicas in the BlockUnderConstructionFeature#replicas?
> {quote}
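
The counting question raised in the quoted comment can be illustrated with a small, self-contained sketch. This is not HDFS code and is not taken from any of the attached patches: the class name, method names, and DataNode ids below are hypothetical, and the real checkReplication logic and BlockUnderConstructionFeature are only modelled here as plain sets of DataNode ids. It contrasts counting only replicas already reported via block_received with also counting the DataNodes in the latest pipeline known to the NameNode.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration only; none of these names are real HDFS classes or methods. */
final class LastBlockCompletionSketch {

    /** Behaviour described as current: count only replicas reported through block_received. */
    static int countReported(Set<String> reported) {
        return reported.size();
    }

    /**
     * Behaviour suggested in the quote: also count DataNodes in the latest pipeline
     * (standing in for BlockUnderConstructionFeature#replicas), without double counting.
     */
    static int countReportedOrInPipeline(Set<String> reported, Set<String> pipeline) {
        Set<String> union = new HashSet<>(reported);
        union.addAll(pipeline);
        return union.size();
    }

    /** The file can be completed once the last block has at least minReplication replicas. */
    static boolean canComplete(int countedReplicas, int minReplication) {
        return countedReplicas >= minReplication;
    }

    public static void main(String[] args) {
        // Pipeline the NameNode knows about (set up initially or via updatePipeline).
        Set<String> pipeline = new HashSet<>(Arrays.asList("dn1", "dn2", "dn3"));
        // No block_received message has arrived yet when the client calls close.
        Set<String> reported = new HashSet<>();
        int minReplication = 1; // assumed value for the example

        System.out.println(canComplete(countReported(reported), minReplication));
        System.out.println(canComplete(countReportedOrInPipeline(reported, pipeline), minReplication));
    }
}
```

Under these assumptions the first check fails and the second succeeds, which is the situation the issue title describes: the close request need not wait for block_received on the last block if the pipeline members are trusted to hold finalized replicas.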