[jira] [Commented] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979740#comment-15979740 ]

Hudson commented on HDFS-11402:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11623 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11623/])
HDFS-11402. HDFS Snapshots should capture point-in-time copies of OPEN files (yzhang: rev 20e3ae260b40cd6ef657b2a629a02219d68f162f)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDiffReport.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirSnapshotOp.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestLeaseManager.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodesInPath.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotManager.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestOpenFilesWithSnapshot.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectorySnapshottableFeature.java
* (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java

> HDFS Snapshots should capture point-in-time copies of OPEN files
> ----------------------------------------------------------------
>
>                 Key: HDFS-11402
>                 URL: https://issues.apache.org/jira/browse/HDFS-11402
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 2.6.0
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>             Fix For: 3.0.0-alpha3
>
>         Attachments: HDFS-11402.01.patch, HDFS-11402.02.patch, HDFS-11402.03.patch, HDFS-11402.04.patch, HDFS-11402.05.patch, HDFS-11402.06.patch, HDFS-11402.07.patch, HDFS-11402.08.patch
>
> *Problem:*
> 1. When files are being written and HDFS Snapshots are taken in parallel, the Snapshots do capture all these files, but the files being written do not have their point-in-time file length captured in the Snapshots. That is, these open files are not frozen in HDFS Snapshots: they grow or shrink in length, just like the original file, even after the snapshot time.
> 2. At file close, or on any other metadata modification of these files, HDFS reconciles the file length and records the modification in the last taken Snapshot. All previously taken Snapshots continue to hold those open files with no modification recorded, so they all end up resolving to the final modification record in the last snapshot. Thus, after the file close, the file lengths in all those snapshots end up the same.
>
> Assume File1 is opened for write and a total of 1MB is written to it. While the writes are happening, snapshots are taken in parallel.
> {noformat}
> |---Time--------T1-------T2-------T3-------T4------->
> |---Snapshots------------Snap1----Snap2----Snap3---->
> |---File1-------open-----write----write----close---->
> {noformat}
> Then at time,
> T2:
>   Snap1.File1.length = 0
> T3:
>   Snap1.File1.length = 0
>   Snap2.File1.length = 0
> T4:
>   Snap1.File1.length = 1MB
>   Snap2.File1.length = 1MB
>   Snap3.File1.length = 1MB
>
> *Proposal:*
> 1. At the time of taking a Snapshot, {{SnapshotManager#createSnapshot}} can optionally request {{DirectorySnapshottableFeature#addSnapshot}} to freeze open files.
> 2. {{DirectorySnapshottableFeature#addSnapshot}} can consult the {{LeaseManager}} and get a list of {{INodesInPath}} for all open files under the snapshot dir.
> 3. After the Snapshot creation, Diff creation, and modification-time update, {{DirectorySnapshottableFeature#addSnapshot}} can invoke {{INodeFile#recordModification}} for each of the open files. This way, the Snapshot just taken will have a {{FileDiff}} with {{fileSize}} captured for each of the open files.
> 4. The above model follows the current Snapshot and Diff protocols and doesn't introduce any new disk formats, so no new FSImage Loader/Saver changes should be needed for Snapshots.
> 5. One of the design goals of HDFS Snapshots was the ability to take any number of snapshots in O(1) time. Though the LeaseManager holds all open files with leases in an in-memory map, an iteration is still needed to prune the relevant open files and then run recordModification on each of them, so the above proposal is not strictly O(1). The increase is only marginal, though, since the new order will be O(open_files_under_snap_dir). To avoid changing the behavior and time complexity of HDFS Snapshots for open files, this improvement can be made under a new config, {{"dfs.namenode.snapshot.freeze.openfiles"}}, which defaults to {{false}}.
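The length-resolution behavior described above, and the effect of the proposed freeze, can be sketched with a small standalone toy (plain Java, not actual HDFS code; all class and method names below are illustrative): a snapshot without a recorded diff resolves an open file's length to the live file's final length, while a diff recorded at snapshot-creation time freezes it.

```java
import java.util.HashMap;
import java.util.Map;

// Standalone toy (not HDFS code): models how a snapshot with no recorded
// FileDiff resolves an open file's length to the live file's final length,
// and how recording the length at snapshot-creation time freezes it.
public class SnapshotFreezeDemo {
    public static class FileState {
        public long currentLength = 0;                          // live length
        public final Map<String, Long> diffs = new HashMap<>(); // snap -> frozen length
    }

    // A snapshot with no diff for this file "sees" the live length.
    public static long lengthInSnapshot(FileState f, String snap) {
        return f.diffs.getOrDefault(snap, f.currentLength);
    }

    // With freezing enabled, record a diff carrying the point-in-time length.
    public static void takeSnapshot(FileState f, String snap, boolean freezeOpenFiles) {
        if (freezeOpenFiles) {
            f.diffs.put(snap, f.currentLength);
        }
    }

    public static void main(String[] args) {
        // Without freezing: every snapshot ends up reporting the final 1MB.
        FileState plain = new FileState();
        takeSnapshot(plain, "snap1", false);          // taken at length 0
        plain.currentLength = 512 * 1024;
        takeSnapshot(plain, "snap2", false);          // taken at 512KB
        plain.currentLength = 1024 * 1024;            // file closed at 1MB
        if (lengthInSnapshot(plain, "snap1") != 1024 * 1024) throw new AssertionError();
        if (lengthInSnapshot(plain, "snap2") != 1024 * 1024) throw new AssertionError();

        // With freezing: each snapshot keeps its point-in-time length.
        FileState frozen = new FileState();
        takeSnapshot(frozen, "snap1", true);
        frozen.currentLength = 512 * 1024;
        takeSnapshot(frozen, "snap2", true);
        frozen.currentLength = 1024 * 1024;
        if (lengthInSnapshot(frozen, "snap1") != 0) throw new AssertionError();
        if (lengthInSnapshot(frozen, "snap2") != 512 * 1024) throw new AssertionError();
        System.out.println("ok");
    }
}
```

The toy reproduces the T4 table above: without the freeze, Snap1/Snap2 both report 1MB; with it, they keep 0 and 512KB respectively.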
[jira] [Comment Edited] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979738#comment-15979738 ]

Yongjun Zhang edited comment on HDFS-11402 at 4/22/17 4:05 AM:
---

Committed to trunk. Thanks [~manojg] for the very good contribution! And thanks [~jingzhao] [~linyiqun] and [~andrew.wang] for the review!

was (Author: yzhangal):
Committed to trunk. Thanks [~manojg] for the very good contribution! And thanks [~jingzhao] [~andrew.wang] for the view!
-- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979739#comment-15979739 ]

Yongjun Zhang commented on HDFS-11402:
---

We might consider a backport to branch-2. The jira is marked incompatible, but the new behaviour is disabled by default, so it should be safe to put into branch-2 and people can enable it when they need it.

BTW, I noticed a small thing:
{quote}
If true, snapshots taken will have an immutable shared copy of
{quote}
The word "shared" in hdfs-default.xml looks a bit confusing. We indeed create a copy, and it's not really shared; maybe we can remove this word later. Thanks.
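For reference, enabling the feature would look roughly like the following in hdfs-site.xml. This is only a sketch: the property name below is the one proposed in the issue description, and the final committed name and description text may differ.

```xml
<!-- Sketch only: property name taken from the issue description above;
     verify against the committed hdfs-default.xml before using. -->
<property>
  <name>dfs.namenode.snapshot.freeze.openfiles</name>
  <value>true</value>
  <description>
    If true, snapshots capture an immutable point-in-time copy (length
    included) of files that are open for write at snapshot time.
  </description>
</property>
```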
[jira] [Commented] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979738#comment-15979738 ]

Yongjun Zhang commented on HDFS-11402:
---

Committed to trunk. Thanks [~manojg] for the very good contribution! And thanks [~jingzhao] [~andrew.wang] for the view!
[jira] [Updated] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yongjun Zhang updated HDFS-11402:
---
     Resolution: Fixed
   Hadoop Flags: Incompatible change, Reviewed (was: Incompatible change)
  Fix Version/s: 3.0.0-alpha3
         Status: Resolved (was: Patch Available)
[jira] [Commented] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979731#comment-15979731 ]

Yongjun Zhang commented on HDFS-11402:
---

Thanks [~andrew.wang] for the review. I did another round of review; looks good to me too, +1. I'm going to commit soon.
[jira] [Updated] (HDFS-11384) Add option for balancer to disperse getBlocks calls to avoid NameNode's rpc.CallQueueLength spike
[ https://issues.apache.org/jira/browse/HDFS-11384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HDFS-11384:
---
Attachment: HDFS-11384.006.patch

* You are right, the rate of {{getBlocks}} RPCs is not guaranteed; the Balancer can only do its best. The actual rate could only be guaranteed on the NameNode, but we don't want to go there. I made this clear in the comment for {{BALANCER_NUM_RPC_PER_SEC}}.
* Added a description for delay.
* It is pretty hard to measure the rate of operations on the NN. Here is what I did: created a spy FSNamesystem. The spy calls a modified {{getBlocks()}} when the corresponding RPC is called. The modified {{getBlocks()}} first calls the original method, then counts the number of calls and notes the times of the first and the last call to {{getBlocks()}}. Given the number of calls and the interval, we can estimate the rate later on.

> Add option for balancer to disperse getBlocks calls to avoid NameNode's rpc.CallQueueLength spike
> -------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11384
>                 URL: https://issues.apache.org/jira/browse/HDFS-11384
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer & mover
>    Affects Versions: 2.7.3
>            Reporter: yunjiong zhao
>            Assignee: yunjiong zhao
>         Attachments: balancer.day.png, balancer.week.png, HDFS-11384.001.patch, HDFS-11384.002.patch, HDFS-11384.003.patch, HDFS-11384.004.patch, HDFS-11384.005.patch, HDFS-11384.006.patch
>
> Running the balancer on a hadoop cluster with more than 3000 Datanodes causes the NameNode's rpc.CallQueueLength to spike. We observed that this situation could cause HBase cluster failures due to RegionServer WAL timeouts.
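The measurement idea in the last bullet can be sketched in isolation (a standalone toy, not the actual spy/test code from the patch; names are illustrative): count the calls, keep the first and last call timestamps, and derive an average rate from the count and the elapsed interval.

```java
// Standalone sketch (not the HDFS-11384 test code) of estimating an RPC
// rate from a call counter plus first/last call timestamps, as described
// in the comment above.
public class RpcRateEstimator {
    private long firstCallMillis = -1;
    private long lastCallMillis = -1;
    private long calls = 0;

    // Invoked by the spy wrapper each time the RPC handler runs.
    public synchronized void recordCall(long nowMillis) {
        if (firstCallMillis < 0) {
            firstCallMillis = nowMillis;
        }
        lastCallMillis = nowMillis;
        calls++;
    }

    // Average calls per second over the observed interval; needs >= 2 calls.
    public synchronized double callsPerSecond() {
        long elapsed = lastCallMillis - firstCallMillis;
        if (calls < 2 || elapsed <= 0) {
            throw new IllegalStateException("not enough samples");
        }
        // (calls - 1) intervals lie between the first and last call.
        return (calls - 1) * 1000.0 / elapsed;
    }

    public static void main(String[] args) {
        RpcRateEstimator est = new RpcRateEstimator();
        // Simulate 11 calls spaced 100 ms apart: 10 intervals over 1000 ms.
        for (int i = 0; i <= 10; i++) {
            est.recordCall(i * 100L);
        }
        if (Math.abs(est.callsPerSecond() - 10.0) > 1e-9) {
            throw new AssertionError("expected ~10 calls/sec");
        }
        System.out.println("ok");
    }
}
```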
[jira] [Updated] (HDFS-11644) DFSStripedOutputStream should not implement Syncable
[ https://issues.apache.org/jira/browse/HDFS-11644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-11644:
---
Attachment: HDFS-11644.01.patch

Attaching the quick-fix v01 patch while I work on the fix suggested by Steve. [~andrew.wang], can you please take a look?

> DFSStripedOutputStream should not implement Syncable
> ----------------------------------------------------
>
>                 Key: HDFS-11644
>                 URL: https://issues.apache.org/jira/browse/HDFS-11644
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Andrew Wang
>            Assignee: Manoj Govindassamy
>            Priority: Blocker
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-11644.01.patch
>
> FSDataOutputStream#hsync checks whether a stream implements Syncable and, if so, calls hsync; otherwise it just calls flush. This is used, for instance, by YARN's FileSystemTimelineWriter.
> DFSStripedOutputStream extends DFSOutputStream, which implements Syncable. However, DFSStripedOS throws a runtime exception when the Syncable methods are called.
> We should refactor the inheritance structure so DFSStripedOS does not implement Syncable.
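The dispatch pattern behind this bug can be sketched with stand-in types (illustrative names, not the real Hadoop Syncable/FSDataOutputStream classes): a wrapper calls hsync() only when the wrapped stream opts into the capability via an interface, and falls back to flush() otherwise. A stream that implements the interface but throws at runtime defeats the check, which is why the fix is to stop implementing the interface.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Toy model of the capability-check dispatch described above.
public class SyncableDispatchDemo {
    // Stand-in for the Syncable capability interface.
    public interface SyncableLike {
        void hsync() throws IOException;
    }

    public static class Wrapper {
        private final OutputStream out;
        public boolean sawFlush = false;

        public Wrapper(OutputStream out) { this.out = out; }

        // Capability check: prefer a real hsync, otherwise a plain flush.
        public void hsync() throws IOException {
            if (out instanceof SyncableLike) {
                ((SyncableLike) out).hsync();
            } else {
                out.flush();   // safe fallback for streams without sync support
                sawFlush = true;
            }
        }
    }

    // The HDFS-11644 situation: the stream *claims* the capability, so the
    // instanceof check passes, and then hsync() blows up at runtime.
    public static class BrokenSyncableStream extends ByteArrayOutputStream
            implements SyncableLike {
        @Override public void hsync() {
            throw new UnsupportedOperationException("no sync support");
        }
    }

    public static void main(String[] args) throws IOException {
        // Plain stream: falls back to flush(), no error.
        Wrapper plain = new Wrapper(new ByteArrayOutputStream());
        plain.hsync();
        if (!plain.sawFlush) throw new AssertionError();

        // "Broken" stream: routed to hsync(), which throws at runtime.
        Wrapper broken = new Wrapper(new BrokenSyncableStream());
        boolean threw = false;
        try {
            broken.hsync();
        } catch (UnsupportedOperationException e) {
            threw = true;
        }
        if (!threw) throw new AssertionError();
        System.out.println("ok");
    }
}
```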
[jira] [Commented] (HDFS-11493) Ozone: SCM: Add the ability to handle container reports
[ https://issues.apache.org/jira/browse/HDFS-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979636#comment-15979636 ]

Chen Liang commented on HDFS-11493:
---

One quick thing: in {{TestContainerReplicationManager}}, most tests have @Test above the comments, while one test has it after the comments.

> Ozone: SCM: Add the ability to handle container reports
> -------------------------------------------------------
>
>                 Key: HDFS-11493
>                 URL: https://issues.apache.org/jira/browse/HDFS-11493
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Anu Engineer
>            Assignee: Anu Engineer
>         Attachments: container-replication-storage.pdf, HDFS-11493-HDFS-7240.001.patch
>
> Once a datanode sends the container report, it is SCM's responsibility to determine whether the replication levels are acceptable. If they are not, SCM should initiate a replication request to another datanode. This JIRA tracks how SCM handles a container report.
[jira] [Commented] (HDFS-11661) GetContentSummary uses excessive amounts of memory
[ https://issues.apache.org/jira/browse/HDFS-11661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979635#comment-15979635 ]

Wei-Chiu Chuang commented on HDFS-11661:
---

Hey Sean, thanks for the comment. I think there's a lightweight approach that fixes the same bug without the includedNodes set. I will post a patch next week if I can do something clever.

> GetContentSummary uses excessive amounts of memory
> --------------------------------------------------
>
>                 Key: HDFS-11661
>                 URL: https://issues.apache.org/jira/browse/HDFS-11661
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.8.0, 3.0.0-alpha2
>            Reporter: Nathan Roberts
>            Priority: Blocker
>         Attachments: Heap growth.png
>
> ContentSummaryComputationContext::nodeIncluded() is being used to keep track of all INodes visited during the current content summary calculation. This can be all of the INodes in the filesystem, making for a VERY large hash table. This simply won't work on large filesystems.
> We noticed this after upgrading: a namenode with ~100 million filesystem objects was spending significantly more time in GC. Fortunately this system had some memory breathing room; other clusters we have will not run with this additional demand on memory.
> This was added as part of HDFS-10797 as a way of keeping track of INodes that have already been accounted for, to avoid double counting.
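The trade-off under discussion can be sketched in isolation (toy code, not the actual ContentSummaryComputationContext): a visited set correctly avoids double counting nodes reachable by more than one path, but its size grows with every node visited, not just with the overlap, which is the memory problem reported here.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy illustration (not HDFS code) of dedup-by-visited-set: correct counts,
// but the set holds an entry for every node ever visited.
public class VisitedSetDemo {
    public static long countUnique(List<long[]> paths) {
        Set<Long> visited = new HashSet<>();  // grows with every node seen
        long count = 0;
        for (long[] path : paths) {
            for (long nodeId : path) {
                if (visited.add(nodeId)) {    // true only the first time
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Node 2 is reachable via two paths; it must be counted once.
        List<long[]> paths = List.of(new long[]{1, 2, 3}, new long[]{4, 2, 5});
        if (countUnique(paths) != 5) {
            throw new AssertionError("expected 5 unique nodes");
        }
        System.out.println("ok");
    }
}
```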
[jira] [Commented] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979577#comment-15979577 ] Manoj Govindassamy commented on HDFS-11402: --- Above unit test failures not related to the patch. > HDFS Snapshots should capture point-in-time copies of OPEN files > > > Key: HDFS-11402 > URL: https://issues.apache.org/jira/browse/HDFS-11402 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 2.6.0 >Reporter: Manoj Govindassamy >Assignee: Manoj Govindassamy > Attachments: HDFS-11402.01.patch, HDFS-11402.02.patch, > HDFS-11402.03.patch, HDFS-11402.04.patch, HDFS-11402.05.patch, > HDFS-11402.06.patch, HDFS-11402.07.patch, HDFS-11402.08.patch > > > *Problem:* > 1. When there are files being written and when HDFS Snapshots are taken in > parallel, Snapshots do capture all these files, but these being written files > in Snapshots do not have the point-in-time file length captured. That is, > these open files are not frozen in HDFS Snapshots. These open files > grow/shrink in length, just like the original file, even after the snapshot > time. > 2. At the time of File close or any other meta data modification operation on > these files, HDFS reconciles the file length and records the modification in > the last taken Snapshot. All the previously taken Snapshots continue to have > those open Files with no modification recorded. So, all those previous > snapshots end up using the final modification record in the last snapshot. > Thus after the file close, file lengths in all those snapshots will end up > same. > Assume File1 is opened for write and a total of 1MB written to it. While the > writes are happening, snapshots are taken in parallel. 
> {noformat} > |---Time---T1---T2-T3T4--> > |---Snap1--Snap2-Snap3---> > |---File1.open---write-write---close-> > {noformat} > Then at time, > T2: > Snap1.File1.length = 0 > T3: > Snap1.File1.length = 0 > Snap2.File1.length = 0 > > T4: > Snap1.File1.length = 1MB > Snap2.File1.length = 1MB > Snap3.File1.length = 1MB > *Proposal* > 1. At the time of taking a Snapshot, {{SnapshotManager#createSnapshot}} can > optionally request {{DirectorySnapshottableFeature#addSnapshot}} to freeze > open files. > 2. {{DirectorySnapshottableFeature#addSnapshot}} can consult > {{LeaseManager}} and get a list of INodesInPath for all open files under the > snapshot dir. > 3. After the Snapshot creation, Diff creation and modification time update, > {{DirectorySnapshottableFeature#addSnapshot}} can invoke > {{INodeFile#recordModification}} for each of the open files. This way, the > Snapshot just taken will have a {{FileDiff}} with {{fileSize}} captured for > each of the open files. > 4. The above model follows the current Snapshot and Diff protocols and doesn't > introduce any new disk formats. So, I don't think we will be needing any new > FSImage Loader/Saver changes for Snapshots. > 5. One of the design goals of HDFS Snapshot was the ability to take any number of > snapshots in O(1) time. Though LeaseManager has all the open files with > leases in an in-memory map, an iteration is still needed to prune the needed open > files and then run recordModification on each of them. So, it will not be a > strict O(1) with the above proposal. But it's going to be a marginal increase > only, as the new order will be O(open_files_under_snap_dir). In order to > avoid changing HDFS Snapshots' behavior for open files and avoid a change in > time complexity, this improvement can be made under a new config > {{"dfs.namenode.snapshot.freeze.openfiles"}} which by default can be > {{false}}. 
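The freeze-open-files flow in the proposal can be sketched as a simplified model (illustrative classes, not the real SnapshotManager/LeaseManager/INodeFile code): taking a snapshot with freezing enabled records a per-snapshot length diff for each open file, so later writes no longer change what that snapshot reports.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the proposal: createSnapshot with freezeOpenFiles=true plays
// the role of calling INodeFile#recordModification per open file, capturing a
// FileDiff-like (file -> length) entry in the snapshot just taken.
class SnapshotModel {
    // fileName -> current (live) length
    final Map<String, Long> liveLength = new HashMap<>();
    // snapshotName -> (fileName -> frozen length)
    final Map<String, Map<String, Long>> snapshots = new HashMap<>();
    final List<String> openFiles = new ArrayList<>();

    void createSnapshot(String name, boolean freezeOpenFiles) {
        Map<String, Long> diffs = new HashMap<>();
        if (freezeOpenFiles) {
            for (String f : openFiles) {
                diffs.put(f, liveLength.get(f)); // freeze point-in-time length
            }
        }
        snapshots.put(name, diffs);
    }

    // Reading through a snapshot: use the frozen length if one was recorded,
    // otherwise fall back to the live (final) length -- which is exactly the
    // pre-fix behavior described in the Problem section above.
    long lengthInSnapshot(String snap, String file) {
        Long frozen = snapshots.get(snap).get(file);
        return frozen != null ? frozen : liveLength.get(file);
    }
}
```

With freezing on, Snap1 keeps reporting length 0 for File1 even after 1MB is written; without freezing, the snapshot falls back to the final length, reproducing the T4 rows in the timeline.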
[jira] [Commented] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979574#comment-15979574 ] Hadoop QA commented on HDFS-11402: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 25s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 39s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 47s{color} | {color:orange} hadoop-hdfs-project: The patch generated 2 new + 788 unchanged - 4 fixed = 790 total (was 792) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 12s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m 31s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}132m 9s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes | | | hadoop.hdfs.server.namenode.ha.TestHAAppend | | Timed out junit tests | org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ac17dc | | JIRA Issue | HDFS-11402 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12864572/HDFS-11402.07.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml | | uname | Linux ed52ac537083 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Updated] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HDFS-11402: -- Attachment: HDFS-11402.08.patch Thanks for the quick review [~andrew.wang]. Attached the v08 patch to address the following; please take a look. bq. We could use a stride increment to simplify the work partitioning logic (and make work distribution more even): Done.
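The "stride increment" partitioning mentioned in the review can be illustrated with a generic sketch (not the actual LeaseManager code): worker w of n takes items w, w+n, w+2n, ..., which spreads uneven per-item costs across workers more evenly than contiguous chunks and needs no start/end index arithmetic.

```java
import java.util.ArrayList;
import java.util.List;

// Generic stride partitioning: each worker walks the index space with a
// stride equal to the number of workers, so partition sizes differ by at
// most one and no range-boundary bookkeeping is needed.
class StridePartition {
    static List<Integer> itemsFor(int worker, int numWorkers, int totalItems) {
        List<Integer> mine = new ArrayList<>();
        for (int i = worker; i < totalItems; i += numWorkers) {
            mine.add(i);
        }
        return mine;
    }
}
```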
[jira] [Commented] (HDFS-11661) GetContentSummary uses excessive amounts of memory
[ https://issues.apache.org/jira/browse/HDFS-11661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979551#comment-15979551 ] Sean Mackrory commented on HDFS-11661: -- +1 to the revert - I too would still like to see the original problem fixed, but this is worse. It does indeed require global context to do correctly, so it'll require some cleverness to make sure we do that without using tons of space or locking for a long time. [~jojochuang] - to revert cleanly we can revert HDFS-11515 first (unless I'm missing something and that patch does more than just correct the original changes in HDFS-10797) and then HDFS-10797. As [~xiaochen] is not available right now, would you be able to commit the revert when we're satisfied? I'll run tests with the reverts committed locally...
[jira] [Commented] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979544#comment-15979544 ] Andrew Wang commented on HDFS-11402: I discussed the stride offline with Manoj; he's going to make a rev. Otherwise looks good to me!
[jira] [Updated] (HDFS-11693) Ozone: Add archive support to containers
[ https://issues.apache.org/jira/browse/HDFS-11693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-11693: Status: Patch Available (was: Open) > Ozone: Add archive support to containers > > > Key: HDFS-11693 > URL: https://issues.apache.org/jira/browse/HDFS-11693 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Anu Engineer >Assignee: Anu Engineer > Attachments: HDFS-11693-HDFS-7240.001.patch > > > Add archive support to containers. This is a stepping stone to supporting > copy containers. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11693) Ozone: Add archive support to containers
[ https://issues.apache.org/jira/browse/HDFS-11693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979534#comment-15979534 ] Anu Engineer commented on HDFS-11693: - This patch contains classes that allow us to manage archives. This will be used later by copy container; the signatures of the changes are in the protoc file. I have also fixed a bad name in the protoc file that causes some churn.
[jira] [Updated] (HDFS-11693) Ozone: Add archive support to containers
[ https://issues.apache.org/jira/browse/HDFS-11693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-11693: Attachment: HDFS-11693-HDFS-7240.001.patch
[jira] [Assigned] (HDFS-11574) Spelling mistakes in the Java source
[ https://issues.apache.org/jira/browse/HDFS-11574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang reassigned HDFS-11574: -- Assignee: hu xiaodong > Spelling mistakes in the Java source > > > Key: HDFS-11574 > URL: https://issues.apache.org/jira/browse/HDFS-11574 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: hu xiaodong >Assignee: hu xiaodong >Priority: Trivial > Fix For: 3.0.0-alpha3 > > Attachments: HDFS-11574.001.patch > > > I found spelling mistakes in the Hadoop java source files viz. recieved > instead of received. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11428) Change setErasureCodingPolicy to take a required string EC policy name
[ https://issues.apache.org/jira/browse/HDFS-11428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-11428: --- Release Note: {{HdfsAdmin#setErasureCodingPolicy}} now takes a String {{ecPolicyName}} rather than an ErasureCodingPolicy object. The corresponding RPC's wire format has also been modified. > Change setErasureCodingPolicy to take a required string EC policy name > -- > > Key: HDFS-11428 > URL: https://issues.apache.org/jira/browse/HDFS-11428 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Affects Versions: 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: Andrew Wang > Labels: hdfs-ec-3.0-must-do > Fix For: 3.0.0-alpha3 > > Attachments: HDFS-11428.001.patch, HDFS-11428.002.patch, > HDFS-11428.003.patch, HDFS-11428.004.patch, HDFS-11428.005.patch > > > The current {{setErasureCodingPolicy}} API takes an optional {{ECPolicy}}. > This makes calling the API harder for clients, since they need to turn a > specified name into a policy, and the set of available EC policies is only > available on the NN. > You can see this awkwardness in the current EC cli set command: it first > fetches the list of EC policies, looks for the one specified by the user, > then calls set. This means we need to issue two RPCs for every set > (inefficient), and we need to do validation on the NN side anyway (extraneous > work). > Since we're phasing out the system default EC policy, it also makes sense to > make the policy a required parameter. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
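The RPC-count argument above can be sketched with mock interfaces (illustrative class and method names, not the real HdfsAdmin or NameNode API): with the old object-based call, the client first had to fetch all policies from the NameNode to find the matching object before setting it; with the name-based call, the NameNode resolves and validates the name in a single round trip.

```java
import java.util.Map;

// Toy client-side sketch contrasting the two call patterns; rpcCount stands
// in for round trips to the NameNode.
class EcAdminSketch {
    final Map<String, String> nnPolicies; // name -> policy (NameNode-side state)
    int rpcCount = 0;

    EcAdminSketch(Map<String, String> nnPolicies) {
        this.nnPolicies = nnPolicies;
    }

    // New style: one RPC; validation happens on the NameNode side.
    String setPolicyByName(String path, String name) {
        rpcCount++; // setErasureCodingPolicy(path, name)
        String policy = nnPolicies.get(name);
        if (policy == null) {
            throw new IllegalArgumentException("unknown EC policy: " + name);
        }
        return policy;
    }

    // Old style: fetch the policy list (RPC #1), match locally, then set (RPC #2).
    String setPolicyOldStyle(String path, String name) {
        rpcCount++; // getErasureCodingPolicies()
        String policy = nnPolicies.get(name); // client-side matching
        rpcCount++; // setErasureCodingPolicy(path, policyObject)
        return policy;
    }
}
```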
[jira] [Commented] (HDFS-11644) DFSStripedOutputStream should not implement Syncable
[ https://issues.apache.org/jira/browse/HDFS-11644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979522#comment-15979522 ] Andrew Wang commented on HDFS-11644: Any progress on this one? It's a blocker for alpha3 since it breaks YARN and HBase on EC as it is. If this is involved, we have the quick-fix option of just making EC hflush/hsync no-ops without the new capabilities API, but it doesn't sound too complicated based on Steve's description. > DFSStripedOutputStream should not implement Syncable > > > Key: HDFS-11644 > URL: https://issues.apache.org/jira/browse/HDFS-11644 > Project: Hadoop HDFS > Issue Type: Bug > Components: erasure-coding >Affects Versions: 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: Manoj Govindassamy >Priority: Blocker > Labels: hdfs-ec-3.0-must-do > > FSDataOutputStream#hsync checks if a stream implements Syncable, and if so, > calls hsync. Otherwise, it just calls flush. This is used, for instance, by > YARN's FileSystemTimelineWriter. > DFSStripedOutputStream extends DFSOutputStream, which implements Syncable. > However, DFSStripedOS throws a runtime exception when the Syncable methods > are called. > We should refactor the inheritance structure so DFSStripedOS does not > implement Syncable. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
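The inheritance problem described above can be modeled minimally (illustrative classes, not the real HDFS streams): callers downgrade hsync() to flush() based on an instanceof check, so a subclass that inherits Syncable but throws at runtime defeats the fallback. The fix direction sketched here is simply to keep Syncable off the type that cannot honor it.

```java
// Minimal model: a Syncable-aware caller, one stream type that really can
// hsync, and one that deliberately does NOT implement Syncable so the caller
// falls back to flush instead of hitting a runtime exception.
interface Syncable {
    void hsync();
}

class ReplicatedOut implements Syncable {
    public void hsync() { /* persist through to storage */ }
}

class StripedOut { // deliberately not Syncable
    void flush() { /* best-effort flush only */ }
}

class Caller {
    // Mirrors the check-then-call pattern FSDataOutputStream#hsync is
    // described as using above.
    static String syncOrFlush(Object out) {
        if (out instanceof Syncable) {
            ((Syncable) out).hsync();
            return "hsync";
        }
        return "flush";
    }
}
```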
[jira] [Commented] (HDFS-11661) GetContentSummary uses excessive amounts of memory
[ https://issues.apache.org/jira/browse/HDFS-11661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979508#comment-15979508 ] Junping Du commented on HDFS-11661: --- Thanks [~nroberts] and all for reporting the issue. +1 on reverting HDFS-10797 if we don't have a quick fix here. Just pinged the patch authors of HDFS-10797 to see if any magic can happen in the short term. :)
[jira] [Comment Edited] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice
[ https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979494#comment-15979494 ] Junping Du edited comment on HDFS-10797 at 4/21/17 10:28 PM: - Hi [~mackrorysd] and [~xiaochen], can you take a look at HDFS-11661, which could be caused by the improvement here? Thx! was (Author: djp): Hi [~mackrorysd], can you take a look at HDFS-11661, which could be caused by the improvement here? Thx! > Disk usage summary of snapshots causes renamed blocks to get counted twice > -- > > Key: HDFS-10797 > URL: https://issues.apache.org/jira/browse/HDFS-10797 > Project: Hadoop HDFS > Issue Type: Bug > Components: snapshots >Affects Versions: 2.8.0 >Reporter: Sean Mackrory >Assignee: Sean Mackrory > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, > HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch, > HDFS-10797.006.patch, HDFS-10797.007.patch, HDFS-10797.008.patch, > HDFS-10797.009.patch, HDFS-10797.010.patch, HDFS-10797.010.patch > > > DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how > much disk space is used by a snapshot by tallying up the files in the > snapshot that have since been deleted (that way it won't overlap with regular > files whose disk usage is computed separately). However, that is determined > from a diff that shows moved (to Trash or otherwise) or renamed files as a > deletion and a creation operation that may overlap with the list of blocks. > Only the deletion operation is taken into consideration, and this causes > those blocks to get represented twice in the disk usage tallying. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice
[ https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979494#comment-15979494 ] Junping Du commented on HDFS-10797: --- Hi [~mackrorysd], can you take a look at HDFS-11661, which could be caused by the improvement here? Thx!
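The double count in the HDFS-10797 description can be reduced to a toy model (illustrative, not the real diff code): a rename inside a snapshotted directory appears in the diff as a deleted entry plus a created entry over the same blocks, so adding the deleted list on top of the live files counts those blocks twice unless the two sets are deduplicated.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the tallying bug: blocks are represented by ids, and a
// renamed file's blocks appear both in the live set and in the snapshot
// diff's deleted list.
class SnapshotUsage {
    // Pre-fix behavior: live blocks plus everything in the deleted list.
    static long naiveUsage(Set<Long> liveBlocks, Set<Long> diffDeletedBlocks,
                           long blockSize) {
        return (long) (liveBlocks.size() + diffDeletedBlocks.size()) * blockSize;
    }

    // Deduplicated: overlapping (renamed) blocks are counted exactly once.
    static long dedupedUsage(Set<Long> liveBlocks, Set<Long> diffDeletedBlocks,
                             long blockSize) {
        Set<Long> all = new HashSet<>(liveBlocks);
        all.addAll(diffDeletedBlocks);
        return (long) all.size() * blockSize;
    }
}
```

This is also why the fix needs some global context, as discussed on HDFS-11661: deduplication has to see both sets, and the HDFS-10797 approach of remembering every visited inode achieved that at a steep memory cost.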
[jira] [Created] (HDFS-11693) Ozone: Add archive support to containers
Anu Engineer created HDFS-11693: --- Summary: Ozone: Add archive support to containers Key: HDFS-11693 URL: https://issues.apache.org/jira/browse/HDFS-11693 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Affects Versions: HDFS-7240 Reporter: Anu Engineer Assignee: Anu Engineer Add archive support to containers. This is a stepping stone to supporting copy containers. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11663) [READ] Fix NullPointerException in ProvidedBlocksBuilder
[ https://issues.apache.org/jira/browse/HDFS-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11663: -- Status: Patch Available (was: Open) > [READ] Fix NullPointerException in ProvidedBlocksBuilder > > > Key: HDFS-11663 > URL: https://issues.apache.org/jira/browse/HDFS-11663 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11663-HDFS-9806.001.patch, > HDFS-11663-HDFS-9806.002.patch > > > When there are no Datanodes with PROVIDED storage, > {{ProvidedBlocksBuilder#build}} leads to a {{NullPointerException}}. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11663) [READ] Fix NullPointerException in ProvidedBlocksBuilder
[ https://issues.apache.org/jira/browse/HDFS-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11663: -- Attachment: HDFS-11663-HDFS-9806.002.patch
[jira] [Updated] (HDFS-11663) [READ] Fix NullPointerException in ProvidedBlocksBuilder
[ https://issues.apache.org/jira/browse/HDFS-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11663: -- Status: Open (was: Patch Available) > [READ] Fix NullPointerException in ProvidedBlocksBuilder > > > Key: HDFS-11663 > URL: https://issues.apache.org/jira/browse/HDFS-11663 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11663-HDFS-9806.001.patch > > > When there are no Datanodes with PROVIDED storage, > {{ProvidedBlocksBuilder#build}} leads to a {{NullPointerException}}. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9922) Upgrade Domain placement policy status marks a good block in violation when there are decommissioned nodes
[ https://issues.apache.org/jira/browse/HDFS-9922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979473#comment-15979473 ] Ming Ma commented on HDFS-9922: --- Currently upgrade domain isn't considered available in 2.8 due to these changes. If we want the feature to be in 2.8, the major backport item is HDFS-9005. > Upgrade Domain placement policy status marks a good block in violation when > there are decommissioned nodes > -- > > Key: HDFS-9922 > URL: https://issues.apache.org/jira/browse/HDFS-9922 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chris Trezzo >Assignee: Chris Trezzo >Priority: Minor > Fix For: 2.9.0, 3.0.0-alpha1 > > Attachments: HDFS-9922-trunk-v1.patch, HDFS-9922-trunk-v2.patch, > HDFS-9922-trunk-v3.patch, HDFS-9922-trunk-v4.patch > > > When there are replicas of a block on a decommissioned node, > BlockPlacementStatusWithUpgradeDomain#isUpgradeDomainPolicySatisfied returns > false when it should return true. This is because numberOfReplicas is the > number of in-service replicas for the block and upgradeDomains.size() is the > number of upgrade domains across all replicas of the block. Specifically, we > hit this scenario when numberOfReplicas is equal to upgradeDomainFactor and > upgradeDomains.size() is greater than numberOfReplicas. > {code} > private boolean isUpgradeDomainPolicySatisfied() { > if (numberOfReplicas <= upgradeDomainFactor) { > return (numberOfReplicas == upgradeDomains.size()); > } else { > return upgradeDomains.size() >= upgradeDomainFactor; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
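The faulty predicate quoted above can be exercised in isolation. The following is a minimal sketch (the class and method signature here are hypothetical; the real logic lives in BlockPlacementStatusWithUpgradeDomain) showing how a decommissioned replica's extra upgrade domain flips the result to false even though the in-service placement is valid:

```java
import java.util.Set;

// Standalone sketch of the predicate described in HDFS-9922 (hypothetical
// class; the real code is in BlockPlacementStatusWithUpgradeDomain).
public class UpgradeDomainCheck {
    // numberOfReplicas counts only in-service replicas, while upgradeDomains
    // is collected across ALL replicas, including decommissioned ones --
    // that asymmetry is the bug.
    static boolean isSatisfied(int numberOfReplicas, int upgradeDomainFactor,
                               Set<String> upgradeDomains) {
        if (numberOfReplicas <= upgradeDomainFactor) {
            return numberOfReplicas == upgradeDomains.size();
        } else {
            return upgradeDomains.size() >= upgradeDomainFactor;
        }
    }

    public static void main(String[] args) {
        // 3 in-service replicas, factor 3, but a decommissioned replica
        // contributes a 4th distinct upgrade domain: the block is wrongly
        // flagged as in violation.
        System.out.println(isSatisfied(3, 3, Set.of("ud1", "ud2", "ud3", "ud4")));  // false
        // Without the decommissioned replica the check passes as expected.
        System.out.println(isSatisfied(3, 3, Set.of("ud1", "ud2", "ud3")));         // true
    }
}
```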
[jira] [Updated] (HDFS-11689) New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive code
[ https://issues.apache.org/jira/browse/HDFS-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-11689: --- Fix Version/s: 2.9.0 > New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive > code > --- > > Key: HDFS-11689 > URL: https://issues.apache.org/jira/browse/HDFS-11689 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0-alpha3, 2.8.1 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Fix For: 2.9.0, 3.0.0-alpha3, 2.8.1 > > Attachments: HADOOP-14333.001.patch, HADOOP-14333.002.patch, > HADOOP-14333.003.patch, HDFS-11689.001.patch > > > Though Hive should be fixed not to access DFSClient which is private to > HADOOP, removing the throws added by HADOOP-14104 is a quicker solution to > unblock hive. > Hive code > {code} > private boolean isEncryptionEnabled(DFSClient client, Configuration conf) { > try { > DFSClient.class.getMethod("isHDFSEncryptionEnabled"); > } catch (NoSuchMethodException e) { > // the method is available since Hadoop-2.7.1 > // if we run with an older Hadoop, check this ourselves > return !conf.getTrimmed(DFSConfigKeys.DFS_ENCRYPTION_KEY_PROVIDER_URI, > "").isEmpty(); > } > return client.isHDFSEncryptionEnabled(); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11667) Block Storage:Handling flushing of incomplete block id buffers during shutdown
[ https://issues.apache.org/jira/browse/HDFS-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-11667: -- Resolution: Fixed Status: Resolved (was: Patch Available) > Block Storage:Handling flushing of incomplete block id buffers during shutdown > -- > > Key: HDFS-11667 > URL: https://issues.apache.org/jira/browse/HDFS-11667 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh > Attachments: HDFS-11667-HDFS-7240.001.patch, > HDFS-11667-HDFS-7240.002.patch > > > Currently, whenever the cache shuts down, zero blocks are written to > the DirtyLog. This change will ensure that only the required number of blocks is > written to the DirtyLog. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11365) Log portnumber in PrivilegedNfsGatewayStarter
[ https://issues.apache.org/jira/browse/HDFS-11365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-11365: --- Fix Version/s: (was: 3.0.0-alpha2) 3.0.0-alpha3 > Log portnumber in PrivilegedNfsGatewayStarter > - > > Key: HDFS-11365 > URL: https://issues.apache.org/jira/browse/HDFS-11365 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Minor > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: HDFS-11365.001.patch > > > Port number in PrivilegedNfsGatewayStarter should be logged. > This would be useful in cases where bind fails on the port. > This can happen because this port number is in use by another application. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11369) Change exception message in StorageLocationChecker
[ https://issues.apache.org/jira/browse/HDFS-11369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-11369: --- Fix Version/s: 3.0.0-alpha3 > Change exception message in StorageLocationChecker > -- > > Key: HDFS-11369 > URL: https://issues.apache.org/jira/browse/HDFS-11369 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal >Priority: Minor > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: HDFS-11369.01.patch > > > Change an exception message in StorageLocationChecker.java to use the same > format that was used by the DataNode before HDFS-9. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-8498) Blocks can be committed with wrong size
[ https://issues.apache.org/jira/browse/HDFS-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979462#comment-15979462 ] Andrew Wang commented on HDFS-8498: --- Is there a plan to resolve this JIRA? Been pending for about a month. > Blocks can be committed with wrong size > --- > > Key: HDFS-8498 > URL: https://issues.apache.org/jira/browse/HDFS-8498 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.5.0 >Reporter: Daryn Sharp >Assignee: Jing Zhao >Priority: Critical > Fix For: 2.9.0, 3.0.0-alpha3, 2.8.1 > > Attachments: HDFS-8498.000.patch, HDFS-8498.001.patch, > HDFS-8498.branch-2.001.patch, HDFS-8498.branch-2.7.001.patch, > HDFS-8498.branch-2.patch > > > When an IBR for a UC block arrives, the NN updates the expected location's > block and replica state _only_ if it's on an unexpected storage for an > expected DN. If it's for an expected storage, only the genstamp is updated. > When the block is committed, and the expected locations are verified, only > the genstamp is checked. The size is not checked but it wasn't updated in > the expected locations anyway. > A faulty client may misreport the size when committing the block. The block > is effectively corrupted. If the NN issues replications, the received IBR is > considered corrupt, the NN invalidates the block, immediately issues another > replication. The NN eventually realizes all the original replicas are > corrupt after full BRs are received from the original DNs. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11112) Journal Nodes should refuse to format non-empty directories
[ https://issues.apache.org/jira/browse/HDFS-11112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-11112: --- Fix Version/s: 3.0.0-alpha3 > Journal Nodes should refuse to format non-empty directories > --- > > Key: HDFS-11112 > URL: https://issues.apache.org/jira/browse/HDFS-11112 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Arpit Agarwal >Assignee: Yiqun Lin > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: HDFS-11112.001.patch, HDFS-11112.002.patch > > > Journal Nodes should reject the {{format}} RPC request if a storage directory > is non-empty. The relevant code is in {{JNStorage#format}}. > {code} > void format(NamespaceInfo nsInfo) throws IOException { > setStorageInfo(nsInfo); > ... > unlockAll(); > sd.clearDirectory(); > writeProperties(sd); > createPaxosDir(); > analyzeStorage(); > {code} > This would make the behavior similar to {{namenode -format -nonInteractive}}. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
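The guard the issue asks for amounts to an emptiness check ahead of the destructive {{clearDirectory()}} call. A minimal sketch (the class and helper names below are hypothetical, not the actual JNStorage code):

```java
import java.io.File;
import java.io.IOException;

// Sketch of the pre-format guard proposed in the issue above: refuse to
// format a storage directory that already contains entries. Hypothetical
// names; the real change would land in JNStorage#format.
public class FormatGuard {
    static void checkEmptyBeforeFormat(File storageDir) throws IOException {
        String[] contents = storageDir.list();
        if (contents == null) {
            throw new IOException("Cannot list storage directory " + storageDir);
        }
        if (contents.length > 0) {
            throw new IOException("Refusing to format non-empty storage directory "
                + storageDir + " (" + contents.length + " entries)");
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = java.nio.file.Files.createTempDirectory("jnStorage").toFile();
        checkEmptyBeforeFormat(dir);          // empty directory: format allowed
        new File(dir, "current").mkdir();
        try {
            checkEmptyBeforeFormat(dir);      // non-empty: format refused
        } catch (IOException expected) {
            System.out.println(expected.getMessage());
        }
    }
}
```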
[jira] [Updated] (HDFS-11479) Socket re-use address option should be used in SimpleUdpServer
[ https://issues.apache.org/jira/browse/HDFS-11479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-11479: --- Fix Version/s: 3.0.0-alpha3 > Socket re-use address option should be used in SimpleUdpServer > -- > > Key: HDFS-11479 > URL: https://issues.apache.org/jira/browse/HDFS-11479 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: HDFS-11479.001.patch > > > NFS gateway restart can fail because of a bind error in SimpleUdpServer. > The re-use address option should be used in SimpleUdpServer so that the socket > bind can happen when the port is in the TIME_WAIT state. > {noformat} > 2017-02-28 04:19:53,495 FATAL mount.MountdBase > (MountdBase.java:startUDPServer(66)) - Failed to start the UDP server. > org.jboss.netty.channel.ChannelException: Failed to bind to: > 0.0.0.0/0.0.0.0:4242 > at > org.jboss.netty.bootstrap.ConnectionlessBootstrap.bind(ConnectionlessBootstrap.java:204) > at > org.apache.hadoop.oncrpc.SimpleUdpServer.run(SimpleUdpServer.java:68) > at > org.apache.hadoop.mount.MountdBase.startUDPServer(MountdBase.java:64) > at org.apache.hadoop.mount.MountdBase.start(MountdBase.java:97) > at > org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startServiceInternal(Nfs3.java:56) > at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startService(Nfs3.java:69) > at > org.apache.hadoop.hdfs.nfs.nfs3.PrivilegedNfsGatewayStarter.start(PrivilegedNfsGatewayStarter.java:71) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243) > Caused by: java.net.BindException: Address already in use > at sun.nio.ch.Net.bind0(Native Method) > at 
sun.nio.ch.Net.bind(Net.java:433) > at sun.nio.ch.DatagramChannelImpl.bind(DatagramChannelImpl.java:691) > at > sun.nio.ch.DatagramSocketAdaptor.bind(DatagramSocketAdaptor.java:91) > at > org.jboss.netty.channel.socket.nio.NioDatagramPipelineSink.bind(NioDatagramPipelineSink.java:129) > at > org.jboss.netty.channel.socket.nio.NioDatagramPipelineSink.eventSunk(NioDatagramPipelineSink.java:77) > at org.jboss.netty.channel.Channels.bind(Channels.java:561) > at > org.jboss.netty.channel.AbstractChannel.bind(AbstractChannel.java:189) > at > org.jboss.netty.bootstrap.ConnectionlessBootstrap.bind(ConnectionlessBootstrap.java:198) > ... 11 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11476) Fix NPE in FsDatasetImpl#checkAndUpdate
[ https://issues.apache.org/jira/browse/HDFS-11476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-11476: --- Fix Version/s: 3.0.0-alpha3 > Fix NPE in FsDatasetImpl#checkAndUpdate > --- > > Key: HDFS-11476 > URL: https://issues.apache.org/jira/browse/HDFS-11476 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: HDFS-11476.000.patch, HDFS-11476.001.patch, > HDFS-11476.002.patch, HDFS-11476.003.patch > > > diskMetaFile can be null and passed to compareTo which dereferences it, > causing NPE > {code} > // Compare generation stamp > if (memBlockInfo.getGenerationStamp() != diskGS) { > File memMetaFile = FsDatasetUtil.getMetaFile(diskFile, > memBlockInfo.getGenerationStamp()); > if (memMetaFile.exists()) { > if (memMetaFile.compareTo(diskMetaFile) != 0) { > LOG.warn("Metadata file in memory " > + memMetaFile.getAbsolutePath() > + " does not match file found by scan " > + (diskMetaFile == null? null: > diskMetaFile.getAbsolutePath())); > } > } else { > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
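The NPE in the snippet above comes from calling {{File#compareTo}} on a possibly-null {{diskMetaFile}}. A null-safe version of just that comparison can be sketched as follows (the helper class and method names are hypothetical, not the actual FsDatasetImpl code):

```java
import java.io.File;

// Sketch of the null-safe comparison the patch needs (hypothetical helper):
// diskMetaFile may be null when the scan found no metadata file on disk,
// so it must be checked before File#compareTo dereferences it.
public class MetaFileCheck {
    static boolean metaFilesDiffer(File memMetaFile, File diskMetaFile) {
        if (diskMetaFile == null) {
            // Nothing found on disk: definitely differs from the in-memory file.
            return true;
        }
        // File#compareTo compares the pathnames lexicographically.
        return memMetaFile.compareTo(diskMetaFile) != 0;
    }

    public static void main(String[] args) {
        File mem = new File("/data/blk_1_1001.meta");
        System.out.println(metaFilesDiffer(mem, null));                            // true
        System.out.println(metaFilesDiffer(mem, new File("/data/blk_1_1001.meta"))); // false
    }
}
```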
[jira] [Updated] (HDFS-11403) Zookeper ACLs on NN HA enabled clusters to be handled consistently
[ https://issues.apache.org/jira/browse/HDFS-11403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-11403: --- Fix Version/s: 3.0.0-alpha3 > Zookeper ACLs on NN HA enabled clusters to be handled consistently > -- > > Key: HDFS-11403 > URL: https://issues.apache.org/jira/browse/HDFS-11403 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Laszlo Puskas >Assignee: Hanisha Koneru > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: HDFS-11403.000.patch, HDFS-11403.001.patch > > > On clusters where NN HA is enabled, ZooKeeper ACLs need to be handled > consistently when enabling security. > The current behavior is as follows: > * if HA is enabled before the cluster is made secure, proper ACLs are only > set on the leaf znodes, while no ACLs are set on the path > (eg.:/hadoop-ha/mycluster/ActiveStandbyElectorLock) > * if HA is enabled after the cluster is made secure, ACLs are set on the root > znode as well -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11374) Skip FSync in Test util CreateEditsLog to speed up edit log generation
[ https://issues.apache.org/jira/browse/HDFS-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-11374: --- Fix Version/s: 3.0.0-alpha3 > Skip FSync in Test util CreateEditsLog to speed up edit log generation > -- > > Key: HDFS-11374 > URL: https://issues.apache.org/jira/browse/HDFS-11374 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Minor > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: HDFS-11374.000.patch > > > We can make generation of edit logs faster by skipping Fsync in > CreateEditLogs. This can be done be setting _shouldSkipFsyncForTesting_ to > true in _EditLogFileOutputStream_ -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11628) Clarify the behavior of HDFS Mover in documentation
[ https://issues.apache.org/jira/browse/HDFS-11628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-11628: --- Fix Version/s: (was: 3.0.0-alpha2) 3.0.0-alpha3 > Clarify the behavior of HDFS Mover in documentation > --- > > Key: HDFS-11628 > URL: https://issues.apache.org/jira/browse/HDFS-11628 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Labels: docs > Fix For: 2.9.0, 2.7.4, 3.0.0-alpha3, 2.8.1 > > Attachments: HDFS-11628.000.patch, HDFS-11628.001.patch > > > It's helpful to state that Mover always tries to move block replicas within > the same node whenever possible. If that is not possible (e.g. when a node > doesn’t have the target storage type) then it will copy the block replica to > another node over the network. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11421) Make WebHDFS' ACLs RegEx configurable
[ https://issues.apache.org/jira/browse/HDFS-11421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-11421: --- Resolution: Fixed Status: Resolved (was: Patch Available) Going to resolve this since it's been pending for a month, please reopen if you plan to get the branch-2 patch committed. > Make WebHDFS' ACLs RegEx configurable > - > > Key: HDFS-11421 > URL: https://issues.apache.org/jira/browse/HDFS-11421 > Project: Hadoop HDFS > Issue Type: Improvement > Components: webhdfs >Affects Versions: 2.6.0 >Reporter: Harsh J >Assignee: Harsh J > Fix For: 3.0.0-alpha3 > > Attachments: HDFS-11421.000.patch, HDFS-11421-branch-2.000.patch > > > Part of HDFS-5608 added support for GET/SET ACLs over WebHDFS. This currently > identifies the passed arguments via a hard-coded regex that mandates certain > group and user naming styles. > A similar limitation had existed before for CHOWN and other User/Group set > related operations of WebHDFS, where it was then made configurable via > HDFS-11391 + HDFS-4983. > Such configurability should be allowed for the ACL operations too. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11673) [READ] Handle failures of Datanodes with PROVIDED storage
[ https://issues.apache.org/jira/browse/HDFS-11673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11673: -- Description: Blocks on {{PROVIDED}} storage should become unavailable if and only if all Datanodes that are configured with {{PROVIDED}} storage become unavailable. Even if one Datanode with {{PROVIDED}} storage is available, all blocks on the {{PROVIDED}} storage should be accessible. (was: Blocks on {{PROVIDED}} storage should become unavailable ifand only if all Datanodes that are configured with {{PROVIDED}} storage become unavailable. Even if one Datanode with {{PROVIDED}} storage is available, all blocks on the {{PROVIDED}} storage should be accessible.) > [READ] Handle failures of Datanodes with PROVIDED storage > - > > Key: HDFS-11673 > URL: https://issues.apache.org/jira/browse/HDFS-11673 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11673-HDFS-9806.001.patch > > > Blocks on {{PROVIDED}} storage should become unavailable if and only if all > Datanodes that are configured with {{PROVIDED}} storage become unavailable. > Even if one Datanode with {{PROVIDED}} storage is available, all blocks on > the {{PROVIDED}} storage should be accessible. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11673) [READ] Handle failures of Datanodes with PROVIDED storage
[ https://issues.apache.org/jira/browse/HDFS-11673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11673: -- Description: Blocks on {{PROVIDED}} storage should become unavailable ifand only if all Datanodes that are configured with {{PROVIDED}} storage become unavailable. Even if one Datanode with {{PROVIDED}} storage is available, all blocks on the {{PROVIDED}} storage should be accessible. (was: Blocks on {{PROVIDED}} storage should become unavailable only if all Datanodes that are configured with {{PROVIDED}} storage become unavailable. Even if one Datanode with {{PROVIDED}} storage is available, all blocks on the {{PROVIDED}} storage should be accessible.) > [READ] Handle failures of Datanodes with PROVIDED storage > - > > Key: HDFS-11673 > URL: https://issues.apache.org/jira/browse/HDFS-11673 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11673-HDFS-9806.001.patch > > > Blocks on {{PROVIDED}} storage should become unavailable ifand only if all > Datanodes that are configured with {{PROVIDED}} storage become unavailable. > Even if one Datanode with {{PROVIDED}} storage is available, all blocks on > the {{PROVIDED}} storage should be accessible. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11673) [READ] Handle failures of Datanodes with PROVIDED storage
[ https://issues.apache.org/jira/browse/HDFS-11673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11673: -- Status: Patch Available (was: Open) > [READ] Handle failures of Datanodes with PROVIDED storage > - > > Key: HDFS-11673 > URL: https://issues.apache.org/jira/browse/HDFS-11673 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11673-HDFS-9806.001.patch > > > Blocks on {{PROVIDED}} storage should become unavailable only if all > Datanodes that are configured with {{PROVIDED}} storage become unavailable. > Even if one Datanode with {{PROVIDED}} storage is available, all blocks on > the {{PROVIDED}} storage should be accessible. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11387) Socket reuse address option is not honored in PrivilegedNfsGatewayStarter
[ https://issues.apache.org/jira/browse/HDFS-11387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-11387: --- Fix Version/s: 3.0.0-alpha3 > Socket reuse address option is not honored in PrivilegedNfsGatewayStarter > - > > Key: HDFS-11387 > URL: https://issues.apache.org/jira/browse/HDFS-11387 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: HDFS-11387.001.patch > > > The socket reuse address option is not honored in PrivilegedNfsGatewayStarter. > This happens because the registrationSocket is first bound and then the > option is set. > According to > https://docs.oracle.com/javase/7/docs/api/java/net/StandardSocketOptions.html#SO_REUSEADDR > Changing the value of this socket option after the socket is bound has no > effect. The default value of this socket option is system dependent. > We need to change this to first set the option on the socket and then bind it. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
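The set-option-before-bind ordering described above can be sketched with plain NIO (a minimal illustration of the ordering rule, not the actual PrivilegedNfsGatewayStarter or SimpleUdpServer code):

```java
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.channels.DatagramChannel;

// Sketch of the fix ordering: SO_REUSEADDR must be set BEFORE bind(),
// because per the JDK docs changing it after the socket is bound has no
// effect. Hypothetical class; not Hadoop's NFS gateway code.
public class UdpBindSketch {
    static DatagramChannel openReusable(int port) throws Exception {
        DatagramChannel ch = DatagramChannel.open();
        ch.setOption(StandardSocketOptions.SO_REUSEADDR, true);  // 1. set option
        ch.bind(new InetSocketAddress(port));                    // 2. then bind
        return ch;
    }

    public static void main(String[] args) throws Exception {
        // Port 0 asks the OS for an ephemeral port.
        DatagramChannel ch = openReusable(0);
        System.out.println(ch.getOption(StandardSocketOptions.SO_REUSEADDR));  // true
        ch.close();
    }
}
```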
[jira] [Updated] (HDFS-11673) [READ] Handle failures of Datanodes with PROVIDED storage
[ https://issues.apache.org/jira/browse/HDFS-11673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11673: -- Status: Open (was: Patch Available) > [READ] Handle failures of Datanodes with PROVIDED storage > - > > Key: HDFS-11673 > URL: https://issues.apache.org/jira/browse/HDFS-11673 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11673-HDFS-9806.001.patch > > > Blocks on {{PROVIDED}} storage should become unavailable only if all > Datanodes that are configured with {{PROVIDED}} storage become unavailable. > Even if one Datanode with {{PROVIDED}} storage is available, all blocks on > the {{PROVIDED}} storage should be accessible. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6648) Order of namenodes in ConfiguredFailoverProxyProvider is undefined
[ https://issues.apache.org/jira/browse/HDFS-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6648: -- Fix Version/s: 3.0.0-alpha3 > Order of namenodes in ConfiguredFailoverProxyProvider is undefined > -- > > Key: HDFS-6648 > URL: https://issues.apache.org/jira/browse/HDFS-6648 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, hdfs-client >Affects Versions: 2.7.0 >Reporter: Rafal Wojdyla >Assignee: Inigo Goiri > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: HDFS-6648-000.patch, HDFS-6648-001.patch > > > In org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider, > in the constructor, there's a map from nameservice to namenode rpc addresses > (DFSUtil.getHaNnRpcAddresses). It's a LinkedHashMap > of HashMaps, so the order is kept for _nameservices_. Then, to find the active > namenode, we get the HashMap of namenode-id to rpc-address > for the requested nameservice (taken from the request URI), and > for this HashMap we take its values - the order of this collection is not strictly > defined! In the code: > {code} > Collection<InetSocketAddress> addressesOfNns = addressesInNN.values(); > {code} > These values are then put (in undefined order) into an ArrayList of > proxies, and in getProxy we start from the first proxy in the list and > fail over to the next if needed. > It would make sense for ConfiguredFailoverProxyProvider to keep the order of > proxies/namenodes defined in hdfs-site.xml. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
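The difference the issue hinges on can be shown with plain collections (an illustration of the LinkedHashMap vs HashMap ordering contract, not the actual ConfiguredFailoverProxyProvider code; the keys and addresses are made up):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration of the fix direction for the issue above: a LinkedHashMap
// preserves the order in which namenode addresses were inserted (i.e. config
// order), whereas a plain HashMap's values() iteration order is unspecified.
public class ProxyOrderSketch {
    public static void main(String[] args) {
        Map<String, String> nnAddrs = new LinkedHashMap<>();
        nnAddrs.put("nn1", "host1:8020");  // listed first in hdfs-site.xml
        nnAddrs.put("nn2", "host2:8020");
        // values() iterates in insertion order for a LinkedHashMap, so a
        // proxy provider built from it would always try nn1 first.
        System.out.println(nnAddrs.values());  // [host1:8020, host2:8020]
    }
}
```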
[jira] [Updated] (HDFS-11673) [READ] Handle failures of Datanodes with PROVIDED storage
[ https://issues.apache.org/jira/browse/HDFS-11673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11673: -- Status: Patch Available (was: Open) > [READ] Handle failures of Datanodes with PROVIDED storage > - > > Key: HDFS-11673 > URL: https://issues.apache.org/jira/browse/HDFS-11673 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11673-HDFS-9806.001.patch > > > Blocks on {{PROVIDED}} storage should become unavailable only if all > Datanodes that are configured with {{PROVIDED}} storage become unavailable. > Even if one Datanode with {{PROVIDED}} storage is available, all blocks on > the {{PROVIDED}} storage should be accessible. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11673) [READ] Handle failures of Datanodes with PROVIDED storage
[ https://issues.apache.org/jira/browse/HDFS-11673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11673: -- Attachment: HDFS-11673-HDFS-9806.001.patch Attaching an initial version of the patch (includes changes that should go into HDFS-11681). > [READ] Handle failures of Datanodes with PROVIDED storage > - > > Key: HDFS-11673 > URL: https://issues.apache.org/jira/browse/HDFS-11673 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11673-HDFS-9806.001.patch > > > Blocks on {{PROVIDED}} storage should become unavailable only if all > Datanodes that are configured with {{PROVIDED}} storage become unavailable. > Even if one Datanode with {{PROVIDED}} storage is available, all blocks on > the {{PROVIDED}} storage should be accessible. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11689) New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive code
[ https://issues.apache.org/jira/browse/HDFS-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-11689: - Resolution: Fixed Fix Version/s: 2.8.1 3.0.0-alpha3 Status: Resolved (was: Patch Available) > New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive > code > --- > > Key: HDFS-11689 > URL: https://issues.apache.org/jira/browse/HDFS-11689 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0-alpha3, 2.8.1 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Fix For: 3.0.0-alpha3, 2.8.1 > > Attachments: HADOOP-14333.001.patch, HADOOP-14333.002.patch, > HADOOP-14333.003.patch, HDFS-11689.001.patch > > > Though Hive should be fixed not to access DFSClient which is private to > HADOOP, removing the throws added by HADOOP-14104 is a quicker solution to > unblock hive. > Hive code > {code} > private boolean isEncryptionEnabled(DFSClient client, Configuration conf) { > try { > DFSClient.class.getMethod("isHDFSEncryptionEnabled"); > } catch (NoSuchMethodException e) { > // the method is available since Hadoop-2.7.1 > // if we run with an older Hadoop, check this ourselves > return !conf.getTrimmed(DFSConfigKeys.DFS_ENCRYPTION_KEY_PROVIDER_URI, > "").isEmpty(); > } > return client.isHDFSEncryptionEnabled(); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11689) New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive code
[ https://issues.apache.org/jira/browse/HDFS-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979415#comment-15979415 ] Yongjun Zhang commented on HDFS-11689: -- Thanks a lot [~shahrs87], [~daryn], [~aw] and [~andrew.wang]! I committed to trunk, branch-2, branch-2.8, branch-2.8.1 > New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive > code > --- > > Key: HDFS-11689 > URL: https://issues.apache.org/jira/browse/HDFS-11689 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0-alpha3, 2.8.1 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HADOOP-14333.001.patch, HADOOP-14333.002.patch, > HADOOP-14333.003.patch, HDFS-11689.001.patch > > > Though Hive should be fixed not to access DFSClient which is private to > HADOOP, removing the throws added by HADOOP-14104 is a quicker solution to > unblock hive. > Hive code > {code} > private boolean isEncryptionEnabled(DFSClient client, Configuration conf) { > try { > DFSClient.class.getMethod("isHDFSEncryptionEnabled"); > } catch (NoSuchMethodException e) { > // the method is available since Hadoop-2.7.1 > // if we run with an older Hadoop, check this ourselves > return !conf.getTrimmed(DFSConfigKeys.DFS_ENCRYPTION_KEY_PROVIDER_URI, > "").isEmpty(); > } > return client.isHDFSEncryptionEnabled(); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-11689) New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive code
[ https://issues.apache.org/jira/browse/HDFS-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979415#comment-15979415 ] Yongjun Zhang edited comment on HDFS-11689 at 4/21/17 9:27 PM: --- Thanks a lot [~shahrs87], [~daryn], [~aw], [~steve_l] and [~andrew.wang]! I committed to trunk, branch-2, branch-2.8, branch-2.8.1 was (Author: yzhangal): Thanks a lot [~shahrs87], [~daryn], [~aw] and [~andrew.wang]! I committed to trunk, branch-2, branch-2.8, branch-2.8.1 > New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive > code > --- > > Key: HDFS-11689 > URL: https://issues.apache.org/jira/browse/HDFS-11689 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0-alpha3, 2.8.1 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HADOOP-14333.001.patch, HADOOP-14333.002.patch, > HADOOP-14333.003.patch, HDFS-11689.001.patch > > > Though Hive should be fixed not to access DFSClient which is private to > HADOOP, removing the throws added by HADOOP-14104 is a quicker solution to > unblock hive. > Hive code > {code} > private boolean isEncryptionEnabled(DFSClient client, Configuration conf) { > try { > DFSClient.class.getMethod("isHDFSEncryptionEnabled"); > } catch (NoSuchMethodException e) { > // the method is available since Hadoop-2.7.1 > // if we run with an older Hadoop, check this ourselves > return !conf.getTrimmed(DFSConfigKeys.DFS_ENCRYPTION_KEY_PROVIDER_URI, > "").isEmpty(); > } > return client.isHDFSEncryptionEnabled(); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11689) New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive code
[ https://issues.apache.org/jira/browse/HDFS-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979412#comment-15979412 ] Hudson commented on HDFS-11689: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11621 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11621/]) HDFS-11689. New exception thrown by DFSClient%isHDFSEncryptionEnabled (yzhang: rev 5078df7be317e635615c05c5da3285798993ff1f) * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java > New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive > code > --- > > Key: HDFS-11689 > URL: https://issues.apache.org/jira/browse/HDFS-11689 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0-alpha3, 2.8.1 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HADOOP-14333.001.patch, HADOOP-14333.002.patch, > HADOOP-14333.003.patch, HDFS-11689.001.patch > > > Though Hive should be fixed not to access DFSClient which is private to > HADOOP, removing the throws added by HADOOP-14104 is a quicker solution to > unblock hive. > Hive code > {code} > private boolean isEncryptionEnabled(DFSClient client, Configuration conf) { > try { > DFSClient.class.getMethod("isHDFSEncryptionEnabled"); > } catch (NoSuchMethodException e) { > // the method is available since Hadoop-2.7.1 > // if we run with an older Hadoop, check this ourselves > return !conf.getTrimmed(DFSConfigKeys.DFS_ENCRYPTION_KEY_PROVIDER_URI, > "").isEmpty(); > } > return client.isHDFSEncryptionEnabled(); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11653) [READ] ProvidedReplica should return an InputStream that is bounded by its length
[ https://issues.apache.org/jira/browse/HDFS-11653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979406#comment-15979406 ] Hadoop QA commented on HDFS-11653: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 93m 13s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 21s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 
44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 29s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}175m 4s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer | | | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover | | | hadoop.hdfs.TestErasureCodeBenchmarkThroughput | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ac17dc | | JIRA Issue | HDFS-11653 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12864541/HDFS-11653-HDFS-9806.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux a2cd2cd30e27 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-9806 / 79f2885 | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/19173/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/19173/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/19173/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > [READ] ProvidedReplica should return an InputStream that is bounded by its > length > - > > Key: HDFS-11653 > URL:
[jira] [Updated] (HDFS-10620) StringBuilder created and appended even if logging is disabled
[ https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-10620: --- Fix Version/s: (was: 3.0.0-alpha3) 3.0.0-alpha1 > StringBuilder created and appended even if logging is disabled > -- > > Key: HDFS-10620 > URL: https://issues.apache.org/jira/browse/HDFS-10620 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.4 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Fix For: 2.9.0, 3.0.0-alpha1 > > Attachments: HDFS-10620.001.patch, HDFS-10620.002.patch, > HDFS-10620-branch-2.01.patch > > > In BlockManager.addToInvalidates the StringBuilder is appended to during the > delete even if logging isn't active. > Could avoid allocating the StringBuilder as well, but not sure if it is > really worth it to add null handling in the code. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
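The fix described in HDFS-10620 (only building the log string when the target level is enabled) can be sketched as follows. This is an illustrative model, not the actual BlockManager code: a plain boolean stands in for a logger level check such as blockLog.isInfoEnabled(), and the method/field names are hypothetical.

```java
// Sketch of the HDFS-10620 pattern: only allocate and append to the
// StringBuilder when the target log level is actually enabled. The
// boolean flag stands in for a logger level check; names are
// illustrative, not BlockManager's real code.
public class GuardedLogging {

    // Returns the log message that would be emitted, or null when the
    // level is disabled and no StringBuilder was ever allocated.
    static String addToInvalidates(String[] blocks, boolean infoEnabled) {
        StringBuilder datanodes = infoEnabled ? new StringBuilder() : null;
        for (String block : blocks) {
            // ... the real invalidation bookkeeping would happen here ...
            if (datanodes != null) {
                datanodes.append(block).append(' ');
            }
        }
        return datanodes == null
                ? null
                : "BLOCK* addToInvalidates: " + datanodes.toString().trim();
    }
}
```

With logging disabled the method never allocates the builder, which is the allocation the JIRA describes as avoidable.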
[jira] [Commented] (HDFS-11450) HDFS specific network topology classes with storage type info included
[ https://issues.apache.org/jira/browse/HDFS-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979396#comment-15979396 ] Andrew Wang commented on HDFS-11450: Seems like this one also used HDFS-11419 rather than HDFS-11450 in the commit message. > HDFS specific network topology classes with storage type info included > -- > > Key: HDFS-11450 > URL: https://issues.apache.org/jira/browse/HDFS-11450 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Chen Liang >Assignee: Chen Liang > Fix For: 3.0.0-alpha3 > > Attachments: HDFS-11450.001.patch, HDFS-11450.002.patch, > HDFS-11450.003.patch, HDFS-11450.004.patch > > > This JIRA adds storage type info into network topology. > More specifically, this JIRA adds a storage type map by extending > {{InnerNodeImpl}} to describe the available storages under the current node's > subtree. This map is updated when a node is added/removed from the subtree. > With this info, when choosing a random node with storage type requirement, > the search could then decide to/not to go deeper into a subtree by examining > the available storage types first. > One to-do item still, is that, we might still need to separately handle the > cases where a Datanodes restarts, or a disk is hot-swapped, will file another > JIRA in that case. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11514) DFSTopologyNodeImpl#chooseRandom optimizations
[ https://issues.apache.org/jira/browse/HDFS-11514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979393#comment-15979393 ] Andrew Wang commented on HDFS-11514: Looks like the commit msg JIRA # was typo'd as HDFS-11419. > DFSTopologyNodeImpl#chooseRandom optimizations > -- > > Key: HDFS-11514 > URL: https://issues.apache.org/jira/browse/HDFS-11514 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Chen Liang >Assignee: Chen Liang > Fix For: 3.0.0-alpha3 > > Attachments: HDFS-11514.001.patch, HDFS-11514.002.patch, > HDFS-11514.003.patch > > > Based on the offline discussion, one potential improvement to the > {{chooseRandomWithStorageType}} added in HDFS-11482 is that, currently, given > a node, the method iterates over all its children to sum up the number of > candidate datanodes. Since datanode status changes are much less frequent than > block placement requests, it is more efficient to get rid of this iteration > by maintaining another disk type counter map. This JIRA > tracks (but is not limited to) this optimization. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
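The counter-map idea in the HDFS-11514 description can be sketched as follows. This is a minimal model under stated assumptions: the StorageType values, class name, and methods below are illustrative stand-ins, not the actual DFSTopologyNodeImpl API.

```java
import java.util.EnumMap;
import java.util.Map;

// Sketch of the optimization discussed for DFSTopologyNodeImpl#chooseRandom:
// keep a per-subtree counter of available storage types, updated when nodes
// are added/removed, so checking whether a subtree has candidates for a
// storage type is O(1) instead of iterating all children on every request.
public class StorageTypeCounter {
    enum StorageType { DISK, SSD, ARCHIVE, RAM_DISK }

    private final Map<StorageType, Integer> counts =
            new EnumMap<>(StorageType.class);

    // Called on the (infrequent) datanode status changes.
    void nodeAdded(StorageType type) {
        counts.merge(type, 1, Integer::sum);
    }

    void nodeRemoved(StorageType type) {
        counts.merge(type, -1, Integer::sum);
        if (counts.getOrDefault(type, 0) <= 0) {
            counts.remove(type);
        }
    }

    // Called on the (frequent) block placement requests: decide whether a
    // subtree is worth descending into, without touching its children.
    boolean hasCandidates(StorageType required) {
        return counts.getOrDefault(required, 0) > 0;
    }
}
```

The trade-off matches the comment's reasoning: bookkeeping cost moves to the rare status-change path, and the hot placement path becomes a map lookup.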
[jira] [Updated] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HDFS-11402: -- Attachment: HDFS-11402.07.patch Thanks for the review comments [~andrew.wang]. Attached v07 patch to address the following comments. Please take a look. bq. Would also be interested to know how you chose 512 inodes/thread as it seems like a magic number, was this based on benchmarking? This approach addresses concerns about creating an unnecessary thread pool for a small number of open files. Yes, I tested with 1K, 5K, 10K, 15K, and 50K open files and chose this number at the point where the thread pool started to show some improvement. bq. the byte[][] is already available via getINodesAndPaths() along with INode[]. Wouldn't it be a lot slower to have one more helper method to construct the same byte[][]? I was concerned about having to do another while loop when the former approach coalesced both into one. Anyway, the added complexity isn't much. Changed the code as suggested. bq. isDescendant, if we're an IIP, can we simply look in our array of INodes for the specified ancestor? This method looks expensive right now. Done. bq. hasReadLock checks hasWriteLock, so the additional hasWriteLock check is unnecessary: Done. bq. We could use a stride increment to simplify the work partitioning logic (and make work distribution more even): The suggested approach doesn't collect the partition's inodes at once for the worker and needs multiple iterations. If something can be done better here, will take it up in another jira. bq. We aren't using workerIdx when getting the future results, so can simplify with foreach: Done. bq. Mind filing a JIRA to migrate LeaseManager over to SLF4J as well? We're adding a pretty gnarly new log that would be more readable with SLF4J. Sure, will update after filing the jira.
> HDFS Snapshots should capture point-in-time copies of OPEN files > > > Key: HDFS-11402 > URL: https://issues.apache.org/jira/browse/HDFS-11402 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 2.6.0 >Reporter: Manoj Govindassamy >Assignee: Manoj Govindassamy > Attachments: HDFS-11402.01.patch, HDFS-11402.02.patch, > HDFS-11402.03.patch, HDFS-11402.04.patch, HDFS-11402.05.patch, > HDFS-11402.06.patch, HDFS-11402.07.patch > > > *Problem:* > 1. When there are files being written and when HDFS Snapshots are taken in > parallel, Snapshots do capture all these files, but these being written files > in Snapshots do not have the point-in-time file length captured. That is, > these open files are not frozen in HDFS Snapshots. These open files > grow/shrink in length, just like the original file, even after the snapshot > time. > 2. At the time of File close or any other meta data modification operation on > these files, HDFS reconciles the file length and records the modification in > the last taken Snapshot. All the previously taken Snapshots continue to have > those open Files with no modification recorded. So, all those previous > snapshots end up using the final modification record in the last snapshot. > Thus after the file close, file lengths in all those snapshots will end up > same. > Assume File1 is opened for write and a total of 1MB written to it. While the > writes are happening, snapshots are taken in parallel. > {noformat} > |---Time---T1---T2-T3T4--> > |---Snap1--Snap2-Snap3---> > |---File1.open---write-write---close-> > {noformat} > Then at time, > T2: > Snap1.File1.length = 0 > T3: > Snap1.File1.length = 0 > Snap2.File1.length = 0 > > T4: > Snap1.File1.length = 1MB > Snap2.File1.length = 1MB > Snap3.File1.length = 1MB > *Proposal* > 1. At the time of taking Snapshot, {{SnapshotManager#createSnapshot}} can > optionally request {{DirectorySnapshottableFeature#addSnapshot}} to freeze > open files. > 2. 
{{DirectorySnapshottableFeature#addSnapshot}} can consult with > {{LeaseManager}} and get a list of INodesInPath for all open files under the > snapshot dir. > 3. {{DirectorySnapshottableFeature#addSnapshot}}, after the Snapshot creation, > Diff creation and updating modification time, can invoke > {{INodeFile#recordModification}} for each of the open files. This way, the > Snapshot just taken will have a {{FileDiff}} with {{fileSize}} captured for > each of the open files. > 4. The above model follows the current Snapshot and Diff protocols and doesn't > introduce any new disk formats. So, I don't think we will be needing any new > FSImage Loader/Saver changes for Snapshots. > 5. One of the design goals of HDFS Snapshot was the ability to take any number of > snapshots in O(1) time. LeaseManager though
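The freezing behavior proposed in steps 1 to 3 above can be modeled with a toy sketch: at snapshot time, record each open file's current length as a per-snapshot diff, so later reads of that snapshot see the point-in-time length. All names below (OpenFile, takeSnapshot, lengthInSnapshot) are hypothetical stand-ins for INodeFile, FileDiff, and DirectorySnapshottableFeature#addSnapshot, not actual HDFS types.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the HDFS-11402 proposal: freeze the length of open files
// at snapshot time by recording a per-snapshot diff. Illustrative only.
public class SnapshotFreeze {
    static class OpenFile {
        long length;  // grows as the writer keeps appending
        // snapshot name -> length frozen at snapshot time (like a FileDiff)
        final Map<String, Long> diffs = new HashMap<>();
    }

    // Analogue of addSnapshot consulting the LeaseManager for the open
    // files under the snapshot dir and recording a modification for each.
    static void takeSnapshot(String name, List<OpenFile> openFiles) {
        for (OpenFile f : openFiles) {
            f.diffs.put(name, f.length);  // like INodeFile#recordModification
        }
    }

    // A snapshot read uses the frozen length if one was recorded,
    // otherwise falls back to the live length (the pre-fix behavior).
    static long lengthInSnapshot(OpenFile f, String snapshot) {
        return f.diffs.getOrDefault(snapshot, f.length);
    }
}
```

Replaying the JIRA's timeline with this model, Snap1 keeps length 0 even after 1MB is written, instead of all snapshots converging on the final length.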
[jira] [Commented] (HDFS-11691) Add a proper scheme to the datanode links in NN web UI
[ https://issues.apache.org/jira/browse/HDFS-11691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979379#comment-15979379 ] Hadoop QA commented on HDFS-11691: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 27s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 0m 58s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ac17dc | | JIRA Issue | HDFS-11691 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12864571/HDFS-11691.patch | | Optional Tests | asflicense | | uname | Linux a1fa35c3cf7b 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 5078df7 | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/19174/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. 
> Add a proper scheme to the datanode links in NN web UI > -- > > Key: HDFS-11691 > URL: https://issues.apache.org/jira/browse/HDFS-11691 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-11691.patch > > > On the datanodes page of the namenode web UI, the datanode links may not be > correct if the namenode is serving the page through http but https is also > enabled. This is because {{dfshealth.js}} does not put a proper scheme in > front of the address. It already determines whether the address is > non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what > it is currently setting. > The existing mechanism would work for YARN and MAPRED, since they can only > serve one protocol, HTTP or HTTPS. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-11691) Add a proper scheme to the datanode links in NN web UI
[ https://issues.apache.org/jira/browse/HDFS-11691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee reassigned HDFS-11691: - Assignee: Kihwal Lee > Add a proper scheme to the datanode links in NN web UI > -- > > Key: HDFS-11691 > URL: https://issues.apache.org/jira/browse/HDFS-11691 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-11691.patch > > > On the datanodes page of the namenode web UI, the datanode links may not be > correct if the namenode is serving the page through http but https is also > enabled. This is because {{dfshealth.js}} does not put a proper scheme in > front of the address. It already determines whether the address is > non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what > it is currently setting. > The existing mechanism would work for YARN and MAPRED, since they can only > serve one protocol, HTTP or HTTPS. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11691) Add a proper scheme to the datanode links in NN web UI
[ https://issues.apache.org/jira/browse/HDFS-11691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11691: -- Status: Patch Available (was: Open) > Add a proper scheme to the datanode links in NN web UI > -- > > Key: HDFS-11691 > URL: https://issues.apache.org/jira/browse/HDFS-11691 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-11691.patch > > > On the datanodes page of the namenode web UI, the datanode links may not be > correct if the namenode is serving the page through http but https is also > enabled. This is because {{dfshealth.js}} does not put a proper scheme in > front of the address. It already determines whether the address is > non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what > it is currently setting. > The existing mechanism would work for YARN and MAPRED, since they can only > serve one protocol, HTTP or HTTPS. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11691) Add a proper scheme to the datanode links in NN web UI
[ https://issues.apache.org/jira/browse/HDFS-11691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11691: -- Attachment: HDFS-11691.patch > Add a proper scheme to the datanode links in NN web UI > -- > > Key: HDFS-11691 > URL: https://issues.apache.org/jira/browse/HDFS-11691 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-11691.patch > > > On the datanodes page of the namenode web UI, the datanode links may not be > correct if the namenode is serving the page through http but https is also > enabled. This is because {{dfshealth.js}} does not put a proper scheme in > front of the address. It already determines whether the address is > non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what > it is currently setting. > The existing mechanism would work for YARN and MAPRED, since they can only > serve one protocol, HTTP or HTTPS. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11663) [READ] Fix NullPointerException in ProvidedBlocksBuilder
[ https://issues.apache.org/jira/browse/HDFS-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979361#comment-15979361 ] Hadoop QA commented on HDFS-11663: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 30s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 38s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 46s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 56s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 21s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 42s{color} | {color:green} HDFS-9806 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 36s{color} | {color:red} hadoop-tools/hadoop-fs2img in HDFS-9806 has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s{color} | {color:green} HDFS-9806 passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 37s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 4s{color} | {color:orange} root: The patch generated 12 new + 20 unchanged - 2 fixed = 32 total (was 22) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 26s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 40s{color} | {color:red} hadoop-fs2img in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 44s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}153m 0s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | | | hadoop.hdfs.server.namenode.TestNameNodeProvidedImplementation | | Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ac17dc | | JIRA Issue | HDFS-11663 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863692/HDFS-11663-HDFS-9806.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 7fdab0031839 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-9806 / 79f2885 | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | findbugs |
[jira] [Commented] (HDFS-11643) Balancer fencing fails when writing erasure coded lock file
[ https://issues.apache.org/jira/browse/HDFS-11643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979355#comment-15979355 ] Andrew Wang commented on HDFS-11643: Not sure if this is related, but running TestBalancer also makes this file in my source tree, rather than in a target folder: hadoop-hdfs-project/hadoop-hdfs/include-hosts-file > Balancer fencing fails when writing erasure coded lock file > --- > > Key: HDFS-11643 > URL: https://issues.apache.org/jira/browse/HDFS-11643 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover, erasure-coding >Affects Versions: 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: SammiChen >Priority: Blocker > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-11643.001.patch, HDFS-11643.002.patch, > HDFS-11643.003.patch, HDFS-11643.004.patch, HDFS-11643.005.patch > > > At startup, the balancer writes its hostname to the lock file and calls > hflush(). hflush is not supported for EC files, so this fails when the entire > filesystem is erasure coded. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11689) New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive code
[ https://issues.apache.org/jira/browse/HDFS-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979353#comment-15979353 ] Rushabh S Shah commented on HDFS-11689: --- +1 for the latest patch. [~yzhangal]: thanks ! > New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive > code > --- > > Key: HDFS-11689 > URL: https://issues.apache.org/jira/browse/HDFS-11689 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0-alpha3, 2.8.1 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HADOOP-14333.001.patch, HADOOP-14333.002.patch, > HADOOP-14333.003.patch, HDFS-11689.001.patch > > > Though Hive should be fixed not to access DFSClient which is private to > HADOOP, removing the throws added by HADOOP-14104 is a quicker solution to > unblock hive. > Hive code > {code} > private boolean isEncryptionEnabled(DFSClient client, Configuration conf) { > try { > DFSClient.class.getMethod("isHDFSEncryptionEnabled"); > } catch (NoSuchMethodException e) { > // the method is available since Hadoop-2.7.1 > // if we run with an older Hadoop, check this ourselves > return !conf.getTrimmed(DFSConfigKeys.DFS_ENCRYPTION_KEY_PROVIDER_URI, > "").isEmpty(); > } > return client.isHDFSEncryptionEnabled(); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-11689) New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive code
[ https://issues.apache.org/jira/browse/HDFS-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979353#comment-15979353 ] Rushabh S Shah edited comment on HDFS-11689 at 4/21/17 8:49 PM: +1 (non-binding) for the latest patch. [~yzhangal]: thanks ! was (Author: shahrs87): +1 for the latest patch. [~yzhangal]: thanks ! > New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive > code > --- > > Key: HDFS-11689 > URL: https://issues.apache.org/jira/browse/HDFS-11689 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0-alpha3, 2.8.1 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HADOOP-14333.001.patch, HADOOP-14333.002.patch, > HADOOP-14333.003.patch, HDFS-11689.001.patch > > > Though Hive should be fixed not to access DFSClient which is private to > HADOOP, removing the throws added by HADOOP-14104 is a quicker solution to > unblock hive. > Hive code > {code} > private boolean isEncryptionEnabled(DFSClient client, Configuration conf) { > try { > DFSClient.class.getMethod("isHDFSEncryptionEnabled"); > } catch (NoSuchMethodException e) { > // the method is available since Hadoop-2.7.1 > // if we run with an older Hadoop, check this ourselves > return !conf.getTrimmed(DFSConfigKeys.DFS_ENCRYPTION_KEY_PROVIDER_URI, > "").isEmpty(); > } > return client.isHDFSEncryptionEnabled(); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11691) Add a proper scheme to the datanode links in NN web UI
[ https://issues.apache.org/jira/browse/HDFS-11691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11691: -- Description: On the datanodes page of the namenode web UI, the datanode links may not be correct if the namenode is serving the page through http but https is also enabled. This is because {{dfshealth.js}} does not put a proper scheme in front of the address. It already determines whether the address is non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what it is currently setting. The existing mechanism would work for YARN and MAPRED, since they can only serve one protocol, HTTP or HTTPS. was: On the datanodes page of the namenode web UI, the datanode links may not be correct if the namenode is serving the page through http but https is also enabled. This is because {{dfshealth.html}} does not put a proper scheme in front of the address. It already determines whether the address is non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what it is currently setting. The existing mechanism would work for YARN and MAPRED, since they can only serve one protocol, HTTP or HTTPS. > Add a proper scheme to the datanode links in NN web UI > -- > > Key: HDFS-11691 > URL: https://issues.apache.org/jira/browse/HDFS-11691 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee > > On the datanodes page of the namenode web UI, the datanode links may not be > correct if the namenode is serving the page through http but https is also > enabled. This is because {{dfshealth.js}} does not put a proper scheme in > front of the address. It already determines whether the address is > non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what > it is currently setting. > The existing mechanism would work for YARN and MAPRED, since they can only > serve one protocol, HTTP or HTTPS. 
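The fix described above is a small string operation: prepend an explicit scheme to the scheme-relative datanode address, based on the secure/non-secure determination the page already makes. A rough sketch of that logic (in Java for consistency with the other snippets here; the actual change would live in dfshealth.js, and the names below are illustrative):

```java
// Illustrative sketch of the scheme-prepending logic described in
// HDFS-11691. Method and parameter names are hypothetical, not the
// actual dfshealth.js code.
public class DatanodeLink {
    /** Prepend an explicit scheme to a scheme-relative address so the link
     *  is correct regardless of which protocol served the page. */
    static String withScheme(String schemeRelativeAddr, boolean isSecure) {
        return (isSecure ? "https:" : "http:") + schemeRelativeAddr;
    }

    public static void main(String[] args) {
        System.out.println(withScheme("//dn1.example.com:9864", false));
        System.out.println(withScheme("//dn1.example.com:9865", true));
    }
}
```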
[jira] [Updated] (HDFS-11689) New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive code
[ https://issues.apache.org/jira/browse/HDFS-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-11689: --- Summary: New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive code (was: New exception thrown by DFSClient%isHDFSEncryptionEnabled broke hacky hive code) > New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive > code > --- > > Key: HDFS-11689 > URL: https://issues.apache.org/jira/browse/HDFS-11689 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0-alpha3, 2.8.1 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HADOOP-14333.001.patch, HADOOP-14333.002.patch, > HADOOP-14333.003.patch, HDFS-11689.001.patch > > > Though Hive should be fixed not to access DFSClient which is private to > HADOOP, removing the throws added by HADOOP-14104 is a quicker solution to > unblock hive. > Hive code > {code} > private boolean isEncryptionEnabled(DFSClient client, Configuration conf) { > try { > DFSClient.class.getMethod("isHDFSEncryptionEnabled"); > } catch (NoSuchMethodException e) { > // the method is available since Hadoop-2.7.1 > // if we run with an older Hadoop, check this ourselves > return !conf.getTrimmed(DFSConfigKeys.DFS_ENCRYPTION_KEY_PROVIDER_URI, > "").isEmpty(); > } > return client.isHDFSEncryptionEnabled(); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11689) New exception thrown by DFSClient%isHDFSEncryptionEnabled broke hacky hive code
[ https://issues.apache.org/jira/browse/HDFS-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-11689: - Summary: New exception thrown by DFSClient%isHDFSEncryptionEnabled broke hacky hive code (was: New exception thrown by (private) DFSClient API isHDFSEncryptionEnabled broke hacky hive code ) > New exception thrown by DFSClient%isHDFSEncryptionEnabled broke hacky hive > code > --- > > Key: HDFS-11689 > URL: https://issues.apache.org/jira/browse/HDFS-11689 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0-alpha3, 2.8.1 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HADOOP-14333.001.patch, HADOOP-14333.002.patch, > HADOOP-14333.003.patch, HDFS-11689.001.patch > > > Though Hive should be fixed not to access DFSClient which is private to > HADOOP, removing the throws added by HADOOP-14104 is a quicker solution to > unblock hive. > Hive code > {code} > private boolean isEncryptionEnabled(DFSClient client, Configuration conf) { > try { > DFSClient.class.getMethod("isHDFSEncryptionEnabled"); > } catch (NoSuchMethodException e) { > // the method is available since Hadoop-2.7.1 > // if we run with an older Hadoop, check this ourselves > return !conf.getTrimmed(DFSConfigKeys.DFS_ENCRYPTION_KEY_PROVIDER_URI, > "").isEmpty(); > } > return client.isHDFSEncryptionEnabled(); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11643) Balancer fencing fails when writing erasure coded lock file
[ https://issues.apache.org/jira/browse/HDFS-11643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979337#comment-15979337 ] Andrew Wang commented on HDFS-11643: I think the TestBalancer issues are related, could you take a look? > Balancer fencing fails when writing erasure coded lock file > --- > > Key: HDFS-11643 > URL: https://issues.apache.org/jira/browse/HDFS-11643 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover, erasure-coding >Affects Versions: 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: SammiChen >Priority: Blocker > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-11643.001.patch, HDFS-11643.002.patch, > HDFS-11643.003.patch, HDFS-11643.004.patch, HDFS-11643.005.patch > > > At startup, the balancer writes its hostname to the lock file and calls > hflush(). hflush is not supported for EC files, so this fails when the entire > filesystem is erasure coded. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11691) Add a proper scheme to the datanode links in NN web UI
[ https://issues.apache.org/jira/browse/HDFS-11691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11691: -- Description: On the datanodes page of the namenode web UI, the datanode links may not be correct if the namenode is serving the page through http but https is also enabled. This is because {{dfshealth.html}} does not put a proper scheme in front of the address. It already determines whether the address is non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what it is currently setting. The existing mechanism would work for YARN and MAPRED, since they can only serve one protocol, HTTP or HTTPS. was: On the datanodes page of the namenode web UI, the datanode links may not be correct if the namenode is serving the page through http but https is also enabled. This is because {{dfshealth.js}} does not put a proper scheme in front of the address. It already determines whether the address is non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what it is currently setting. The existing mechanism would work for YARN and MAPRED, since they can only serve one protocol, HTTP or HTTPS. > Add a proper scheme to the datanode links in NN web UI > -- > > Key: HDFS-11691 > URL: https://issues.apache.org/jira/browse/HDFS-11691 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee > > On the datanodes page of the namenode web UI, the datanode links may not be > correct if the namenode is serving the page through http but https is also > enabled. This is because {{dfshealth.html}} does not put a proper scheme in > front of the address. It already determines whether the address is > non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what > it is currently setting. > The existing mechanism would work for YARN and MAPRED, since they can only > serve one protocol, HTTP or HTTPS. 
[jira] [Updated] (HDFS-11692) Ozone: KSM CLI: Implement KSM Key CLI
[ https://issues.apache.org/jira/browse/HDFS-11692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-11692: -- Labels: command-line (was: ) > Ozone: KSM CLI: Implement KSM Key CLI > - > > Key: HDFS-11692 > URL: https://issues.apache.org/jira/browse/HDFS-11692 > Project: Hadoop HDFS > Issue Type: Bug > Components: ozone >Reporter: Xiaoyu Yao >Assignee: Xiaoyu Yao > Labels: command-line > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-11692) Ozone: KSM CLI: Implement KSM Key CLI
Xiaoyu Yao created HDFS-11692: - Summary: Ozone: KSM CLI: Implement KSM Key CLI Key: HDFS-11692 URL: https://issues.apache.org/jira/browse/HDFS-11692 Project: Hadoop HDFS Issue Type: Bug Components: ozone Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-11532) Ozone: SCM: Support listContainers API
[ https://issues.apache.org/jira/browse/HDFS-11532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao resolved HDFS-11532. --- Resolution: Duplicate > Ozone: SCM: Support listContainers API > -- > > Key: HDFS-11532 > URL: https://issues.apache.org/jira/browse/HDFS-11532 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7240 >Reporter: Xiaoyu Yao >Assignee: Xiaoyu Yao > > This should allow paging so that we don't return too much in a single RPC > call. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-11691) Add a proper scheme to the datanode links in NN web UI
Kihwal Lee created HDFS-11691: - Summary: Add a proper scheme to the datanode links in NN web UI Key: HDFS-11691 URL: https://issues.apache.org/jira/browse/HDFS-11691 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee On the datanodes page of the namenode web UI, the datanode links may not be correct if the namenode is serving the page through http but https is also enabled. This is because {{dfshealth.js}} does not put a proper scheme in front of the address. It already determines whether the address is non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what it is currently setting. The existing mechanism would work for YARN and MAPRED, since they can only serve one protocol, HTTP or HTTPS. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11689) New exception thrown by (private) DFSClient API isHDFSEncryptionEnabled broke hacky hive code
[ https://issues.apache.org/jira/browse/HDFS-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979247#comment-15979247 ] Hadoop QA commented on HDFS-11689: -- *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 28s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 61m 1s | trunk passed |
| +1 | compile | 0m 36s | trunk passed |
| +1 | checkstyle | 0m 17s | trunk passed |
| +1 | mvnsite | 0m 39s | trunk passed |
| +1 | mvneclipse | 0m 16s | trunk passed |
| -1 | findbugs | 1m 33s | hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings. |
| +1 | javadoc | 0m 28s | trunk passed |
| +1 | mvninstall | 0m 48s | the patch passed |
| +1 | compile | 0m 44s | the patch passed |
| -1 | javac | 0m 44s | hadoop-hdfs-project_hadoop-hdfs-client generated 1 new + 14 unchanged - 0 fixed = 15 total (was 14) |
| +1 | checkstyle | 0m 20s | the patch passed |
| +1 | mvnsite | 0m 48s | the patch passed |
| +1 | mvneclipse | 0m 16s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 2m 4s | the patch passed |
| +1 | javadoc | 0m 28s | the patch passed |
| +1 | unit | 1m 45s | hadoop-hdfs-client in the patch passed. |
| +1 | asflicense | 0m 24s | The patch does not generate ASF License warnings. |
| | | 74m 46s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ac17dc |
| JIRA Issue | HDFS-11689 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12864536/HDFS-11689.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 2764aff33d21 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b080338 |
| Default Java | 1.8.0_121 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/19171/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html |
| javac | https://builds.apache.org/job/PreCommit-HDFS-Build/19171/artifact/patchprocess/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-client.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/19171/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-client U: hadoop-hdfs-project/hadoop-hdfs-client |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/19171/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated. > New exception thrown by (private)
[jira] [Created] (HDFS-11690) Ozone: SCM: BlockAPI support close container
Xiaoyu Yao created HDFS-11690: - Summary: Ozone: SCM: BlockAPI support close container Key: HDFS-11690 URL: https://issues.apache.org/jira/browse/HDFS-11690 Project: Hadoop HDFS Issue Type: Bug Affects Versions: HDFS-7240 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao HDFS-11504 introduced the block API for SCM. As a follow-up, the open containers provisioned for blocks should be closed by the BlockManager in an async thread when the container is full or close to full.
[jira] [Updated] (HDFS-11663) [READ] Fix NullPointerException in ProvidedBlocksBuilder
[ https://issues.apache.org/jira/browse/HDFS-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11663: -- Status: Patch Available (was: Open) > [READ] Fix NullPointerException in ProvidedBlocksBuilder > > > Key: HDFS-11663 > URL: https://issues.apache.org/jira/browse/HDFS-11663 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11663-HDFS-9806.001.patch > > > When there are no Datanodes with PROVIDED storage, > {{ProvidedBlocksBuilder#build}} leads to a {{NullPointerException}}. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11653) [READ] ProvidedReplica should return an InputStream that is bounded by its length
[ https://issues.apache.org/jira/browse/HDFS-11653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11653: -- Status: Patch Available (was: Open) > [READ] ProvidedReplica should return an InputStream that is bounded by its > length > - > > Key: HDFS-11653 > URL: https://issues.apache.org/jira/browse/HDFS-11653 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-11653-HDFS-9806.001.patch, > HDFS-11653-HDFS-9806.002.patch > > > {{ProvidedReplica#getDataInputStream}} should return an InputStream that is > bounded by {{ProvidedReplica#getBlockDataLength()}} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11653) [READ] ProvidedReplica should return an InputStream that is bounded by its length
[ https://issues.apache.org/jira/browse/HDFS-11653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11653: -- Status: Open (was: Patch Available) > [READ] ProvidedReplica should return an InputStream that is bounded by its > length > - > > Key: HDFS-11653 > URL: https://issues.apache.org/jira/browse/HDFS-11653 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-11653-HDFS-9806.001.patch, > HDFS-11653-HDFS-9806.002.patch > > > {{ProvidedReplica#getDataInputStream}} should return an InputStream that is > bounded by {{ProvidedReplica#getBlockDataLength()}} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11653) [READ] ProvidedReplica should return an InputStream that is bounded by its length
[ https://issues.apache.org/jira/browse/HDFS-11653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11653: -- Attachment: HDFS-11653-HDFS-9806.002.patch Fixing earlier checkstyle issues in the new patch. > [READ] ProvidedReplica should return an InputStream that is bounded by its > length > - > > Key: HDFS-11653 > URL: https://issues.apache.org/jira/browse/HDFS-11653 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-11653-HDFS-9806.001.patch, > HDFS-11653-HDFS-9806.002.patch > > > {{ProvidedReplica#getDataInputStream}} should return an InputStream that is > bounded by {{ProvidedReplica#getBlockDataLength()}} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-11190) [READ] Namenode support for data stored in external stores.
[ https://issues.apache.org/jira/browse/HDFS-11190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti resolved HDFS-11190. --- Resolution: Resolved > [READ] Namenode support for data stored in external stores. > --- > > Key: HDFS-11190 > URL: https://issues.apache.org/jira/browse/HDFS-11190 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11190-HDFS-9806.001.patch, > HDFS-11190-HDFS-9806.002.patch, HDFS-11190-HDFS-9806.003.patch, > HDFS-11190-HDFS-9806.004.patch > > > The goal of this JIRA is to enable the Namenode to know about blocks that are > in {{PROVIDED}} stores and are not necessarily stored on any Datanodes. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11190) [READ] Namenode support for data stored in external stores.
[ https://issues.apache.org/jira/browse/HDFS-11190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979164#comment-15979164 ] Virajith Jalaparti commented on HDFS-11190: --- Agreed. Committed v004 to the HDFS-9806 branch. > [READ] Namenode support for data stored in external stores. > --- > > Key: HDFS-11190 > URL: https://issues.apache.org/jira/browse/HDFS-11190 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11190-HDFS-9806.001.patch, > HDFS-11190-HDFS-9806.002.patch, HDFS-11190-HDFS-9806.003.patch, > HDFS-11190-HDFS-9806.004.patch > > > The goal of this JIRA is to enable the Namenode to know about blocks that are > in {{PROVIDED}} stores and are not necessarily stored on any Datanodes. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11687) Add new public encryption APIs required by Hive
[ https://issues.apache.org/jira/browse/HDFS-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979150#comment-15979150 ] Andrew Wang commented on HDFS-11687: I'd like to get Hive off of DistributedFileSystem and DFSClient in Hadoop23Shims if possible. DFS is also a private API, and there's a lot going on here that makes me uncomfortable as an HDFS developer. This is additional work beyond just the immediate scope of this JIRA. As a start, I think we should expose {{getKeyProvider}} in HDFSAdmin. {{isHdfsEncryptionEnabled}} is just a check for if {{getKeyProvider}} is null, so we don't need to expose that too. As part of this, we should also see if the ERROR logging mentioned in HIVE-16047 and HDFS-7931 needs to be quashed some more. > Add new public encryption APIs required by Hive > --- > > Key: HDFS-11687 > URL: https://issues.apache.org/jira/browse/HDFS-11687 > Project: Hadoop HDFS > Issue Type: Improvement > Components: encryption >Affects Versions: 2.6.5 >Reporter: Andrew Wang > > As discovered on HADOOP-14333, Hive is using reflection to get a DFSClient > for its encryption shim. We should provide proper public APIs for getting > this information. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
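The reduction Andrew describes — {{isHdfsEncryptionEnabled}} amounting to a null check on {{getKeyProvider}} — can be sketched as follows. The types and method names here are illustrative stand-ins, not the actual HdfsAdmin API:

```java
// Sketch of the relationship described in the comment above: encryption is
// "enabled" iff a key provider is configured. KeyProvider here is a
// hypothetical stand-in for org.apache.hadoop.crypto.key.KeyProvider, so the
// example is self-contained.
public class EncryptionCheck {
    interface KeyProvider {}  // stand-in, not the real Hadoop type

    private final KeyProvider provider;  // null when no provider is configured

    EncryptionCheck(KeyProvider provider) {
        this.provider = provider;
    }

    KeyProvider getKeyProvider() {
        return provider;
    }

    /** Redundant once getKeyProvider is exposed: it is just a null check. */
    boolean isEncryptionEnabled() {
        return getKeyProvider() != null;
    }

    public static void main(String[] args) {
        System.out.println(new EncryptionCheck(null).isEncryptionEnabled());                  // false
        System.out.println(new EncryptionCheck(new KeyProvider() {}).isEncryptionEnabled());  // true
    }
}
```

This is why exposing only {{getKeyProvider}} suffices for the Hive use case: the caller can derive the boolean itself.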
[jira] [Updated] (HDFS-11190) [READ] Namenode support for data stored in external stores.
[ https://issues.apache.org/jira/browse/HDFS-11190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-11190: -- Status: Open (was: Patch Available) > [READ] Namenode support for data stored in external stores. > --- > > Key: HDFS-11190 > URL: https://issues.apache.org/jira/browse/HDFS-11190 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11190-HDFS-9806.001.patch, > HDFS-11190-HDFS-9806.002.patch, HDFS-11190-HDFS-9806.003.patch, > HDFS-11190-HDFS-9806.004.patch > > > The goal of this JIRA is to enable the Namenode to know about blocks that are > in {{PROVIDED}} stores and are not necessarily stored on any Datanodes. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11689) New exception thrown by (private) DFSClient API isHDFSEncryptionEnabled broke hacky hive code
[ https://issues.apache.org/jira/browse/HDFS-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979123#comment-15979123 ] Yongjun Zhang commented on HDFS-11689: -- Andrew explained that @InterfaceAudience.Private is not a Java built-in, so we don't get a compiler warning. Uploaded new patch HDFS-11689.001.patch with @Deprecated added. Will commit once the Jenkins test is finished. Thanks. > New exception thrown by (private) DFSClient API isHDFSEncryptionEnabled broke > hacky hive code > -- > > Key: HDFS-11689 > URL: https://issues.apache.org/jira/browse/HDFS-11689 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0-alpha3, 2.8.1 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HADOOP-14333.001.patch, HADOOP-14333.002.patch, > HADOOP-14333.003.patch, HDFS-11689.001.patch > > > Though Hive should be fixed not to access DFSClient which is private to > HADOOP, removing the throws added by HADOOP-14104 is a quicker solution to > unblock hive. > Hive code > {code} > private boolean isEncryptionEnabled(DFSClient client, Configuration conf) { > try { > DFSClient.class.getMethod("isHDFSEncryptionEnabled"); > } catch (NoSuchMethodException e) { > // the method is available since Hadoop-2.7.1 > // if we run with an older Hadoop, check this ourselves > return !conf.getTrimmed(DFSConfigKeys.DFS_ENCRYPTION_KEY_PROVIDER_URI, > "").isEmpty(); > } > return client.isHDFSEncryptionEnabled(); > } > {code}
[jira] [Updated] (HDFS-11689) New exception thrown by (private) DFSClient API isHDFSEncryptionEnabled broke hacky hive code
[ https://issues.apache.org/jira/browse/HDFS-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-11689: - Attachment: HDFS-11689.001.patch > New exception thrown by (private) DFSClient API isHDFSEncryptionEnabled broke > hacky hive code > -- > > Key: HDFS-11689 > URL: https://issues.apache.org/jira/browse/HDFS-11689 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0-alpha3, 2.8.1 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HADOOP-14333.001.patch, HADOOP-14333.002.patch, > HADOOP-14333.003.patch, HDFS-11689.001.patch > > > Though Hive should be fixed not to access DFSClient which is private to > HADOOP, removing the throws added by HADOOP-14104 is a quicker solution to > unblock hive. > Hive code > {code} > private boolean isEncryptionEnabled(DFSClient client, Configuration conf) { > try { > DFSClient.class.getMethod("isHDFSEncryptionEnabled"); > } catch (NoSuchMethodException e) { > // the method is available since Hadoop-2.7.1 > // if we run with an older Hadoop, check this ourselves > return !conf.getTrimmed(DFSConfigKeys.DFS_ENCRYPTION_KEY_PROVIDER_URI, > "").isEmpty(); > } > return client.isHDFSEncryptionEnabled(); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11538) Move ClientProtocol HA proxies into hadoop-hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-11538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979115#comment-15979115 ] Junping Du commented on HDFS-11538: --- Cool. Thanks Andrew! > Move ClientProtocol HA proxies into hadoop-hdfs-client > -- > > Key: HDFS-11538 > URL: https://issues.apache.org/jira/browse/HDFS-11538 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: Huafeng Wang >Priority: Blocker > Fix For: 3.0.0-alpha3, 2.8.1 > > Attachments: HDFS-11538.001.patch, HDFS-11538.002.patch, > HDFS-11538.003.patch, HDFS-11538-branch-2.001.patch > > > Follow-up for HDFS-11431. We should move this missing class over rather than > pulling in the whole hadoop-hdfs dependency. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11493) Ozone: SCM: Add the ability to handle container reports
[ https://issues.apache.org/jira/browse/HDFS-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979105#comment-15979105 ] Xiaoyu Yao commented on HDFS-11493: --- Thanks [~anu] for working on this. The patch looks good to me. Below are some initial comments on the production code. I will post my review of the unit tests later.

OzoneConfigKeys.java
NIT: Line 91: can we rename "ozone.scm.container.report.processing.lag.seconds" to "ozone.scm.container.report.processing.interval.seconds"?
NIT: Line 99: can we rename "ozone.scm.max.wait.for.container.reports.in.seconds" to "ozone.scm.container.reports.wait.timeout.seconds"?

CommandQueue.java
NIT: Line 135: "command" -> Commands

NodePoolManager.java
NIT: unnecessary empty-line change

PeriodicPool.java
Line 26: should we use AtomicLong for totalProcessedCount?
Line 92: missing javadoc for getLastProcessTime

ContainerReplicationManager.java
Line 87: NIT: move before line 79?
Line 119: we should ensure the executorService is shut down properly.
Line 121: extra leading space in the name string " Container Reports..."
Line 135: NIT: add an empty line
Line 145: can you elaborate on the two cases?
Line 190: computerPoolDifference assumes the method makes a copy of newPool and the caller makes a copy of oldPool. This can be optimized with Java 8 or Guava's Sets.difference.
Line 208: PeriodicPool#lastProcessTime is never changed with this patch. Can you add a TODO if this will be changed in later patches?
Line 243: we restart the poolProcessThread if an UncaughtException is hit. Could this cause an infinite loop in certain cases?

ReplicationDatanodeStateManager.java
Line 53: missing javadoc for some parameters
Line 66: can we define some SCM-specific exceptions here?
Line 70: Random r can be moved to a member field to avoid a new Random() on each getContainerReport call.
> Ozone: SCM: Add the ability to handle container reports > - > > Key: HDFS-11493 > URL: https://issues.apache.org/jira/browse/HDFS-11493 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Anu Engineer >Assignee: Anu Engineer > Attachments: container-replication-storage.pdf, > HDFS-11493-HDFS-7240.001.patch > > > Once a datanode sends the container report it is SCM's responsibility to > determine if the replication levels are acceptable. If it is not, SCM should > initiate a replication request to another datanode. This JIRA tracks how SCM > handles a container report.
[jira] [Commented] (HDFS-11689) New exception thrown by (private) DFSClient API isHDFSEncryptionEnabled broke hacky hive code
[ https://issues.apache.org/jira/browse/HDFS-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979094#comment-15979094 ] Yongjun Zhang commented on HDFS-11689: -- Thanks [~andrew.wang]. I wish the private annotation {code} @InterfaceAudience.Private public class DFSClient implements java.io.Closeable, RemotePeerFactory, DataEncryptionKeyFactory { {code} could trigger a warning in the Hive build, but it seems it does not. I will add @Deprecated as a workaround for the lack of a build warning on the private annotation. > New exception thrown by (private) DFSClient API isHDFSEncryptionEnabled broke > hacky hive code > -- > > Key: HDFS-11689 > URL: https://issues.apache.org/jira/browse/HDFS-11689 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0-alpha3, 2.8.1 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HADOOP-14333.001.patch, HADOOP-14333.002.patch, > HADOOP-14333.003.patch > > > Though Hive should be fixed not to access DFSClient which is private to > HADOOP, removing the throws added by HADOOP-14104 is a quicker solution to > unblock hive. > Hive code > {code} > private boolean isEncryptionEnabled(DFSClient client, Configuration conf) { > try { > DFSClient.class.getMethod("isHDFSEncryptionEnabled"); > } catch (NoSuchMethodException e) { > // the method is available since Hadoop-2.7.1 > // if we run with an older Hadoop, check this ourselves > return !conf.getTrimmed(DFSConfigKeys.DFS_ENCRYPTION_KEY_PROVIDER_URI, > "").isEmpty(); > } > return client.isHDFSEncryptionEnabled(); > } > {code}
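The reflection-based capability probe Hive uses can be reduced to a small self-contained sketch. Client here is a hypothetical stand-in for DFSClient, not the real class; marking the stand-in method @Deprecated mirrors the workaround discussed in this thread, so that any direct (non-reflective) caller gets a deprecation warning at compile time.

```java
import java.lang.reflect.Method;

public class FeatureProbe {
    // Hypothetical stand-in for DFSClient, for illustration only.
    public static class Client {
        @Deprecated
        public boolean isEncryptionEnabled() { return true; }
    }

    // Returns true if cls declares a public no-arg method with this name,
    // the same existence check Hive's isEncryptionEnabled helper performs.
    static boolean hasMethod(Class<?> cls, String name) {
        try {
            Method m = cls.getMethod(name);
            return m != null;
        } catch (NoSuchMethodException e) {
            return false; // older API: caller falls back to a config check
        }
    }

    public static void main(String[] args) {
        System.out.println(hasMethod(Client.class, "isEncryptionEnabled")); // true
        System.out.println(hasMethod(Client.class, "missingMethod"));       // false
    }
}
```

Note that the reflective lookup itself never triggers a deprecation warning, which is why the annotation only signals misuse to code that calls the method directly.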
[jira] [Commented] (HDFS-11689) New exception thrown by (private) DFSClient API isHDFSEncryptionEnabled broke hacky hive code
[ https://issues.apache.org/jira/browse/HDFS-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979089#comment-15979089 ] Andrew Wang commented on HDFS-11689: Adding @Deprecated will make Hive's build print a deprecation warning, so it's an additional signal that they're doing something incorrect, and also a reminder for us to change this later. An explanatory comment would be good.
[jira] [Commented] (HDFS-11689) New exception thrown by (private) DFSClient API isHDFSEncryptionEnabled broke hacky hive code
[ https://issues.apache.org/jira/browse/HDFS-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979082#comment-15979082 ] Yongjun Zhang commented on HDFS-11689: -- Thanks [~steve_l] and [~daryn]. I moved the jira to the HDFS project. About "adding @Deprecated to the method": my understanding is that this method is not deprecated. It's Hive's mistake to access DFSClient, which is private to Hadoop. Do we want to deprecate the method?
[jira] [Moved] (HDFS-11689) New exception thrown by (private) DFSClient API isHDFSEncryptionEnabled broke hacky hive code
[ https://issues.apache.org/jira/browse/HDFS-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang moved HADOOP-14333 to HDFS-11689: --- Affects Version/s: (was: 3.0.0-alpha3) (was: 2.8.1) 2.8.1 3.0.0-alpha3 Key: HDFS-11689 (was: HADOOP-14333) Project: Hadoop HDFS (was: Hadoop Common)
[jira] [Commented] (HDFS-11190) [READ] Namenode support for data stored in external stores.
[ https://issues.apache.org/jira/browse/HDFS-11190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979050#comment-15979050 ] Chris Douglas commented on HDFS-11190: -- +1 Let's commit this as a base to the HDFS-9806 branch so we can fix some of the bugs/style issues there. Other than a couple long lines from checkstyle, the rest look benign. The unit test timeouts are familiar false positives. > [READ] Namenode support for data stored in external stores. > --- > > Key: HDFS-11190 > URL: https://issues.apache.org/jira/browse/HDFS-11190 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11190-HDFS-9806.001.patch, > HDFS-11190-HDFS-9806.002.patch, HDFS-11190-HDFS-9806.003.patch, > HDFS-11190-HDFS-9806.004.patch > > > The goal of this JIRA is to enable the Namenode to know about blocks that are > in {{PROVIDED}} stores and are not necessarily stored on any Datanodes. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-9922) Upgrade Domain placement policy status marks a good block in violation when there are decommissioned nodes
[ https://issues.apache.org/jira/browse/HDFS-9922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15978923#comment-15978923 ] Kihwal Lee edited comment on HDFS-9922 at 4/21/17 3:58 PM: --- Don't we need this in 2.8 along with other upgrade domain changes? was (Author: kihwal): Don't we need this in 2.8? > Upgrade Domain placement policy status marks a good block in violation when > there are decommissioned nodes > -- > > Key: HDFS-9922 > URL: https://issues.apache.org/jira/browse/HDFS-9922 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chris Trezzo >Assignee: Chris Trezzo >Priority: Minor > Fix For: 2.9.0, 3.0.0-alpha1 > > Attachments: HDFS-9922-trunk-v1.patch, HDFS-9922-trunk-v2.patch, > HDFS-9922-trunk-v3.patch, HDFS-9922-trunk-v4.patch > > > When there are replicas of a block on a decommissioned node, > BlockPlacementStatusWithUpgradeDomain#isUpgradeDomainPolicySatisfied returns > false when it should return true. This is because numberOfReplicas is the > number of in-service replicas for the block and upgradeDomains.size() is the > number of upgrade domains across all replicas of the block. Specifically, we > hit this scenario when numberOfReplicas is equal to upgradeDomainFactor and > upgradeDomains.size() is greater than numberOfReplicas. > {code} > private boolean isUpgradeDomainPolicySatisfied() { > if (numberOfReplicas <= upgradeDomainFactor) { > return (numberOfReplicas == upgradeDomains.size()); > } else { > return upgradeDomains.size() >= upgradeDomainFactor; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9922) Upgrade Domain placement policy status marks a good block in violation when there are decommissioned nodes
[ https://issues.apache.org/jira/browse/HDFS-9922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15978923#comment-15978923 ] Kihwal Lee commented on HDFS-9922: -- Don't we need this in 2.8?
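The failure mode described in the issue can be reproduced with a standalone version of the quoted predicate. The isSatisfied signature below is a hypothetical rearrangement that takes the three values as parameters instead of reading instance fields; the scenario is the one from the description: numberOfReplicas equals upgradeDomainFactor, but a decommissioned replica contributes an extra upgrade domain.

```java
import java.util.HashSet;
import java.util.Set;

public class UpgradeDomainCheck {
    // Mirrors BlockPlacementStatusWithUpgradeDomain#isUpgradeDomainPolicySatisfied,
    // rewritten as a static method over explicit inputs for illustration.
    static boolean isSatisfied(int numberOfReplicas, int upgradeDomainFactor,
                               Set<String> upgradeDomains) {
        if (numberOfReplicas <= upgradeDomainFactor) {
            return numberOfReplicas == upgradeDomains.size();
        } else {
            return upgradeDomains.size() >= upgradeDomainFactor;
        }
    }

    public static void main(String[] args) {
        // 3 in-service replicas and factor 3, but a replica on a decommissioned
        // node adds a 4th upgrade domain across *all* replicas of the block.
        Set<String> domains = new HashSet<>();
        domains.add("ud1");
        domains.add("ud2");
        domains.add("ud3");
        domains.add("ud4");
        // Placement is actually fine, yet the check reports a violation,
        // because 3 <= 3 takes the equality branch and 3 != 4.
        System.out.println(isSatisfied(3, 3, domains)); // false
    }
}
```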
[jira] [Commented] (HDFS-5567) CacheAdmin operations not supported with viewfs
[ https://issues.apache.org/jira/browse/HDFS-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15978816#comment-15978816 ] Brahma Reddy Battula commented on HDFS-5567: FYI: HDFS-11226 added {{-fs}} support. :) > CacheAdmin operations not supported with viewfs > --- > > Key: HDFS-5567 > URL: https://issues.apache.org/jira/browse/HDFS-5567 > Project: Hadoop HDFS > Issue Type: Bug > Components: caching >Affects Versions: 3.0.0-alpha1 >Reporter: Stephen Chu >Assignee: Andras Bokor > > On a federated cluster with viewfs configured, we'll run into the following > error when using CacheAdmin commands: > {code} > bash-4.1$ hdfs cacheadmin -listPools > Exception in thread "main" java.lang.IllegalArgumentException: FileSystem > viewfs://cluster3/ is not an HDFS file system > at org.apache.hadoop.hdfs.tools.CacheAdmin.getDFS(CacheAdmin.java:96) > at > org.apache.hadoop.hdfs.tools.CacheAdmin.access$100(CacheAdmin.java:50) > at > org.apache.hadoop.hdfs.tools.CacheAdmin$ListCachePoolsCommand.run(CacheAdmin.java:748) > at org.apache.hadoop.hdfs.tools.CacheAdmin.run(CacheAdmin.java:84) > at org.apache.hadoop.hdfs.tools.CacheAdmin.main(CacheAdmin.java:89) > bash-4.1$ > {code}
[jira] [Resolved] (HDFS-5567) CacheAdmin operations not supported with viewfs
[ https://issues.apache.org/jira/browse/HDFS-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andras Bokor resolved HDFS-5567. Resolution: Not A Bug Target Version/s: (was: )
[jira] [Resolved] (HDFS-8802) dfs.checksum.type is not described in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andras Bokor resolved HDFS-8802. Resolution: Duplicate > dfs.checksum.type is not described in hdfs-default.xml > -- > > Key: HDFS-8802 > URL: https://issues.apache.org/jira/browse/HDFS-8802 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.1 >Reporter: Tsuyoshi Ozawa >Assignee: Andras Bokor > Attachments: HDFS-8802_01.patch, HDFS-8802_02.patch, HDFS-8802.patch > > > It's a good timing to check other configurations about hdfs-default.xml here. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-8802) dfs.checksum.type is not described in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905274#comment-15905274 ] Andras Bokor edited comment on HDFS-8802 at 4/21/17 1:07 PM: - Fixed by HDFS-8356. was (Author: boky01): Fixed by HDFS-8356. I am not sure about the resolution. Duplicate, maybe?