[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode
[ https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519841#comment-16519841 ]

Chris Douglas commented on HDFS-10285:
--------------------------------------

+1 on [~umamaheswararao]'s proposal. We can refine the code in trunk.

> Storage Policy Satisfier in Namenode
> ------------------------------------
>
>                 Key: HDFS-10285
>                 URL: https://issues.apache.org/jira/browse/HDFS-10285
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>    Affects Versions: HDFS-10285
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>            Priority: Major
>         Attachments: HDFS-10285-consolidated-merge-patch-00.patch, HDFS-10285-consolidated-merge-patch-01.patch, HDFS-10285-consolidated-merge-patch-02.patch, HDFS-10285-consolidated-merge-patch-03.patch, HDFS-10285-consolidated-merge-patch-04.patch, HDFS-10285-consolidated-merge-patch-05.patch, HDFS-SPS-TestReport-20170708.pdf, SPS Modularization.pdf, Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, Storage-Policy-Satisfier-in-HDFS-May10.pdf, Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. A policy can be set on a directory or file to express the user's preference for where the physical blocks should be stored. When the user sets the storage policy before writing data, the blocks can take advantage of the policy preferences and are stored accordingly.
> If the user sets the storage policy after the file has been written and completed, the blocks will already have been written under the default storage policy (DISK). The user then has to run the 'Mover' tool explicitly, supplying all such file names as a list. In some distributed scenarios (e.g., HBase) it is difficult to collect all the affected files and run the tool, since different nodes write files independently and the files can have different paths.
> Another scenario: when the user renames a file from a directory with one effective storage policy (inherited from the parent directory) into a directory with a different effective storage policy, the inherited policy is not copied from the source; the file's effective policy becomes that of the destination parent. The rename is just a metadata change in the Namenode, so the physical blocks still sit where the source policy placed them.
> Tracking all such file names across distributed nodes (e.g., region servers) and running the Mover tool is difficult for admins.
> The proposal here is to provide an API in the Namenode itself to trigger storage policy satisfaction. A daemon thread inside the Namenode would track such calls and dispatch them to the DNs as block-movement commands.
> Will post the detailed design thoughts document soon.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
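The flow proposed above (a Namenode-side API that queues satisfaction requests, plus a daemon thread that turns them into Datanode movement commands) can be sketched with a toy in-memory model. All class and method names here are illustrative assumptions, not the HDFS-10285 implementation:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy model of the proposed flow: clients ask the Namenode to "satisfy" a
// path's storage policy; a daemon thread drains those requests and turns
// them into block-movement commands for the Datanodes.
public class StoragePolicySatisfierSketch {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final List<String> issuedCommands =
            Collections.synchronizedList(new ArrayList<>());

    // The API the Namenode would expose: record the path and return at once.
    public void satisfyStoragePolicy(String path) {
        pending.add(path);
    }

    // One pass of the daemon thread: turn queued paths into DN commands.
    public void drainOnce() {
        String path;
        while ((path = pending.poll()) != null) {
            issuedCommands.add("MOVE_BLOCKS " + path);
        }
    }

    public List<String> issuedCommands() {
        return issuedCommands;
    }
}
```

The point of the queue is that the client call is cheap and asynchronous; the expensive block movement happens later, on the daemon's schedule, which matches the HBase-style use case where many writers touch many paths independently.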
[jira] [Updated] (HDFS-13186) [PROVIDED Phase 2] Multipart Uploader API
[ https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HDFS-13186:
---------------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 3.2.0
           Status: Resolved  (was: Patch Available)

I committed this. Thanks, [~ehiggs]

> [PROVIDED Phase 2] Multipart Uploader API
> -----------------------------------------
>
>                 Key: HDFS-13186
>                 URL: https://issues.apache.org/jira/browse/HDFS-13186
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Ewan Higgs
>            Assignee: Ewan Higgs
>            Priority: Major
>             Fix For: 3.2.0
>
>         Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, HDFS-13186.003.patch, HDFS-13186.004.patch, HDFS-13186.005.patch, HDFS-13186.006.patch, HDFS-13186.007.patch, HDFS-13186.008.patch, HDFS-13186.009.patch, HDFS-13186.010.patch
>
> To write files in parallel to an external storage system as in HDFS-12090, there are two approaches:
> # Naive approach: use a single datanode per file that copies blocks locally as it streams data to the external service. This requires a copy for each block inside the HDFS system and then a copy for the block to be sent to the external system.
> # Better approach: a single point (e.g., the Namenode or an SPS-style external client) and the Datanodes coordinate in a multipart, multinode upload.
> This system needs to work with multiple back ends and needs to coordinate across the network. So we propose an API that resembles the following:
> {code:java}
> public UploadHandle multipartInit(Path filePath) throws IOException;
> public PartHandle multipartPutPart(InputStream inputStream,
>     int partNumber, UploadHandle uploadId) throws IOException;
> public void multipartComplete(Path filePath,
>     List<Pair<Integer, PartHandle>> handles,
>     UploadHandle multipartUploadId) throws IOException;{code}
> Here, UploadHandle and PartHandle are opaque handles in the vein of PathHandle, so they can be serialized and deserialized in the hadoop-hdfs project without knowledge of how to deserialize, e.g., S3A's version of an UploadHandle and PartHandle.
> In an object store such as S3A, the implementation is straightforward. In the case of writing multipart/multinode to HDFS, we can write each block as a file part. The complete call will perform a concat on the blocks.
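The three-call shape of the API described above can be illustrated with a small in-memory model, where complete stitches the parts together in part-number order (the analogue of the concat mentioned above is plain byte concatenation). This is a hedged sketch, not the HDFS-13186 code; the String handles stand in for the opaque UploadHandle/PartHandle types:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// In-memory toy of the proposed API shape: init returns an upload handle,
// each part upload returns a part handle, and complete joins the parts in
// part-number order regardless of upload order.
public class ToyMultipartUploader {
    // uploadId -> (partNumber -> bytes); TreeMap keeps parts sorted by number.
    private final Map<String, TreeMap<Integer, byte[]>> uploads = new HashMap<>();
    private int nextId = 0;

    public String multipartInit(String filePath) {
        String uploadId = filePath + "#" + (nextId++);
        uploads.put(uploadId, new TreeMap<>());
        return uploadId;
    }

    // Returns a part handle; here just a descriptive string.
    public String multipartPutPart(InputStream in, int partNumber, String uploadId)
            throws IOException {
        uploads.get(uploadId).put(partNumber, in.readAllBytes());
        return uploadId + "/part-" + partNumber;
    }

    // Stitch the parts together in part-number order (the "concat" step).
    public byte[] multipartComplete(String uploadId) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] part : uploads.remove(uploadId).values()) {
            out.writeBytes(part);
        }
        return out.toByteArray();
    }
}
```

Because parts are keyed by part number, uploaders on different nodes can push parts concurrently and out of order, which is what makes the multinode variant viable.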
[jira] [Commented] (HDFS-13186) [PROVIDED Phase 2] Multipart Uploader API
[ https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514298#comment-16514298 ]

Chris Douglas commented on HDFS-13186:
--------------------------------------

v010 just fixes javadoc errors and adds more param javadoc for {{MultipartUploader}}. I'm +1 on this and will commit if Jenkins comes back clean.
[jira] [Updated] (HDFS-13186) [PROVIDED Phase 2] Multipart Uploader API
[ https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HDFS-13186:
---------------------------------
    Attachment: HDFS-13186.010.patch
[jira] [Updated] (HDFS-13186) [PROVIDED Phase 2] Multipart Uploader API
[ https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HDFS-13186:
---------------------------------
    Summary: [PROVIDED Phase 2] Multipart Uploader API  (was: [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations)
[jira] [Updated] (HDFS-13186) [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations
[ https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HDFS-13186:
---------------------------------
    Status: Patch Available  (was: Open)
[jira] [Commented] (HDFS-13186) [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations
[ https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508610#comment-16508610 ]

Chris Douglas commented on HDFS-13186:
--------------------------------------

bq. Does the contract test for the local FS pass, after adding support for PathHandle?
Sorry, I meant the {{AbstractContractPathHandleTest}}, which verifies that a FS implements {{PathHandle}} correctly.

v008 moves the {{createPathHandle}} code from {{LocalFileSystem}} to {{RawLocalFileSystem}}, since the contract test relies on {{append}} support, and adds the contract test. Unfortunately, the impl (at least without native support) only reports modification times with second precision, so the contract test grew a {{sleep(1000)}}.

v008 fixes the javac/checkstyle complaints on v007, but it likely adds others. It also slightly refactors the unit test so we can run it against both HDFS and LocalFS. If this looks OK to you (and Jenkins), I think it's good to commit.
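The {{sleep(1000)}} mentioned above exists because a filesystem that reports modification times only at second granularity makes two closely spaced writes indistinguishable. A self-contained illustration (the class name and timestamps are made up for the example):

```java
// Why second-granularity modification times force a full-second wait:
// two writes less than a second apart collapse to the same reported mtime.
public class SecondPrecisionDemo {
    // Truncate a millisecond timestamp to whole seconds, as a filesystem
    // without sub-second mtime support would report it.
    public static long toSecondPrecision(long millis) {
        return (millis / 1000) * 1000;
    }
}
```

A test that writes, modifies, and then compares mtimes can therefore only observe a change after sleeping at least 1000 ms, which is exactly what the contract test does.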
[jira] [Updated] (HDFS-13186) [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations
[ https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HDFS-13186:
---------------------------------
    Attachment: HDFS-13186.008.patch
[jira] [Commented] (HDFS-13265) MiniDFSCluster should set reasonable defaults to reduce resource consumption
[ https://issues.apache.org/jira/browse/HDFS-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508406#comment-16508406 ]

Chris Douglas commented on HDFS-13265:
--------------------------------------

bq. src/test/resources should only be for unit tests right?
Yes; sorry, I missed test vs. main.

> MiniDFSCluster should set reasonable defaults to reduce resource consumption
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-13265
>                 URL: https://issues.apache.org/jira/browse/HDFS-13265
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, namenode, test
>            Reporter: Erik Krogen
>            Assignee: Erik Krogen
>            Priority: Major
>         Attachments: HDFS-13265-branch-2.000.patch, HDFS-13265-branch-2.000.patch, HDFS-13265.000.patch, HDFS-13265.001.patch, TestMiniDFSClusterThreads.java
>
> MiniDFSCluster takes its defaults from {{DFSConfigKeys}} defaults, but many of these are not suitable for a unit test environment. For example, the default handler thread count of 10 is definitely more than necessary for (almost?) any unit test. We should set reasonable, lower defaults unless a test specifically requires more.
[jira] [Commented] (HDFS-13265) MiniDFSCluster should set reasonable defaults to reduce resource consumption
[ https://issues.apache.org/jira/browse/HDFS-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506759#comment-16506759 ]

Chris Douglas commented on HDFS-13265:
--------------------------------------

bq. I'm thinking that we put the MiniDFSCluster into a minimal resource configuration by default, and then go through the broken tests and fix them
+1 Sounds good to me.

bq. Would it be possible to add those in hadoop-hdfs-project/hadoop-hdfs/src/test/resources/hdfs-default.xml?
They're a little aggressive, aren't they? It might also be disruptive (and difficult to debug) for sites using the current defaults.
[jira] [Commented] (HDFS-13265) MiniDFSCluster should set reasonable defaults to reduce resource consumption
[ https://issues.apache.org/jira/browse/HDFS-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16502361#comment-16502361 ]

Chris Douglas commented on HDFS-13265:
--------------------------------------

Sorry for the intermittent attention to the prerequisites; I'm not sure I've paged in all the context. With HDFS-13493 and HDFS-13272 committed, is the remaining work in this JIRA only to use the config knobs for {{MiniDFSCluster}} in branch-2?
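For reference, the kind of lowered test-scope defaults under discussion might look like the following hdfs-site.xml fragment. The property keys are real HDFS configuration keys, but these particular values are illustrative assumptions, not the ones ultimately committed:

```xml
<!-- Illustrative low-resource overrides for a MiniDFSCluster test environment. -->
<configuration>
  <property>
    <!-- Production default is 10; a single-client unit test needs far fewer. -->
    <name>dfs.namenode.handler.count</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.datanode.handler.count</name>
    <value>2</value>
  </property>
  <property>
    <!-- Far fewer transfer threads than the production default. -->
    <name>dfs.datanode.max.transfer.threads</name>
    <value>256</value>
  </property>
</configuration>
```

Values this aggressive are only appropriate under src/test/resources, per the comment above; shipping them in the main hdfs-default.xml would change behavior for existing deployments.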
[jira] [Commented] (HDFS-13186) [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations
[ https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16502249#comment-16502249 ]

Chris Douglas commented on HDFS-13186:
--------------------------------------

Only a few minor points:
* {{LocalFileSystemPathHandle}} should be in the {{org.apache.hadoop.fs}} package, rather than {{org.apache.hadoop}}
* The changes to {{FileSystem}} no longer appear necessary
* The filesystem field in {{FileSystemMultipartUploader}} and {{S3AMultipartUploader}} can be final
* In {{FileSystemMultipartUploader}}, is it an error to have entries with the same key in the list?
* Does the contract test for the local FS pass, after adding support for {{PathHandle}}?
* {{MultipartUploader}} could use some more javadoc to guide implementers

Otherwise this lgtm.
[jira] [Commented] (HDFS-13469) RBF: Support InodeID in the Router
[ https://issues.apache.org/jira/browse/HDFS-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501090#comment-16501090 ] Chris Douglas commented on HDFS-13469: -- bq. If it's only for debugging purposes, I wonder if it's ok we just document this isn't supported and one has to do it directly to the NameNode when debugging. It's not a debugging feature. The path-oriented APIs cannot enforce consistent references, so recursive traversal, retrying most operations, and most concurrent modifications are prohibitively difficult to implement correctly. By exposing a reference API, clients can refer to the same object after it's moved/renamed, avoid opening a file renamed on top of the intended referent, etc. The current implementation implements references by exposing inode IDs via a magic path. This has the advantage of working with all operations to {{FileSystem}}, because they all operate on paths. On the other hand, it is brittle and bespoke to HDFS. For HDFS utilities, that's not really a problem. For applications, it's a semi-public specialization for HDFS. For RBF, mapping an inode ID to one of several namespaces- when the referent may move across namenodes- may not be strictly compatible with every case the NN supports today. References may become spuriously invalid (by rebalancing) or the range of inode IDs restricted by partitioning. An alternative approach adds a separate API for references. These will not work automatically with all {{FileSystem}} operations; each operation will need to be added explicitly, as in HDFS-7878. Its implementation currently uses the same metadata as the magic path, but because it is an opaque layer of indirection, we could supplement the reference with other metadata. bq. The only compatible approach is conceptually partitioning the inode id ranges to disambiguate the location of the inode id. 
Simplest implementation is namesystems use distinct non-overlapping ranges but that only works for new namespaces. It's literally partitioning the inode ID range, which has the same tradeoffs as HDFS-10867. As you pointed out there, adding or removing NNs from an RBF cluster will consume part of the range. bq. Please elaborate on how this might work? Ie. What is and how does the opaque thing map to the underlying namesystem? How will a compatible and simple traversal work? OK, start with semantics equivalent to inode ID partitioning. Instead of mapping inode IDs to a partition of the range, store the target namesystem in a separate field on the {{PathHandle}} reference. RBF can rewrite the reference with its own payload. Instead of a magic path, clients will make similar RPC calls that include the reference. The NN will extract and use the inode ID; in an RBF deployment, the router can use the metadata it added to route the request. Again, partitioning the inode ID space has a much lower implementation cost, but there are tradeoffs. If we're accumulating use cases for references, then adding a separate API to implement them would (a) work for {{FileSystem}} implementations other than HDFS and (b) offer explicit support, rather than an undocumented feature. To be clear, I'm not advocating for explicit references over magic paths, but if RBF isn't remapping magic paths then it should fail instead of blindly forwarding them. > RBF: Support InodeID in the Router > -- > > Key: HDFS-13469 > URL: https://issues.apache.org/jira/browse/HDFS-13469 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Priority: Major > > The Namenode supports identifying files through inode identifiers. > Currently the Router does not handle this properly, we need to add this > functionality. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
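A minimal sketch of the reference layout described above, assuming hypothetical field names (this is not the actual {{PathHandle}} wire format):

```java
// Illustrative sketch of the alternative discussed above: instead of
// partitioning the inode ID range, the reference carries the inode ID
// in one field and the target namesystem in a separate, router-owned
// field. RBF can rewrite its own payload without touching the
// NN-meaningful ID. Names here are stand-ins, not Hadoop classes.
public class RoutedHandle {
  final long inodeId;      // meaningful to the target NameNode
  final String namespace;  // routing payload, owned by the router

  public RoutedHandle(long inodeId, String namespace) {
    this.inodeId = inodeId;
    this.namespace = namespace;
  }

  // The router rewrites the reference with its own payload; the NN on
  // the other side still extracts the same inode ID.
  public RoutedHandle rewrite(String routerNamespace) {
    return new RoutedHandle(inodeId, routerNamespace);
  }
}
```

Because the handle is opaque to clients, the router is free to change what the namespace field contains (or add other metadata) without breaking callers.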
[jira] [Commented] (HDFS-13587) TestQuorumJournalManager fails on Windows
[ https://issues.apache.org/jira/browse/HDFS-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487801#comment-16487801 ] Chris Douglas commented on HDFS-13587: -- +1 lgtm > TestQuorumJournalManager fails on Windows > - > > Key: HDFS-13587 > URL: https://issues.apache.org/jira/browse/HDFS-13587 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Anbang Hu >Assignee: Anbang Hu >Priority: Major > Labels: Windows > Attachments: HDFS-13587.000.patch, HDFS-13587.001.patch, > HDFS-13587.002.patch > > > There are 12 test failures in TestQuorumJournalManager on Windows. Local run > shows: > {color:#d04437}[INFO] Running > org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager > [ERROR] Tests run: 21, Failures: 0, Errors: 12, Skipped: 0, Time elapsed: > 106.81 s <<< FAILURE! - in > org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager > [ERROR] > testCrashBetweenSyncLogAndPersistPaxosData(org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager) > Time elapsed: 1.93 s <<< ERROR! > org.apache.hadoop.hdfs.qjournal.client.QuorumException: > Could not format one or more JournalNodes. 2 successful responses: > 127.0.0.1:27044: null [success] > 127.0.0.1:27064: null [success] > 1 exceptions thrown: > 127.0.0.1:27054: Directory > E:\OSS\hadoop-branch-2\hadoop-hdfs-project\hadoop-hdfs\target\test\data\dfs\journalnode-1\test-journal > is in an inconsistent state: Can't format the storage directory because the > current directory is not empty. 
> at > org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.checkEmptyCurrent(Storage.java:498) > at > org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:574) > at > org.apache.hadoop.hdfs.qjournal.server.JNStorage.format(JNStorage.java:185) > at > org.apache.hadoop.hdfs.qjournal.server.Journal.format(Journal.java:221) > at > org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.format(JournalNodeRpcServer.java:157) > at > org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.format(QJournalProtocolServerSideTranslatorPB.java:145) > at > org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25419) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:871) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:817) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1889) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2606) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:286) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.format(QuorumJournalManager.java:212) > at > org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager.setup(TestQuorumJournalManager.java:109) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at 
java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at
[jira] [Commented] (HDFS-13587) TestQuorumJournalManager fails on Windows
[ https://issues.apache.org/jira/browse/HDFS-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16482852#comment-16482852 ] Chris Douglas commented on HDFS-13587: -- bq. TestQuorumJournalManager originally uses default getBaseDirectory from MiniDFSCluster, which sets the variable to true in MiniDFSCluster Specifically, it causes {{MiniDFSCluster}} to be loaded, which triggers the [static block|https://github.com/apache/hadoop/blob/132a547/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java#L161] that sets this mode. Since the design relies on static initialization dependencies, we're unlikely to find a clean solution. Adding a similar, static block to {{MiniJournalCluster}} looks correct, and it would avoid fixing the same failure in each QJM test. Do you see a problem with this approach? > TestQuorumJournalManager fails on Windows > - > > Key: HDFS-13587 > URL: https://issues.apache.org/jira/browse/HDFS-13587 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Anbang Hu >Assignee: Anbang Hu >Priority: Major > Labels: Windows > Attachments: HDFS-13587.000.patch, HDFS-13587.001.patch > > > There are 12 test failures in TestQuorumJournalManager on Windows. Local run > shows: > {color:#d04437}[INFO] Running > org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager > [ERROR] Tests run: 21, Failures: 0, Errors: 12, Skipped: 0, Time elapsed: > 106.81 s <<< FAILURE! - in > org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager > [ERROR] > testCrashBetweenSyncLogAndPersistPaxosData(org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager) > Time elapsed: 1.93 s <<< ERROR! > org.apache.hadoop.hdfs.qjournal.client.QuorumException: > Could not format one or more JournalNodes. 
2 successful responses: > 127.0.0.1:27044: null [success] > 127.0.0.1:27064: null [success] > 1 exceptions thrown: > 127.0.0.1:27054: Directory > E:\OSS\hadoop-branch-2\hadoop-hdfs-project\hadoop-hdfs\target\test\data\dfs\journalnode-1\test-journal > is in an inconsistent state: Can't format the storage directory because the > current directory is not empty. > at > org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.checkEmptyCurrent(Storage.java:498) > at > org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:574) > at > org.apache.hadoop.hdfs.qjournal.server.JNStorage.format(JNStorage.java:185) > at > org.apache.hadoop.hdfs.qjournal.server.Journal.format(Journal.java:221) > at > org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.format(JournalNodeRpcServer.java:157) > at > org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.format(QJournalProtocolServerSideTranslatorPB.java:145) > at > org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25419) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:871) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:817) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1889) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2606) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:286) > at > org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.format(QuorumJournalManager.java:212) > at > 
org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager.setup(TestQuorumJournalManager.java:109) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) >
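The static-initialization behavior discussed in the comment above can be illustrated with a self-contained stand-in (the classes below are hypothetical, not the Hadoop test classes):

```java
// Self-contained illustration of the class-loading behavior described
// above: a static initializer runs the first time its class is used,
// so any test that touches MiniDFSCluster (or, with the proposed fix,
// MiniJournalCluster) picks up whatever mode that block sets.
public class StaticInitDemo {
  static String baseDirMode = "default";

  static class MiniClusterStandIn {  // stand-in for MiniDFSCluster
    static {
      // analogous to the static block that configures the test base dir
      baseDirMode = "windows-safe";
    }
  }

  public static void main(String[] args) {
    System.out.println(baseDirMode);   // "default": stand-in not loaded yet
    new MiniClusterStandIn();          // first use triggers the static block
    System.out.println(baseDirMode);   // "windows-safe"
  }
}
```

This is why QJM tests that never load {{MiniDFSCluster}} miss the setting, and why mirroring the block in {{MiniJournalCluster}} fixes all of them at once.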
[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2
[ https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478432#comment-16478432 ] Chris Douglas commented on HDFS-12615: -- bq. Any Jira which has not been released, we should certainly revert. A blanket veto over a clerical error is not on the table. A set of unrelated features and fixes were unhelpfully linked from a single JIRA. Let's clean it up and move significant features- like security- to design doc + branch, leaving bug fixes and the like to normal dev on trunk, as we would with any other module. Particularly for code that only affects RBF, isolating it on a branch doesn't do anything for the stability of trunk. bq. I don't quite agree with the patch count logic, for I have seen patches with are more than 200 KB at times. Let us just say "we know a big feature when we see it" I was referring to the 5/9 patches implementing these features. Legalese defining when a branch is required is not a good use of anyone's time. Let's not belabor it. However, you requested _all_ development move to a branch, based on an impression that you explicitly have not verified. Without seeking objective criteria that apply to all situations, you need to research this particular patchset to ground your claims about _it_. Let's be concrete. You have the impression that (a) changes committed to trunk affect non-RBF deployments and (b) RBF features miss critical cases. Those are demonstrable. bq. I *feel* that a JIRA that has these following features [...] *Sounds like* a major undertaking in my mind and *feels like* these do need a branch and these are significant features. RBF has a significant _roadmap_. A flat list of tasks- with mixed granularity- is a poor way to organize it. bq. [...] 
since they are not very keen on communicating details, I am proposing that you move all this work to a branch and bring it back when the whole idea is baked. Not everyone participating in RBF development has worked in OSS projects before. It's fine to explore ideas in code, collaboratively, in JIRA. Failing to signal which JIRAs are variations on a theme (e.g., protobuf boilerplate), prototypes, or features affecting non-RBF: that's not OK. Reviewers can't follow every JIRA; they need help finding the relevant signal. Your confidence that people working on RBF are applying reasonable filters and *soliciting* others' opinion is extremely important. From a random sampling, that seems to be happening. Reviewing the code in issues [~arpitagarwal] cited... they may "look non-trivial at a glance", but after a slightly longer glance, they look pretty straightforward to me. Or at least, they follow from the design of the router. bq. Perhaps there is a communication problem here. I am not sure where your assumption comes from; reading the comments on the security patch, I am not able to come to that conclusion. Please take a look at HDFS-12284. [...] If we both are reading the same comments, I am at a loss on how you came to the conclusion that it was a proposal and not a patch. Committing a patch that claims to add security, without asking for broader review, would be absurd. Reading a discussion about a lack of clarity on _delegation tokens_ and concluding that group believes its implementation is ready to merge... that requires more assumptions about those developers' intent and competence than to conclude "prototype". _However_, if it were being developed in a branch, that signal would be unambiguous. bq. It will benefit the RBF feature as well as future maintainers to have a design notes or a detailed change description beyond a 1-line summary because most are large patches. From the samples I read, this is a recurring problem. 
Many RBF JIRAs should have more prose around the design tradeoffs, not just comments on the code. Taking HDFS-13224 as an example, one would need to read the patch to understand what was implemented, and how. Again, most of these follow from the RBF design and the larger patch size often comes from PB boilerplate, but raising the salient details both for review and for future maintainers (I'm glad you brought this up) is not optional. > Router-based HDFS federation phase 2 > > > Key: HDFS-12615 > URL: https://issues.apache.org/jira/browse/HDFS-12615 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Labels: RBF > > This umbrella JIRA tracks set of improvements over the Router-based HDFS > federation (HDFS-10467). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail:
[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2
[ https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475338#comment-16475338 ] Chris Douglas commented on HDFS-12615: -- bq. Could you please do me a favor and move all future work and all the JIRAs not released into a branch? "All future work" is too broad, but developing complex features in branches is a fair ask. Would you mind looking through the JIRAs implementing quotas and multi-subcluster mounts, as an exercise? Reasonable people can disagree on minimum criteria requiring a feature branch, but calling a merge vote for 5-9 patches is probably excessive. It is important to distinguish between routine maintenance and significant features. For the latter, design docs should be written, experts' opinion solicited, and implementation should be in a branch if it requires more than a handful of patches. The project can't claim that RBF is secure, then issue a flurry of CVEs that could have been caught in design review. In that particular case, I assume the patch was shared for discussion, not commit. bq. As I said, I feel bad the way this project is handled, that is heaping lots of critical features without even design documents and all these features going directly into the trunk. [~anu], you should look at the code/JIRAs if you're going to make sweeping statements like this. Many features did have design documents. Most JIRAs were repairs/improvements to released functionality or straightforward extensions. The only data point you've cited is the number of subtasks in this JIRA, which is a clerical problem, not a stop-the-world emergency. Since some subset of these JIRAs are already in a release, even that criterion doesn't tie these JIRAs together. Your suggestion- moving JIRAs not part of a release out of this umbrella- makes sense. 
Let's wrap up this issue as whatever was released, then transition to (a) significant features in branches with subtasks and (b) normal maintenance in individual JIRAs. > Router-based HDFS federation phase 2 > > > Key: HDFS-12615 > URL: https://issues.apache.org/jira/browse/HDFS-12615 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Labels: RBF > > This umbrella JIRA tracks set of improvements over the Router-based HDFS > federation (HDFS-10467). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2
[ https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475098#comment-16475098 ] Chris Douglas commented on HDFS-12615: -- bq. Right now, this JIRA has become too big, and it has interwoven JIRAs(small and large) – like you mentioned the Quotas or Multi-subcluster mount points in the trunk. Agreed, the number of subtasks for this JIRA belies any coherent theme for "Phase II" items. The bug fixes, perf improvements, docs, unit tests, and API cleanup can continue on trunk. As with other modules, they can use tags for tracking. bq. I still think at this stage ( perhaps this should have been done much earlier) that we should open a new branch and revert the changes in trunk and 2.9. Going through the subtasks, this looks like normal development, just over-indexed. If a large feature (like security) relies on a web of JIRAs and/or impacts HDFS significantly, then as you say, that's easier to manage on a branch. > Router-based HDFS federation phase 2 > > > Key: HDFS-12615 > URL: https://issues.apache.org/jira/browse/HDFS-12615 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Labels: RBF > > This umbrella JIRA tracks set of improvements over the Router-based HDFS > federation (HDFS-10467). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-13284) Adjust criteria for LowRedundancyBlocks.QUEUE_VERY_LOW_REDUNDANCY
[ https://issues.apache.org/jira/browse/HDFS-13284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474990#comment-16474990 ] Chris Douglas edited comment on HDFS-13284 at 5/14/18 10:49 PM: Can you be more explicit about the conditions where these heuristics are (in)correct? The description documents the current behavior and the proposed change, but not what it resolves. If the problem is that blocks with only two replicas should be assigned higher priority than those with three (e.g., because you're seeing avoidable high-priority replications), would {{((curReplicas * 3) < expectedReplicas || curReplicas == 2)}} resolve this? Items added to the distributed cache with high replication may be over-prioritized, if we change the heuristic from 1:3 to 1:2. The patch should also update the javadoc, which documents the priority assignment. 
> Adjust criteria for LowRedundancyBlocks.QUEUE_VERY_LOW_REDUNDANCY > - > > Key: HDFS-13284 > URL: https://issues.apache.org/jira/browse/HDFS-13284 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namenode >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Major > Attachments: HDFS-13284.000.patch, HDFS-13284.001.patch > > > LowRedundancyBlocks currently has 5 priority queues: > QUEUE_HIGHEST_PRIORITY = 0 - reserved for last > replica blocks > QUEUE_VERY_LOW_REDUNDANCY = 1 - *if ((curReplicas * 3) < > expectedReplicas)* > QUEUE_LOW_REDUNDANCY = 2 - the rest > QUEUE_REPLICAS_BADLY_DISTRIBUTED = 3 > QUEUE_WITH_CORRUPT_BLOCKS = 4 > The problem lies in QUEUE_VERY_LOW_REDUNDANCY. Currently, a block that has > curReplicas=2 and expectedReplicas=4 is treated the same as a block with > curReplicas=3 and expectedReplicas=4. A block with 2/3 replicas is also put > into QUEUE_LOW_REDUNDANCY. > The proposal is to change the *{{if ((curReplicas * 3) < expectedReplicas)}}* > check to *{{if ((curReplicas * 2) <= expectedReplicas || curReplicas == 2)}}* > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
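The current and proposed checks can be compared side by side in a small sketch. The constants and thresholds mirror the description, but this is not the actual {{LowRedundancyBlocks}} code (in particular, the highest-priority case is simplified):

```java
// Sketch comparing the two QUEUE_VERY_LOW_REDUNDANCY heuristics from
// the description. Not the real BlockManager logic: the last-replica
// (highest priority) case is reduced to curReplicas == 1 here.
public class RedundancyPriority {
  static final int QUEUE_HIGHEST_PRIORITY = 0;
  static final int QUEUE_VERY_LOW_REDUNDANCY = 1;
  static final int QUEUE_LOW_REDUNDANCY = 2;

  // Current check: fewer than a third of expected replicas present.
  static int currentPriority(int curReplicas, int expectedReplicas) {
    if (curReplicas == 1) return QUEUE_HIGHEST_PRIORITY;
    return (curReplicas * 3) < expectedReplicas
        ? QUEUE_VERY_LOW_REDUNDANCY : QUEUE_LOW_REDUNDANCY;
  }

  // Proposed check: half or fewer of expected replicas, or exactly two.
  static int proposedPriority(int curReplicas, int expectedReplicas) {
    if (curReplicas == 1) return QUEUE_HIGHEST_PRIORITY;
    return (curReplicas * 2) <= expectedReplicas || curReplicas == 2
        ? QUEUE_VERY_LOW_REDUNDANCY : QUEUE_LOW_REDUNDANCY;
  }

  public static void main(String[] args) {
    // The case from the description: 2 of 4 replicas.
    System.out.println(currentPriority(2, 4));   // 2 (LOW_REDUNDANCY)
    System.out.println(proposedPriority(2, 4));  // 1 (VERY_LOW_REDUNDANCY)
    // A high-replication (distributed cache) item, 3 of 10 replicas.
    System.out.println(currentPriority(3, 10));  // 1 (VERY_LOW_REDUNDANCY)
  }
}
```

The sketch also makes the concern about high-replication items visible: under the proposed check, anything at or below half of a large expectedReplicas lands in the very-low queue, which is the over-prioritization the comment raises.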
[jira] [Updated] (HDFS-13272) DataNodeHttpServer to have configurable HttpServer2 threads
[ https://issues.apache.org/jira/browse/HDFS-13272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-13272: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.9.2 Status: Resolved (was: Patch Available) > DataNodeHttpServer to have configurable HttpServer2 threads > --- > > Key: HDFS-13272 > URL: https://issues.apache.org/jira/browse/HDFS-13272 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Fix For: 2.9.2 > > Attachments: HDFS-13272-branch-2.000.patch, > HDFS-13272-branch-2.001.patch, testout-HDFS-13272.001, testout-branch-2 > > > In HDFS-7279, the Jetty server on the DataNode was hard-coded to use 10 > threads. In addition to the possibility of this being too few threads, it is > much higher than necessary in resource constrained environments such as > MiniDFSCluster. To avoid compatibility issues, rather than using > {{HttpServer2#HTTP_MAX_THREADS}} directly, we can introduce a new > configuration for the DataNode's thread pool size. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13272) DataNodeHttpServer to have configurable HttpServer2 threads
[ https://issues.apache.org/jira/browse/HDFS-13272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16471243#comment-16471243 ] Chris Douglas commented on HDFS-13272: -- Thanks [~xkrogen], sorry for the delay. bq. There were no new failures with the patch applied and it took about the same time to run all of them {{TestClientProtocolForPipelineRecovery}} also fails on branch-2, the other tests... often pass. This doesn't seem to introduce any regressions. +1 I committed this > DataNodeHttpServer to have configurable HttpServer2 threads > --- > > Key: HDFS-13272 > URL: https://issues.apache.org/jira/browse/HDFS-13272 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Fix For: 2.9.2 > > Attachments: HDFS-13272-branch-2.000.patch, > HDFS-13272-branch-2.001.patch, testout-HDFS-13272.001, testout-branch-2 > > > In HDFS-7279, the Jetty server on the DataNode was hard-coded to use 10 > threads. In addition to the possibility of this being too few threads, it is > much higher than necessary in resource constrained environments such as > MiniDFSCluster. To avoid compatibility issues, rather than using > {{HttpServer2#HTTP_MAX_THREADS}} directly, we can introduce a new > configuration for the DataNode's thread pool size. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
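As an illustration only, such a setting would surface in hdfs-site.xml. The property name below is a placeholder for this sketch, not necessarily the key introduced by the HDFS-13272 patch:

```xml
<!-- Hypothetical key name for illustration; see the HDFS-13272 patch
     for the actual configuration property. A MiniDFSCluster-style test
     setup could lower this well below the previously hard-coded 10. -->
<property>
  <name>dfs.datanode.http.max.threads</name>
  <value>10</value>
  <description>Thread pool size for the DataNode's embedded Jetty
  server, replacing the hard-coded value from HDFS-7279.</description>
</property>
```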
[jira] [Commented] (HDFS-13186) [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations
[ https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461912#comment-16461912 ] Chris Douglas commented on HDFS-13186: -- Thanks for the updated patch. Minor comments still apply, but let's work out the high-level first. bq. MPU is distinct from serial copy since we can upload the data out of order. I don't see the use case for a version that breaks this. In this case, I think the boiler plate is correct: try to make an MPU; if we can't then fall back to FileSystem::write Fair enough. If the caller controls the partitioning, then it can also decide whether the copy should be serial or parallel (e.g., for small files). However, this also requires the caller to implement recovery paths for both cases. So a user of this library would need to persist their {{UploadHandle}} to abort the call for a MPU, and bespoke metadata to recover failed, serial copies. That's not unreasonable, but it means that the client will probably fall back either to copy-in-place/renaming to promote completed output, which (as we've seen with S3/WASB and HDFS) doesn't have a good, default answer. If the MPU were just a U, then this library could also handle promotion in a FS-efficient way. All that said, if this is not intended as a small framework, but rather a common API to parallel upload machinery across S3/WASB and HDFS, then this is perfectly fine. To be a general framework, this would also need a partitioning strategy (with overrides) to generate splits, which resolve to {{PartHandle}} objects. Probably not worth it. bq. So the numbering of the parts would be internal to the part handle when downcasting in the different types. This means we could no longer rely on the fact that it's just a String and would then have to introduce all the protobuf machinery. That's possible; just wanted to highlight it before I go away and implement that. bq. 
the filePath needs to follow the upload around because various nodes that are calling put part will need to know what type of MultipartUploader to make bq. If we need to serialize anything, I think we should have it be explicitly done through protobuf As with {{PathHandle}} implementations, PB might be useful, but not required. But your point is well taken. Since this would be passed across multiple nodes that may include different versions of the impl's MPU on the classpath, that could create some unnecessarily complicated logic for the MPU implementer. Let's stick with the approach in v004. > [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations > - > > Key: HDFS-13186 > URL: https://issues.apache.org/jira/browse/HDFS-13186 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ewan Higgs >Assignee: Ewan Higgs >Priority: Major > Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, > HDFS-13186.003.patch, HDFS-13186.004.patch > > > To write files in parallel to an external storage system as in HDFS-12090, > there are two approaches: > # Naive approach: use a single datanode per file that copies blocks locally > as it streams data to the external service. This requires a copy for each > block inside the HDFS system and then a copy for the block to be sent to the > external system. > # Better approach: Single point (e.g. Namenode or SPS style external client) > and Datanodes coordinate in a multipart - multinode upload. > This system needs to work with multiple back ends and needs to coordinate > across the network. 
So we propose an API that resembles the following: > {code:java} > public UploadHandle multipartInit(Path filePath) throws IOException; > public PartHandle multipartPutPart(InputStream inputStream, > int partNumber, UploadHandle uploadId) throws IOException; > public void multipartComplete(Path filePath, > List> handles, > UploadHandle multipartUploadId) throws IOException;{code} > Here, UploadHandle and PartHandle are opaque handlers in the vein of > PathHandle so they can be serialized and deserialized in hadoop-hdfs project > without knowledge of how to deserialize e.g. S3A's version of a UpoadHandle > and PartHandle. > In an object store such as S3A, the implementation is straight forward. In > the case of writing multipart/multinode to HDFS, we can write each block as a > file part. The complete call will perform a concat on the blocks. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
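The out-of-order contract the comment relies on can be sketched with a toy, in-memory store. Method names mirror the API quoted in the description, but none of this is Hadoop or S3A code:

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the multipart contract: parts may arrive in any order
// (e.g. from different datanodes), and complete() assembles them by
// part number. UploadHandle is modeled as an opaque string.
public class ToyMultipartUploader {
  private final Map<String, TreeMap<Integer, byte[]>> uploads =
      new ConcurrentHashMap<>();

  public String multipartInit(String filePath) {
    String uploadId = filePath + "#" + UUID.randomUUID();
    uploads.put(uploadId, new TreeMap<>());
    return uploadId;  // opaque handle the caller must persist
  }

  public void multipartPutPart(byte[] data, int partNumber, String uploadId) {
    uploads.get(uploadId).put(partNumber, data);  // out-of-order is fine
  }

  public byte[] multipartComplete(String uploadId) {
    TreeMap<Integer, byte[]> parts = uploads.remove(uploadId);
    int len = parts.values().stream().mapToInt(b -> b.length).sum();
    byte[] out = new byte[len];
    int off = 0;
    for (byte[] p : parts.values()) {  // TreeMap iterates in part order
      System.arraycopy(p, 0, out, off, p.length);
      off += p.length;
    }
    return out;
  }

  public static void main(String[] args) {
    ToyMultipartUploader up = new ToyMultipartUploader();
    String id = up.multipartInit("/data/file");
    up.multipartPutPart("world".getBytes(StandardCharsets.UTF_8), 2, id);  // part 2 first
    up.multipartPutPart("hello ".getBytes(StandardCharsets.UTF_8), 1, id);
    System.out.println(new String(up.multipartComplete(id), StandardCharsets.UTF_8));
    // prints "hello world"
  }
}
```

The serial-copy fallback the comment accepts amounts to skipping this machinery entirely and writing one ordered stream, which is why the caller must keep either the upload handle (for abort/recovery) or its own metadata (for the serial path).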
[jira] [Commented] (HDFS-13469) RBF: Support InodeID in the Router
[ https://issues.apache.org/jira/browse/HDFS-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459854#comment-16459854 ] Chris Douglas commented on HDFS-13469: -- bq. We can't close a serious oversight/regression in the router by essentially throwing TooHardToSupportException. Exposing and addressing inodes through a magic path is not just hard to extend, it's impossible without compromising compatibility. Option one, we bypass the restricted ID space by changing the abstraction: add explicit, opaque handles to files/directories and use those for directory enumeration/traversal, fetching block locations, etc. Option two, we hack it into inodeid and continue using the magic path. If there are other options then let's explore them, but the constraints are understood and don't need to be belabored. HDFS-7878 was attempting to introduce option one. It's only in 3.1.0, so if it requires extension/modification to serve that purpose then let's do it. Option two has tradeoffs, as in HDFS-10867. How would you like to address this for RBF? > RBF: Support InodeID in the Router > -- > > Key: HDFS-13469 > URL: https://issues.apache.org/jira/browse/HDFS-13469 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Priority: Major > > The Namenode supports identifying files through inode identifiers. > Currently the Router does not handle this properly, we need to add this > functionality. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13381) [SPS]: Use DFSUtilClient#makePathFromFileId() to prepare satisfier file path
[ https://issues.apache.org/jira/browse/HDFS-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459263#comment-16459263 ] Chris Douglas commented on HDFS-13381: -- bq. We need one unified implementation that uses minimal external calls that can be abstracted in the context to either make rpc or fsn calls Do you have a particular refactoring in mind? In this case, the separate implementations batch their requests differently, because the internal implementation uses {{FSTreeTraverser}}. Looking at the patch, one could make the two implementations more similar, but not significantly more modular or easier to maintain. IIRC, each context implements "queue up the next batch of work". Your point on maintenance is well taken, but {{FileCollector}} looks like a vestigial abstraction. It's not actually pluggable; the internal/external contexts put their scan logic into a class, to match the EC pattern. We are only talking about two implementations of an explicitly internal (i.e., {{Unstable}}) API... Is the problem with the {{FSTreeTraverser}} pattern, generally? You've stated the goal clearly, but if the traversal is generalized then the internal context will do no batching under the lock. Did anyone, including the EC folks, publish any benchmarks/testing that justify this abstraction? If the impl benefits from it, then as a specialization it has merit, but the benefit should be significant. How should we test that? > [SPS]: Use DFSUtilClient#makePathFromFileId() to prepare satisfier file path > > > Key: HDFS-13381 > URL: https://issues.apache.org/jira/browse/HDFS-13381 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Rakesh R >Assignee: Rakesh R >Priority: Major > Attachments: HDFS-13381-HDFS-10285-00.patch, > HDFS-13381-HDFS-10285-01.patch > > > This Jira task will address the following comments: > # Use DFSUtilClient::makePathFromFileId, instead of generics (one for string > path and another for inodeId) like today. 
> # Only the context impl differs for external/internal sps. Here, it can > simply move FileCollector and BlockMoveTaskHandler to Context interface. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
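As background for the {{makePathFromFileId}} suggestion above: the helper builds an inode-id-based path under HDFS's reserved tree, so a single path-typed code path can address a file by either name or id. A minimal sketch of that mapping (constants inlined and the class name invented for illustration; the real helper lives in {{DFSUtilClient}} in hadoop-hdfs-client):

```java
// Sketch of the path-from-inode-id mapping: HDFS resolves paths under the
// reserved "/.reserved/.inodes/<id>" prefix back to the inode, so a file can
// be addressed without carrying its string path.
public class InodePathSketch {
  static String makePathFromFileId(long fileId) {
    return "/.reserved/.inodes/" + fileId;
  }

  public static void main(String[] args) {
    // prints /.reserved/.inodes/16386
    System.out.println(makePathFromFileId(16386L));
  }
}
```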
[jira] [Commented] (HDFS-13495) RBF: Support Router Admin REST API
[ https://issues.apache.org/jira/browse/HDFS-13495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453285#comment-16453285 ] Chris Douglas commented on HDFS-13495: -- bq. do you have any insight on why there is no REST API for admin? Not sure. Often REST APIs are exposed through a gateway, like Knox. I don't know if these expose admin operations. IIRC there's also a servlet filter that rejects impersonation of admin/system accounts. [~lmc...@apache.org]? > RBF: Support Router Admin REST API > -- > > Key: HDFS-13495 > URL: https://issues.apache.org/jira/browse/HDFS-13495 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Mohammad Arshad >Priority: Major > Labels: RBF > > This JIRA intends to add REST API support for all admin commands. Router > Admin REST APIs can be useful in managing the Routers from a central > management layer tool. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13186) [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations
[ https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449420#comment-16449420 ] Chris Douglas commented on HDFS-13186: -- I like this design. The {{MultipartUploader}} avoids adding new hooks to {{FileSystem}}, without losing generality or pluggability. High level: * The current impl doesn't define a default for {{FileSystem}} implementations, which could be a serial copy. Instead, it throws an exception. Utilities (like {{FsShell}} or YARN) need to implement some boilerplate for both paths, rather than using a single path that falls back to a serial upload. * Some implementations might benefit from an explicit {{MultipartUploader::abort}}, which may clean up the partial upload. Clearly it can't be guaranteed, but we'd like the property that an {{UploadHandle}} persisted to a WAL could be used for cleanup. * The {{PartHandle}} could retain its ID, rather than providing a {{Pair}} to {{commit}}. This might make repartitioning difficult, i.e., splitting a slow {{PartHandle}}, but implementations could provide custom handling if that's important. It would be sufficient for the {{PartHandle}} to be {{Comparable}}, though equality should be treated either as a duplicate or an error at {{complete}} by the {{MultipartUploader}}. * Does the {{UploadHandle}} init ever vary, depending on the src? Intra-FS copies? * Right now, the utility doesn't offer an API to partition a file, or to create (bounded) {{InputStream}} args to {{putPart}}. Minor: {{MultipartUploader}} * IIRC {{ClassUtil::findContainingJar}} is expensive; guard using {{if (LOG.isDebugEnabled)}} * Might be worth warning if two {{MultipartUploader}} impls report the same scheme * The typical {{ServiceLoader}} pattern returns a factory object that produces instances, rather than matching the class and producing instances by reflection. This way, the factory instance can make some additional checks and/or handle scheme collisions by chaining. 
It also avoids the {{initialize}} pattern, since the factory can invoke the cstr. The lookup is slower (a scan of loaded uploaders, vs. a map lookup), but acceptable for a small number of instances. * Do {{putPart}} and {{complete}} need the {{filePath}} parameter? * Rename {{MultipartUploader}} to {{Uploader}}? {{BBPartHandle}} * While it's unlikely to be an issue, direct {{ByteBuffer}}s don't support {{array()}}. Is this to support {{Serializable}}? > [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations > - > > Key: HDFS-13186 > URL: https://issues.apache.org/jira/browse/HDFS-13186 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ewan Higgs >Assignee: Ewan Higgs >Priority: Major > Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, > HDFS-13186.003.patch > > > To write files in parallel to an external storage system as in HDFS-12090, > there are two approaches: > # Naive approach: use a single datanode per file that copies blocks locally > as it streams data to the external service. This requires a copy for each > block inside the HDFS system and then a copy for the block to be sent to the > external system. > # Better approach: Single point (e.g. Namenode or SPS style external client) > and Datanodes coordinate in a multipart - multinode upload. > This system needs to work with multiple back ends and needs to coordinate > across the network. So we propose an API that resembles the following: > {code:java} > public UploadHandle multipartInit(Path filePath) throws IOException; > public PartHandle multipartPutPart(InputStream inputStream, > int partNumber, UploadHandle uploadId) throws IOException; > public void multipartComplete(Path filePath, > List<Pair<Integer, PartHandle>> handles, > UploadHandle multipartUploadId) throws IOException;{code} > Here, UploadHandle and PartHandle are opaque handles in the vein of > PathHandle, so they can be serialized and deserialized in the hadoop-hdfs project > without knowledge of how to deserialize e.g. 
S3A's version of an UploadHandle > and PartHandle. > In an object store such as S3A, the implementation is straightforward. In > the case of writing multipart/multinode to HDFS, we can write each block as a > file part. The complete call will perform a concat on the blocks. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
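On the {{BBPartHandle}} question above: direct {{ByteBuffer}}s indeed report {{hasArray() == false}}, so serializing a handle via {{array()}} only works for heap buffers. A small illustration, with a hypothetical {{toBytes}} helper showing the copy a direct buffer would require:

```java
import java.nio.ByteBuffer;

// Demonstrates the BBPartHandle caveat: a heap buffer exposes its backing
// array, a direct buffer does not, so serialization code calling array()
// must handle direct buffers by copying their contents out.
public class DirectBufferCheck {
  static byte[] toBytes(ByteBuffer bb) {
    byte[] out = new byte[bb.remaining()];
    if (bb.hasArray()) {
      // Heap buffer: copy straight from the backing array.
      System.arraycopy(bb.array(), bb.arrayOffset() + bb.position(),
          out, 0, out.length);
    } else {
      // Direct buffer: no array() available; read through a duplicate so
      // the original buffer's position is left untouched.
      bb.duplicate().get(out);
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(ByteBuffer.allocate(8).hasArray());        // true
    System.out.println(ByteBuffer.allocateDirect(8).hasArray());  // false
  }
}
```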
[jira] [Commented] (HDFS-13283) Percentage based Reserved Space Calculation for DataNode
[ https://issues.apache.org/jira/browse/HDFS-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448701#comment-16448701 ] Chris Douglas commented on HDFS-13283: -- [~lukmajercak], could you regenerate the patch? +1 on the changes > Percentage based Reserved Space Calculation for DataNode > > > Key: HDFS-13283 > URL: https://issues.apache.org/jira/browse/HDFS-13283 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, hdfs >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Major > Attachments: HDFS-13283.000.patch, HDFS-13283.001.patch, > HDFS-13283.002.patch, HDFS-13283.003.patch, HDFS-13283.004.patch, > HDFS-13283.005.patch, HDFS-13283.006.patch > > > Currently, the only way to configure reserved disk space for non-HDFS data on > a DataNode is a constant value via {{dfs.datanode.du.reserved}}. This can be > an issue in heterogeneous clusters where sizes of DNs can differ. The > proposed solution is to allow percentage-based configuration (and their > combination): > # ABSOLUTE > ** based on an absolute amount of reserved space > # PERCENTAGE > ** based on a percentage of the total capacity in the storage > # CONSERVATIVE > ** calculates both of the above and takes the one that will yield more > reserved space > # AGGRESSIVE > ** calculates 1. and 2. and takes the one that will yield less reserved space > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
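The four reservation policies in the HDFS-13283 description reduce to a max/min over the absolute and percentage calculations. A minimal sketch of the arithmetic (class and method names invented for illustration; the real code is the {{ReservedSpaceCalculator}} hierarchy in the patch):

```java
// Sketch of the four reserved-space policies: ABSOLUTE and PERCENTAGE are the
// two base calculations; CONSERVATIVE takes whichever reserves more,
// AGGRESSIVE whichever reserves less.
public class ReservedSpaceSketch {
  enum Policy { ABSOLUTE, PERCENTAGE, CONSERVATIVE, AGGRESSIVE }

  static long reserved(Policy p, long absoluteBytes, double pct, long capacity) {
    long byPct = (long) (capacity * pct / 100.0);
    switch (p) {
      case ABSOLUTE:     return absoluteBytes;
      case PERCENTAGE:   return byPct;
      case CONSERVATIVE: return Math.max(absoluteBytes, byPct); // more reserved
      case AGGRESSIVE:   return Math.min(absoluteBytes, byPct); // less reserved
      default: throw new IllegalArgumentException("unknown policy " + p);
    }
  }

  public static void main(String[] args) {
    long cap = 1000L;
    // With 300 bytes absolute vs. 10% of 1000 = 100 bytes:
    System.out.println(reserved(Policy.CONSERVATIVE, 300L, 10.0, cap)); // 300
    System.out.println(reserved(Policy.AGGRESSIVE, 300L, 10.0, cap));   // 100
  }
}
```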
[jira] [Updated] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory
[ https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-13408: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.3 2.9.2 3.1.1 3.2.0 2.10.0 Status: Resolved (was: Patch Available) I committed this. Thanks [~surmountian] > MiniDFSCluster to support being built on randomized base directory > -- > > Key: HDFS-13408 > URL: https://issues.apache.org/jira/browse/HDFS-13408 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Xiao Liang >Assignee: Xiao Liang >Priority: Major > Labels: windows > Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.3 > > Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch, > HDFS-13408.002.patch, HDFS-13408.003.patch, HDFS-13408.004.patch > > > Generated files of MiniDFSCluster during test are not properly cleaned in > Windows, which fails all subsequent test cases using the same default > directory (Windows does not allow other processes to delete them). By > migrating to randomized base directories, the conflict of test path of test > cases will be avoided, even if they are running at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10867) [PROVIDED Phase 2] Block Bit Field Allocation of Provided Storage
[ https://issues.apache.org/jira/browse/HDFS-10867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446483#comment-16446483 ] Chris Douglas commented on HDFS-10867: -- Looking through the patch for HDFS-13350, it looks like it's handling false positives where legacy blocks were treated as EC blocks, not the other way around. If we're already identifying legacy blocks correctly, then we need to avoid the same pitfall, but the rest of the code should be OK. > [PROVIDED Phase 2] Block Bit Field Allocation of Provided Storage > - > > Key: HDFS-10867 > URL: https://issues.apache.org/jira/browse/HDFS-10867 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs >Reporter: Ewan Higgs >Priority: Major > Attachments: Block Bit Field Allocation of Provided Storage.pdf > > > We wish to design and implement the following related features for provided > storage: > # Dynamic mounting of provided storage within a Namenode (mount, unmount) > # Mount multiple provided storage systems on a single Namenode. > # Support updates to the provided storage system without having to regenerate > an fsimg. > A mount in the namespace addresses a corresponding set of block data. When > unmounted, any block data associated with the mount becomes invalid and > (eventually) unaddressable in HDFS. As with erasure-coded blocks, efficient > unmounting requires that all blocks with that attribute be identifiable by > the block management layer > In this subtask, we focus on changes and conventions to the block management > layer. Namespace operations are covered in a separate subtask. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13272) DataNodeHttpServer to have configurable HttpServer2 threads
[ https://issues.apache.org/jira/browse/HDFS-13272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446469#comment-16446469 ] Chris Douglas commented on HDFS-13272: -- Thanks, [~kihwal]. [~xkrogen], does this unblock you on HDFS-13265? > DataNodeHttpServer to have configurable HttpServer2 threads > --- > > Key: HDFS-13272 > URL: https://issues.apache.org/jira/browse/HDFS-13272 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > > In HDFS-7279, the Jetty server on the DataNode was hard-coded to use 10 > threads. In addition to the possibility of this being too few threads, it is > much higher than necessary in resource constrained environments such as > MiniDFSCluster. To avoid compatibility issues, rather than using > {{HttpServer2#HTTP_MAX_THREADS}} directly, we can introduce a new > configuration for the DataNode's thread pool size. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory
[ https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446457#comment-16446457 ] Chris Douglas commented on HDFS-13408: -- For completeness, added v004 which considers a {{null}} base directory to the new cstr an error. > MiniDFSCluster to support being built on randomized base directory > -- > > Key: HDFS-13408 > URL: https://issues.apache.org/jira/browse/HDFS-13408 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Xiao Liang >Assignee: Xiao Liang >Priority: Major > Labels: windows > Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch, > HDFS-13408.002.patch, HDFS-13408.003.patch, HDFS-13408.004.patch > > > Generated files of MiniDFSCluster during test are not properly cleaned in > Windows, which fails all subsequent test cases using the same default > directory (Windows does not allow other processes to delete them). By > migrating to randomized base directories, the conflict of test path of test > cases will be avoided, even if they are running at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory
[ https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-13408: - Attachment: HDFS-13408.004.patch > MiniDFSCluster to support being built on randomized base directory > -- > > Key: HDFS-13408 > URL: https://issues.apache.org/jira/browse/HDFS-13408 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Xiao Liang >Assignee: Xiao Liang >Priority: Major > Labels: windows > Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch, > HDFS-13408.002.patch, HDFS-13408.003.patch, HDFS-13408.004.patch > > > Generated files of MiniDFSCluster during test are not properly cleaned in > Windows, which fails all subsequent test cases using the same default > directory (Windows does not allow other processes to delete them). By > migrating to randomized base directories, the conflict of test path of test > cases will be avoided, even if they are running at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory
[ https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446454#comment-16446454 ] Chris Douglas commented on HDFS-13408: -- bq. there should be file read/write conflicts between test cases using the same base dir and test data file names, if running in parallel. OK. Perhaps we could [use the ID from surefire|https://maven.apache.org/surefire/maven-surefire-plugin/examples/fork-options-and-parallel-execution.html] to avoid these collisions. We still need a solution for tests run sequentially, and this is a good workaround. bq. Regarding the warning for setting both approaches, I'm fine with it and I would even vote for an exception to avoid having inconsistencies Sounds right. Updated patch, which also rolls back the slf4j change. > MiniDFSCluster to support being built on randomized base directory > -- > > Key: HDFS-13408 > URL: https://issues.apache.org/jira/browse/HDFS-13408 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Xiao Liang >Assignee: Xiao Liang >Priority: Major > Labels: windows > Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch, > HDFS-13408.002.patch, HDFS-13408.003.patch > > > Generated files of MiniDFSCluster during test are not properly cleaned in > Windows, which fails all subsequent test cases using the same default > directory (Windows does not allow other processes to delete them). By > migrating to randomized base directories, the conflict of test path of test > cases will be avoided, even if they are running at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
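The surefire idea above could be sketched as follows: surefire exposes the fork id as the {{surefire.forkNumber}} system property when {{forkCount}} is configured, so a test base directory derived from it won't collide across parallel forks. The directory naming and fallback here are illustrative, not from the patch:

```java
import java.io.File;

// Sketch: derive a per-fork base directory for MiniDFSCluster-style tests.
// The "surefire.forkNumber" system property is set by Maven Surefire when
// forkCount is used; the default covers sequential (non-forked) runs.
public class ForkedBaseDir {
  static File baseDir(File root) {
    String fork = System.getProperty("surefire.forkNumber", "1");
    // Each fork gets its own subdirectory, so parallel forks never share
    // test paths even when they run the same test class.
    return new File(root, "minidfs-fork-" + fork);
  }

  public static void main(String[] args) {
    System.out.println(baseDir(new File("/tmp/test")).getPath());
  }
}
```

This sidesteps collisions between forks, but (as noted above) randomization is still needed against leftover directories from earlier sequential runs.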
[jira] [Updated] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory
[ https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-13408: - Attachment: HDFS-13408.003.patch > MiniDFSCluster to support being built on randomized base directory > -- > > Key: HDFS-13408 > URL: https://issues.apache.org/jira/browse/HDFS-13408 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Xiao Liang >Assignee: Xiao Liang >Priority: Major > Labels: windows > Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch, > HDFS-13408.002.patch, HDFS-13408.003.patch > > > Generated files of MiniDFSCluster during test are not properly cleaned in > Windows, which fails all subsequent test cases using the same default > directory (Windows does not allow other processes to delete them). By > migrating to randomized base directories, the conflict of test path of test > cases will be avoided, even if they are running at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13381) [SPS]: Use DFSUtilClient#makePathFromFileId() to prepare satisfier file path
[ https://issues.apache.org/jira/browse/HDFS-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16445356#comment-16445356 ] Chris Douglas commented on HDFS-13381: -- [~daryn]: to over-simplify [~rakeshr]'s description, the {{IntraSPSNameNodeFileIdCollector}} adds additional logic is for batching i.e., implementing {{FSTreeTraverser}} hooks, borrowing the pattern from encryption zones. The external/internal separation seems intact, as both {{FileCollector}} implementations are only accessible through their respective {{Context}}. This might be emphasized by removing the {{FileCollector}} interface, since it serves no particular purpose. Would that be sufficient? > [SPS]: Use DFSUtilClient#makePathFromFileId() to prepare satisfier file path > > > Key: HDFS-13381 > URL: https://issues.apache.org/jira/browse/HDFS-13381 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Rakesh R >Assignee: Rakesh R >Priority: Major > Attachments: HDFS-13381-HDFS-10285-00.patch, > HDFS-13381-HDFS-10285-01.patch > > > This Jira task will address the following comments: > # Use DFSUtilClient::makePathFromFileId, instead of generics(one for string > path and another for inodeId) like today. > # Only the context impl differs for external/internal sps. Here, it can > simply move FileCollector and BlockMoveTaskHandler to Context interface. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13283) Percentage based Reserved Space Calculation for DataNode
[ https://issues.apache.org/jira/browse/HDFS-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16445289#comment-16445289 ] Chris Douglas commented on HDFS-13283: -- Wrong syntax for the inner class in {{hdfs-default.xml}}. Fixed, trying again. > Percentage based Reserved Space Calculation for DataNode > > > Key: HDFS-13283 > URL: https://issues.apache.org/jira/browse/HDFS-13283 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, hdfs >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Major > Attachments: HDFS-13283.000.patch, HDFS-13283.001.patch, > HDFS-13283.002.patch, HDFS-13283.003.patch, HDFS-13283.004.patch, > HDFS-13283.005.patch > > > Currently, the only way to configure reserved disk space for non-HDFS data on > a DataNode is a constant value via {{dfs.datanode.du.reserved}}. This can be > an issue in heterogeneous clusters where sizes of DNs can differ. The > proposed solution is to allow percentage-based configuration (and their > combination): > # ABSOLUTE > ** based on an absolute amount of reserved space > # PERCENTAGE > ** based on a percentage of the total capacity in the storage > # CONSERVATIVE > ** calculates both of the above and takes the one that will yield more > reserved space > # AGGRESSIVE > ** calculates 1. and 2. and takes the one that will yield less reserved space > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13283) Percentage based Reserved Space Calculation for DataNode
[ https://issues.apache.org/jira/browse/HDFS-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-13283: - Attachment: HDFS-13283.005.patch > Percentage based Reserved Space Calculation for DataNode > > > Key: HDFS-13283 > URL: https://issues.apache.org/jira/browse/HDFS-13283 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, hdfs >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Major > Attachments: HDFS-13283.000.patch, HDFS-13283.001.patch, > HDFS-13283.002.patch, HDFS-13283.003.patch, HDFS-13283.004.patch, > HDFS-13283.005.patch > > > Currently, the only way to configure reserved disk space for non-HDFS data on > a DataNode is a constant value via {{dfs.datanode.du.reserved}}. This can be > an issue in heterogeneous clusters where sizes of DNs can differ. The > proposed solution is to allow percentage-based configuration (and their > combination): > # ABSOLUTE > ** based on an absolute amount of reserved space > # PERCENTAGE > ** based on a percentage of the total capacity in the storage > # CONSERVATIVE > ** calculates both of the above and takes the one that will yield more > reserved space > # AGGRESSIVE > ** calculates 1. and 2. and takes the one that will yield less reserved space > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory
[ https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-13408: - Attachment: HDFS-13408.002.patch > MiniDFSCluster to support being built on randomized base directory > -- > > Key: HDFS-13408 > URL: https://issues.apache.org/jira/browse/HDFS-13408 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Xiao Liang >Assignee: Xiao Liang >Priority: Major > Labels: windows > Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch, > HDFS-13408.002.patch > > > Generated files of MiniDFSCluster during test are not properly cleaned in > Windows, which fails all subsequent test cases using the same default > directory (Windows does not allow other processes to delete them). By > migrating to randomized base directories, the conflict of test path of test > cases will be avoided, even if they are running at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory
[ https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-13408: - Attachment: HDFS-13408.002.patch > MiniDFSCluster to support being built on randomized base directory > -- > > Key: HDFS-13408 > URL: https://issues.apache.org/jira/browse/HDFS-13408 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Xiao Liang >Assignee: Xiao Liang >Priority: Major > Labels: windows > Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch > > > Generated files of MiniDFSCluster during test are not properly cleaned in > Windows, which fails all subsequent test cases using the same default > directory (Windows does not allow other processes to delete them). By > migrating to randomized base directories, the conflict of test path of test > cases will be avoided, even if they are running at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory
[ https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-13408: - Attachment: (was: HDFS-13408.002.patch) > MiniDFSCluster to support being built on randomized base directory > -- > > Key: HDFS-13408 > URL: https://issues.apache.org/jira/browse/HDFS-13408 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Xiao Liang >Assignee: Xiao Liang >Priority: Major > Labels: windows > Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch > > > Generated files of MiniDFSCluster during test are not properly cleaned in > Windows, which fails all subsequent test cases using the same default > directory (Windows does not allow other processes to delete them). By > migrating to randomized base directories, the conflict of test path of test > cases will be avoided, even if they are running at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13283) Percentage based Reserved Space Calculation for DataNode
[ https://issues.apache.org/jira/browse/HDFS-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16445046#comment-16445046 ] Chris Douglas commented on HDFS-13283: -- Rebased patch. Minor changes: * Renamed static, inner class {{ReservedSpaceCalculatorBuilder}} to {{Builder}} * Changed invalid spec to throw {{IllegalArgumentException}} instead of falling back to absolute * Changed {{ReservedSpaceCalculatorAggressive}} to extend {{ReservedSpaceCalculator}} rather than another, similar option * Moved creation of {{DF}} into the {{Builder}} Missed [~virajith]'s comments on {{hdfs-default}}, +1 after those are addressed. > Percentage based Reserved Space Calculation for DataNode > > > Key: HDFS-13283 > URL: https://issues.apache.org/jira/browse/HDFS-13283 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, hdfs >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Major > Attachments: HDFS-13283.000.patch, HDFS-13283.001.patch, > HDFS-13283.002.patch, HDFS-13283.003.patch, HDFS-13283.004.patch > > > Currently, the only way to configure reserved disk space for non-HDFS data on > a DataNode is a constant value via {{dfs.datanode.du.reserved}}. This can be > an issue in heterogeneous clusters where sizes of DNs can differ. The > proposed solution is to allow percentage-based configuration (and their > combination): > # ABSOLUTE > ** based on an absolute amount of reserved space > # PERCENTAGE > ** based on a percentage of the total capacity in the storage > # CONSERVATIVE > ** calculates both of the above and takes the one that will yield more > reserved space > # AGGRESSIVE > ** calculates 1. and 2. and takes the one that will yield less reserved space > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13283) Percentage based Reserved Space Calculation for DataNode
[ https://issues.apache.org/jira/browse/HDFS-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-13283: - Attachment: HDFS-13283.004.patch > Percentage based Reserved Space Calculation for DataNode > > > Key: HDFS-13283 > URL: https://issues.apache.org/jira/browse/HDFS-13283 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, hdfs >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Major > Attachments: HDFS-13283.000.patch, HDFS-13283.001.patch, > HDFS-13283.002.patch, HDFS-13283.003.patch, HDFS-13283.004.patch > > > Currently, the only way to configure reserved disk space for non-HDFS data on > a DataNode is a constant value via {{dfs.datanode.du.reserved}}. This can be > an issue in heterogeneous clusters where sizes of DNs can differ. The > proposed solution is to allow percentage-based configuration (and their > combination): > # ABSOLUTE > ** based on an absolute amount of reserved space > # PERCENTAGE > ** based on a percentage of the total capacity in the storage > # CONSERVATIVE > ** calculates both of the above and takes the one that will yield more > reserved space > # AGGRESSIVE > ** calculates 1. and 2. and takes the one that will yield less reserved space > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory
[ https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444963#comment-16444963 ] Chris Douglas commented on HDFS-13408: -- bq. we are changing to logger here; is it fine to do it here given we want to commit all the way to branch-2.9? Oh, if other projects/tools write to {{MiniDFSCluster.LOG}}? The field is {{private}} and we use the slf4j bindings in branch-2 e.g., HDFS-15394... but it's not necessary, I just changed it by habit. Should I change it back? > MiniDFSCluster to support being built on randomized base directory > -- > > Key: HDFS-13408 > URL: https://issues.apache.org/jira/browse/HDFS-13408 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Xiao Liang >Assignee: Xiao Liang >Priority: Major > Labels: windows > Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch > > > Generated files of MiniDFSCluster during test are not properly cleaned in > Windows, which fails all subsequent test cases using the same default > directory (Windows does not allow other processes to delete them). By > migrating to randomized base directories, the conflict of test path of test > cases will be avoided, even if they are running at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory
[ https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444963#comment-16444963 ] Chris Douglas edited comment on HDFS-13408 at 4/19/18 11:10 PM: bq. we are changing to logger here; is it fine to do it here given we want to commit all the way to branch-2.9? Oh, if other projects/tools write to {{MiniDFSCluster.LOG}}? The field is {{private}} and we use the slf4j bindings in branch-2 e.g., HADOOP-15394... but it's not necessary, I just changed it by habit. Should I change it back? was (Author: chris.douglas): bq. we are changing to logger here; is it fine to do it here given we want to commit all the way to branch-2.9? Oh, if other projects/tools write to {{MiniDFSCluster.LOG}}? The field is {{private}} and we use the slf4j bindings in branch-2 e.g., HDFS-15394... but it's not necessary, I just changed it by habit. Should I change it back? > MiniDFSCluster to support being built on randomized base directory > -- > > Key: HDFS-13408 > URL: https://issues.apache.org/jira/browse/HDFS-13408 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Xiao Liang >Assignee: Xiao Liang >Priority: Major > Labels: windows > Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch > > > Generated files of MiniDFSCluster during test are not properly cleaned in > Windows, which fails all subsequent test cases using the same default > directory (Windows does not allow other processes to delete them). By > migrating to randomized base directories, the conflict of test path of test > cases will be avoided, even if they are running at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory
[ https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-13408: - Attachment: HDFS-13408.001.patch > MiniDFSCluster to support being built on randomized base directory > -- > > Key: HDFS-13408 > URL: https://issues.apache.org/jira/browse/HDFS-13408 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Xiao Liang >Assignee: Xiao Liang >Priority: Major > Labels: windows > Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch > > > Generated files of MiniDFSCluster during test are not properly cleaned in > Windows, which fails all subsequent test cases using the same default > directory (Windows does not allow other processes to delete them). By > migrating to randomized base directories, the conflict of test path of test > cases will be avoided, even if they are running at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory
[ https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444921#comment-16444921 ] Chris Douglas commented on HDFS-13408: -- Sorry for the delay. Would the failing tests also fail to run in parallel, even on non-Windows platforms? Is it clear why cleanup fails on Windows? [~surmountian]: just to clarify/document the problem this is solving, many tests rely on the default base directory by hard-coding the static {{MiniDFSCluster::getBaseDirectory}}, assuming it will match the base directory of the {{MiniDFSCluster}}. To avoid breaking tests outside the project that use this pattern, this JIRA proposes to add optional randomization to the {{Builder}}, then fix failing tests by enabling this option (e.g., HDFS-13336). Is that correct? Adding {{getRandomizedDfsBaseDir}} to the {{Builder}} requires the creator to retain a reference to the {{Builder}} or the {{Configuration}}. It might be cleaner to pass an optional base directory into the {{Builder}} constructor, which defaults to {{getBaseDirectory}}, then explicitly use {{GenericTestUtils::getRandomizedTempPath}} from tests. Attaching a sketch as v001, but +1 on v000 if you prefer it. bq. After creating the random DFS Base Dir you set that value to HDFS_MINIDFS_BASEDIR. What if we have already this value set? Should this be an error/warning? If the caller is asking that the base directory be both randomized and set explicitly, those directives contradict each other. 
> MiniDFSCluster to support being built on randomized base directory > -- > > Key: HDFS-13408 > URL: https://issues.apache.org/jira/browse/HDFS-13408 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Xiao Liang >Assignee: Xiao Liang >Priority: Major > Labels: windows > Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch > > > Generated files of MiniDFSCluster during test are not properly cleaned in > Windows, which fails all subsequent test cases using the same default > directory (Windows does not allow other processes to delete them). By > migrating to randomized base directories, the conflict of test path of test > cases will be avoided, even if they are running at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
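The v001 idea above — an optional base directory passed to the {{Builder}} constructor, defaulting to the conventional location, with tests opting into randomization explicitly — can be sketched as a toy builder. All names here are stand-ins, not the real {{MiniDFSCluster}}/{{GenericTestUtils}} API:

```java
import java.io.File;
import java.util.UUID;

// Toy sketch only: the real MiniDFSCluster.Builder has many more options.
public class ClusterBuilderSketch {
  private final String baseDir;

  // Caller supplies an explicit base directory...
  public ClusterBuilderSketch(String baseDir) {
    this.baseDir = baseDir;
  }

  // ...or the builder falls back to the fixed default.
  public ClusterBuilderSketch() {
    this(defaultBaseDirectory());
  }

  // Stand-in for MiniDFSCluster.getBaseDirectory().
  static String defaultBaseDirectory() {
    return "target" + File.separator + "test" + File.separator + "data";
  }

  // Stand-in for GenericTestUtils.getRandomizedTempPath(): tests that need
  // isolation construct a randomized path and pass it in themselves.
  static String randomizedTempPath() {
    return defaultBaseDirectory() + File.separator + UUID.randomUUID();
  }

  public String getBaseDir() {
    return baseDir;
  }
}
```

The point of this shape is that randomization is the caller's explicit choice, so tests that hard-code the default base directory keep working, and the "randomized plus explicitly set" contradiction cannot arise.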
[jira] [Commented] (HDFS-13469) RBF: Support InodeID in the Router
[ https://issues.apache.org/jira/browse/HDFS-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1637#comment-1637 ] Chris Douglas commented on HDFS-13469: -- {quote} It's really as simple as: the path /.reserved/.inodes/123 will access inode 123 which may really be /user/daryn/dir/mystuff. The nfs implementation relies on inode paths. HDFS-7878 cannot solve the problem because its an abstraction for a different purpose. {quote} Agreed, but HDFS-7878 is implemented using the inode path. If the router doesn't encode the namespace in the INodeID, then it could be a payload in a {{PathHandle}}. Right now, that covers only one case for the inode path i.e., {{open}}. [~elgoiri] likely raised it because we could add handling for other operations. That might recommend moving the logic from the client to the NN. The router could remap the INodeIDs to encode the namespace, but if the namespace is rebalanced, then INodeIDs could become invalid even when the destination still "exists". Clients' interpretation of the error as a deleted file would be incompatible. The router would need to partition the INodeID space among all the NNs. If a client persists an INodeID and expects it to remain valid- because that works for a single NN- that partitioning would need to account for every NN that had ever registered with the router. This might still be the better approach, since it has a much lower implementation cost and covers more cases, but it has tradeoffs. bq. For now, it might be good to throw an unsupported exception if we get accesses to an inode path. +1 > RBF: Support InodeID in the Router > -- > > Key: HDFS-13469 > URL: https://issues.apache.org/jira/browse/HDFS-13469 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Priority: Major > > The Namenode supports identifying files through inode identifiers. > Currently the Router does not handle this properly, we need to add this > functionality. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
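To make the ID-remapping idea above concrete: the router could reserve a few high bits of the inode ID for a name-service index. The 8-bit width and the method names below are arbitrary, illustrative choices, not the actual RBF design:

```java
// Sketch of partitioning the inode-ID space among name services by packing
// a name-service index into the high bits of a positive long.
public class InodeIdPartition {
  static final int NS_BITS = 8;            // up to 256 name services
  static final int SHIFT = 63 - NS_BITS;   // leave the sign bit clear
  static final long INODE_MASK = (1L << SHIFT) - 1;

  // Combine a name-service index with a per-namespace inode ID.
  static long encode(int nsIndex, long inodeId) {
    if (nsIndex < 0 || nsIndex >= (1 << NS_BITS)) {
      throw new IllegalArgumentException("name-service index out of range");
    }
    return ((long) nsIndex << SHIFT) | (inodeId & INODE_MASK);
  }

  // Recover which name service a router-level ID belongs to.
  static int namespaceOf(long routerInodeId) {
    return (int) (routerInodeId >>> SHIFT);
  }

  // Recover the original per-namespace inode ID.
  static long inodeOf(long routerInodeId) {
    return routerInodeId & INODE_MASK;
  }
}
```

As noted above, such static packing has exactly the tradeoffs discussed: it breaks when subtrees are rebalanced across name services, and the index space must account for every NN that ever registered.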
[jira] [Commented] (HDFS-13311) RBF: TestRouterAdminCLI#testCreateInvalidEntry fails on Windows
[ https://issues.apache.org/jira/browse/HDFS-13311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441670#comment-16441670 ] Chris Douglas commented on HDFS-13311: -- bq. Chris Douglas, should I move this to HADOOP? Sorry, missed this. In the future, renaming the JIRA to describe the change (and citing the failure it's fixing) would probably be easier to work with, but it's not critical. > RBF: TestRouterAdminCLI#testCreateInvalidEntry fails on Windows > --- > > Key: HDFS-13311 > URL: https://issues.apache.org/jira/browse/HDFS-13311 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Minor > Labels: RBF, windows > Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.3 > > Attachments: HDFS-13311.000.patch > > > The Windows runs show that TestRouterAdminCLI#testCreateInvalidEntry fail > with NPE: > {code} > [ERROR] > testCreateInvalidEntry(org.apache.hadoop.hdfs.server.federation.router.TestRouterAdminCLI) > Time elapsed: 0.008 s <<< ERROR! > java.lang.NullPointerException > at > org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(GenericOptionsParser.java:529) > at > org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:568) > at > org.apache.hadoop.util.GenericOptionsParser.(GenericOptionsParser.java:174) > at > org.apache.hadoop.util.GenericOptionsParser.(GenericOptionsParser.java:156) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouterAdminCLI.testCreateInvalidEntry(TestRouterAdminCLI.java:444) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13272) DataNodeHttpServer to have configurable HttpServer2 threads
[ https://issues.apache.org/jira/browse/HDFS-13272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441657#comment-16441657 ] Chris Douglas commented on HDFS-13272: -- Ping [~kihwal], [~xkrogen] > DataNodeHttpServer to have configurable HttpServer2 threads > --- > > Key: HDFS-13272 > URL: https://issues.apache.org/jira/browse/HDFS-13272 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > > In HDFS-7279, the Jetty server on the DataNode was hard-coded to use 10 > threads. In addition to the possibility of this being too few threads, it is > much higher than necessary in resource constrained environments such as > MiniDFSCluster. To avoid compatibility issues, rather than using > {{HttpServer2#HTTP_MAX_THREADS}} directly, we can introduce a new > configuration for the DataNode's thread pool size. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13469) RBF: Support InodeID in the Router
[ https://issues.apache.org/jira/browse/HDFS-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441564#comment-16441564 ] Chris Douglas commented on HDFS-13469: -- HDFS-7878 didn't add an identifier to the payload that could be used for this, though [~jingzhao] [suggested|https://issues.apache.org/jira/browse/HDFS-7878?focusedCommentId=15468143=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15468143] the name service ID. It would be a good sanity check, even in clusters that don't run a router. > RBF: Support InodeID in the Router > -- > > Key: HDFS-13469 > URL: https://issues.apache.org/jira/browse/HDFS-13469 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Priority: Major > > The Namenode supports identifying files through inode identifiers. > Currently the Router does not handle this properly, we need to add this > functionality. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10867) [PROVIDED Phase 2] Block Bit Field Allocation of Provided Storage
[ https://issues.apache.org/jira/browse/HDFS-10867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419934#comment-16419934 ] Chris Douglas commented on HDFS-10867: -- bq. Don't forget about legacy negative block ids that already aren't compatible with EC assumptions. You can't rob the upper bits of a block id. *nod* we saw HDFS-13350. Each partition increases the chances of a collision with a legacy ID (HDFS-898, particularly [~szetszwo]'s analysis). I'd stop short of considering partitioning as "robbing" the block ID namespace, though. We need a fix for EC blocks, and if our solution doesn't also cover this case, then it's probably incomplete. Since these features are only in 3.x, I would be comfortable adding a step to the upgrade guide that instructs users to run fsck and re-copy files with legacy block IDs (or never use these features). Of course, a tool or comprehensive solution would be better. When crossing a major version, we can stop supporting some cases. Requiring work from users to eliminate legacy block IDs as an ongoing concern... seems like an OK tradeoff, to me. > [PROVIDED Phase 2] Block Bit Field Allocation of Provided Storage > - > > Key: HDFS-10867 > URL: https://issues.apache.org/jira/browse/HDFS-10867 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs >Reporter: Ewan Higgs >Priority: Major > Attachments: Block Bit Field Allocation of Provided Storage.pdf > > > We wish to design and implement the following related features for provided > storage: > # Dynamic mounting of provided storage within a Namenode (mount, unmount) > # Mount multiple provided storage systems on a single Namenode. > # Support updates to the provided storage system without having to regenerate > an fsimg. > A mount in the namespace addresses a corresponding set of block data. When > unmounted, any block data associated with the mount becomes invalid and > (eventually) unaddressable in HDFS. 
As with erasure-coded blocks, efficient > unmounting requires that all blocks with that attribute be identifiable by > the block management layer > In this subtask, we focus on changes and conventions to the block management > layer. Namespace operations are covered in a separate subtask. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
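For concreteness on the legacy-ID point above: striped (erasure-coded) block groups are allocated from the negative ID range, while block IDs generated randomly before sequential allocation can also be negative, so a sign-bit test alone cannot tell the two apart. A toy illustration (not Hadoop code):

```java
// Both a legacy random block ID and a sequentially allocated striped-group
// ID can have the sign bit set; the bit field alone is ambiguous, which is
// the incompleteness argued in the comment above.
public class BlockIdBits {
  // A block whose sign bit is set *looks like* a striped block.
  static boolean looksStriped(long blockId) {
    return blockId < 0;
  }
}
```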
[jira] [Commented] (HDFS-13330) ShortCircuitCache#fetchOrCreate never retries
[ https://issues.apache.org/jira/browse/HDFS-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418209#comment-16418209 ] Chris Douglas commented on HDFS-13330: -- bq. Could you also fix the findbugs error? In this case, the findbugs warning is an artifact of the infinite loop. It infers (correctly) that the only way to exit the loop is for {{info}} to be assigned a non-null value. As you point out, it should also exit the loop before assigning {{info}} when {{replicaInfoMap.get(key)}} returns null, as [above|https://issues.apache.org/jira/browse/HDFS-13330?focusedCommentId=16412164=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16412164]. Since this retry code has never been enabled, this functionality is either unused, reliable "enough", or some other layer implements the retry logic. Changing this to a {{for}} loop with a low, fixed number of retries (i.e., 2, 3 at the most) is probably sufficient. We'd like to know if/how it's used in applications, but absent that analysis, fixing it to work "as designed" is probably the best we're going to do. bq. In addition, a test case for this method is greatly appreciated. +1 The retry semantics of the committed code were clearly untested. > ShortCircuitCache#fetchOrCreate never retries > - > > Key: HDFS-13330 > URL: https://issues.apache.org/jira/browse/HDFS-13330 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Wei-Chiu Chuang >Assignee: Gabor Bota >Priority: Major > Attachments: HDFS-13330.001.patch, HDFS-13330.002.patch > > > The follow do .. while(false) loop seems useless to me. The code intended to > retry but it never worked. Let's fix it. 
> {code:java:title=ShortCircuitCache#fetchOrCreate} > ShortCircuitReplicaInfo info = null; > do { > if (closed) { > LOG.trace("{}: can't fethchOrCreate {} because the cache is closed.", > this, key); > return null; > } > Waitable waitable = replicaInfoMap.get(key); > if (waitable != null) { > try { > info = fetch(key, waitable); > } catch (RetriableException e) { > LOG.debug("{}: retrying {}", this, e.getMessage()); > } > } > } while (false);{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
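A bounded {{for}} loop of the kind suggested above might look like this sketch; the types are simplified stand-ins, not the actual {{ShortCircuitCache}} code:

```java
// Simplified stand-ins for the real ShortCircuitCache types; only the
// bounded-retry shape is the point here.
public class BoundedRetrySketch {
  static class RetriableException extends Exception {
  }

  interface Fetcher {
    String fetch() throws RetriableException;
  }

  // Retry a small, fixed number of times instead of looping forever.
  // Returning null tells the caller to create the replica itself, as
  // fetchOrCreate already does when there is nothing to fetch.
  static String fetchWithRetries(Fetcher f, int maxAttempts) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        return f.fetch();
      } catch (RetriableException e) {
        // fall through to the next attempt
      }
    }
    return null;
  }
}
```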
[jira] [Commented] (HDFS-13272) DataNodeHttpServer to have configurable HttpServer2 threads
[ https://issues.apache.org/jira/browse/HDFS-13272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416513#comment-16416513 ] Chris Douglas commented on HDFS-13272: -- bq. Rather than making it configurable, it might be better to simply reduce it. This makes sense to branch-3.x. For older branches, there still might be hftp users Reducing the default makes sense for both 3.x and branch-2. [~kihwal], to be clear, should it be configurable on branch-2 so it can be raised for hftp users, but dropped to the minimum for 3.x? > DataNodeHttpServer to have configurable HttpServer2 threads > --- > > Key: HDFS-13272 > URL: https://issues.apache.org/jira/browse/HDFS-13272 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > > In HDFS-7279, the Jetty server on the DataNode was hard-coded to use 10 > threads. In addition to the possibility of this being too few threads, it is > much higher than necessary in resource constrained environments such as > MiniDFSCluster. To avoid compatibility issues, rather than using > {{HttpServer2#HTTP_MAX_THREADS}} directly, we can introduce a new > configuration for the DataNode's thread pool size. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13336) Test cases of TestWriteToReplica failed in windows
[ https://issues.apache.org/jira/browse/HDFS-13336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416372#comment-16416372 ] Chris Douglas commented on HDFS-13336: -- bq. a generic fix could be made for org.apache.hadoop.hdfs.MiniDFSCluster#determineDfsBaseDir, if by default a randomized temp path is returned This sounds more sustainable to me, not just for Windows compatibility, but to avoid tests that assume a fixed, default base dir. > Test cases of TestWriteToReplica failed in windows > -- > > Key: HDFS-13336 > URL: https://issues.apache.org/jira/browse/HDFS-13336 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Xiao Liang >Assignee: Xiao Liang >Priority: Major > Labels: windows > Attachments: HDFS-13336.000.patch, HDFS-13336.001.patch, > HDFS-13336.002.patch > > > Test cases of TestWriteToReplica failed in windows with errors like: > h4. Error Details > Could not fully delete > F:\short\hadoop-trunk-win\s\hadoop-hdfs-project\hadoop-hdfs\target\test\data\1\dfs\name-0-1 > h4. 
> Stack Trace > java.io.IOException: Could not fully delete > F:\short\hadoop-trunk-win\s\hadoop-hdfs-project\hadoop-hdfs\target\test\data\1\dfs\name-0-1 > at > org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1011) > at > org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:932) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:864) > at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:497) at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:456) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestWriteToReplica.testAppend(TestWriteToReplica.java:89) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at > org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at > 
org.junit.runners.ParentRunner.run(ParentRunner.java:309) at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119) > at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-8979) Clean up checkstyle warnings in hadoop-hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414562#comment-16414562 ] Chris Douglas commented on HDFS-8979: - bq. The patch did not break HDFS-13330, if you understand the problem of that retry logic correctly It did not break the retry loop, but the fix to the checkstyle warning missed the actual problem. Checkstyle warnings aren't only cosmetic. Resolving HDFS-8979 and HDFS-13330, we would remove the signal from checkstyle where we had a real bug. Bulk fixes make it difficult for reviewers to find cases like HDFS-13330 that deserve closer attention. bq. Unless someone volunteers to review such huge patch again, I am inclined to revert this patch from all branches. We're more likely to introduce new bugs by reverting this. It's a large patch, covering a large swath of the codebase, committed more than two years ago. Most of the fixes are correct. If we're going to take a lesson from this, then we should probably avoid backporting bulk fixes from static analysis tools. While it can make subsequent backports smoother by keeping the code similar, the analysis only applies to the branch on which the tool is run e.g., if it can prove something dead in trunk, that doesn't mean it's dead in the backported code. > Clean up checkstyle warnings in hadoop-hdfs-client module > - > > Key: HDFS-8979 > URL: https://issues.apache.org/jira/browse/HDFS-8979 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu >Priority: Major > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: HDFS-8979.000.patch, HDFS-8979.001.patch, > HDFS-8979.002.patch > > > This jira tracks the effort of cleaning up checkstyle warnings in > {{hadoop-hdfs-client}} module. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13330) Clean up dead code
[ https://issues.apache.org/jira/browse/HDFS-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414493#comment-16414493 ] Chris Douglas commented on HDFS-13330: -- bq. Instead of fixing this jira, I am more inclined to revert HDFS-8979 which already broke 2 other functionalities. To be fair to HDFS-8979, it didn't break the retry functionality. The retry loop never worked. Static analysis tools are identifying dead code, and these JIRAs are removing it instead of asking why it's dead. In this case, the code assumed that in do/while loops, the loop expression is not evaluated after a {{continue}}, which is [not the case|https://docs.oracle.com/javase/specs/jls/se8/html/jls-14.html#jls-14.13.1]. Whether we fix the retry logic in this JIRA or we open a new one, I think we can agree that simply removing the dead code isn't making meaningful progress. > Clean up dead code > -- > > Key: HDFS-13330 > URL: https://issues.apache.org/jira/browse/HDFS-13330 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Wei-Chiu Chuang >Assignee: Gabor Bota >Priority: Minor > Attachments: HDFS-13330.001.patch > > > The follow do .. while(false) loop seems useless to me. > {code:java:title=ShortCircuitCache#fetchOrCreate} > ShortCircuitReplicaInfo info = null; > do { > if (closed) { > LOG.trace("{}: can't fethchOrCreate {} because the cache is closed.", > this, key); > return null; > } > Waitable waitable = replicaInfoMap.get(key); > if (waitable != null) { > try { > info = fetch(key, waitable); > } catch (RetriableException e) { > LOG.debug("{}: retrying {}", this, e.getMessage()); > } > } > } while (false);{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
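The JLS rule referenced above is easy to demonstrate: in a do/while loop, {{continue}} transfers control to the loop condition, so in a do-while(false) it exits instead of restarting the body. A toy example (ours, not from the patch):

```java
// Demonstrates that `continue` in do { ... } while (false) evaluates the
// loop condition (false) next, ending the loop rather than retrying.
public class DoWhileContinueDemo {
  static int attemptsMade() {
    int attempts = 0;
    do {
      attempts++;
      if (attempts < 3) {
        continue; // intended as "retry", but while (false) runs next and exits
      }
    } while (false);
    return attempts; // only a single pass ever executes
  }
}
```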
[jira] [Commented] (HDFS-13330) Clean up dead code
[ https://issues.apache.org/jira/browse/HDFS-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412165#comment-16412165 ] Chris Douglas commented on HDFS-13330: -- Which is to say: we should either remove the retry logic entirely or fix it. The dead code is a symptom of broken functionality. > Clean up dead code > -- > > Key: HDFS-13330 > URL: https://issues.apache.org/jira/browse/HDFS-13330 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Wei-Chiu Chuang >Assignee: Gabor Bota >Priority: Trivial > Labels: newbie > Attachments: HDFS-13330.001.patch > > > The follow do .. while(false) loop seems useless to me. > {code:java} > ShortCircuitReplicaInfo info = null; > do { > if (closed) { > LOG.trace("{}: can't fethchOrCreate {} because the cache is closed.", > this, key); > return null; > } > Waitable waitable = replicaInfoMap.get(key); > if (waitable != null) { > try { > info = fetch(key, waitable); > } catch (RetriableException e) { > LOG.debug("{}: retrying {}", this, e.getMessage()); > } > } > } while (false);{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13330) Clean up dead code
[ https://issues.apache.org/jira/browse/HDFS-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412164#comment-16412164 ] Chris Douglas commented on HDFS-13330: -- It looks like the original intent was a retry loop, to repeat operations that throw a {{RetriableException}}. The {{continue}} was removed in HDFS-8979, which makes sense, because it [doesn't work as intended|https://docs.oracle.com/javase/specs/jls/se8/html/jls-14.html#jls-14.13.1]. The intended logic appears to be something like: {code:java} while (true) { if (closed) { LOG.trace("{}: can't fethchOrCreate {} because the cache is closed.", this, key); return null; } Waitable waitable = replicaInfoMap.get(key); if (waitable == null) { break; } try { info = fetch(key, waitable); } catch (RetriableException e) { LOG.debug("{}: retrying {}", this, e.getMessage()); continue; } break; } {code} > Clean up dead code > -- > > Key: HDFS-13330 > URL: https://issues.apache.org/jira/browse/HDFS-13330 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Wei-Chiu Chuang >Assignee: Gabor Bota >Priority: Trivial > Labels: newbie > Attachments: HDFS-13330.001.patch > > > The follow do .. while(false) loop seems useless to me. > {code:java} > ShortCircuitReplicaInfo info = null; > do { > if (closed) { > LOG.trace("{}: can't fethchOrCreate {} because the cache is closed.", > this, key); > return null; > } > Waitable waitable = replicaInfoMap.get(key); > if (waitable != null) { > try { > info = fetch(key, waitable); > } catch (RetriableException e) { > LOG.debug("{}: retrying {}", this, e.getMessage()); > } > } > } while (false);{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13330) Clean up dead code
[ https://issues.apache.org/jira/browse/HDFS-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410304#comment-16410304 ] Chris Douglas commented on HDFS-13330: -- Haven't looked at the context, but is there a missing {{continue}} in the catch? > Clean up dead code > -- > > Key: HDFS-13330 > URL: https://issues.apache.org/jira/browse/HDFS-13330 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Wei-Chiu Chuang >Priority: Trivial > Labels: newbie > > The follow do .. while(false) loop seems useless to me. > {code:java} > ShortCircuitReplicaInfo info = null; > do { > if (closed) { > LOG.trace("{}: can't fethchOrCreate {} because the cache is closed.", > this, key); > return null; > } > Waitable waitable = replicaInfoMap.get(key); > if (waitable != null) { > try { > info = fetch(key, waitable); > } catch (RetriableException e) { > LOG.debug("{}: retrying {}", this, e.getMessage()); > } > } > } while (false);{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13265) MiniDFSCluster should set reasonable defaults to reduce resource consumption
[ https://issues.apache.org/jira/browse/HDFS-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1640#comment-1640 ] Chris Douglas commented on HDFS-13265: -- I'm +1 on the patch, FWIW. If HADOOP-15311 will be in branch-2, then we should try to apply these in a consistent order and backport it first. If not, then I'll commit these as they are. > MiniDFSCluster should set reasonable defaults to reduce resource consumption > > > Key: HDFS-13265 > URL: https://issues.apache.org/jira/browse/HDFS-13265 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, test >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-13265-branch-2.000.patch, > HDFS-13265-branch-2.000.patch, HDFS-13265.000.patch, > TestMiniDFSClusterThreads.java > > > MiniDFSCluster takes its defaults from {{DFSConfigKeys}} defaults, but many > of these are not suitable for a unit test environment. For example, the > default handler thread count of 10 is definitely more than necessary for > (almost?) any unit test. We should set reasonable, lower defaults unless a > test specifically requires more. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13296) GenericTestUtils#getTempPath and GenericTestUtils#getRandomizedTestDir generate paths with drive letter in Windows, and fail webhdfs related test cases
[ https://issues.apache.org/jira/browse/HDFS-13296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403121#comment-16403121 ] Chris Douglas commented on HDFS-13296: -- bq. Chris Douglas, Subru Krishnan, any idea on how we can trigger HDFS unit tests and not only Commons? Sorry, not that I know of. Finding unit tests that use this utility class and running those manually is probably a sufficient spot check. We can fix regressions if they come up. > GenericTestUtils#getTempPath and GenericTestUtils#getRandomizedTestDir > generate paths with drive letter in Windows, and fail webhdfs related test > cases > --- > > Key: HDFS-13296 > URL: https://issues.apache.org/jira/browse/HDFS-13296 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Xiao Liang >Assignee: Xiao Liang >Priority: Major > Labels: windows > Attachments: HDFS-13296.000.patch > > > In GenericTestUtils#getRandomizedTestDir, getAbsoluteFile is called and will > added drive letter to the path in windows, some test cases use the generated > path to send webhdfs request, which will fail due to the drive letter in the > URI like: "webhdfs://127.0.0.1:18334/D:/target/test/data/vUqZkOrBZa/test" > GenericTestUtils#getTempPath has the similar issue in Windows. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13265) MiniDFSCluster should set reasonable defaults to reduce resource consumption
[ https://issues.apache.org/jira/browse/HDFS-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401256#comment-16401256 ] Chris Douglas commented on HDFS-13265: -- Excellent, thanks [~xkrogen]. Skimming the commit log, does HADOOP-13597 mean we should not backport HADOOP-15311 to branch-2 before committing this? > MiniDFSCluster should set reasonable defaults to reduce resource consumption > > > Key: HDFS-13265 > URL: https://issues.apache.org/jira/browse/HDFS-13265 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, test >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-13265-branch-2.000.patch, > HDFS-13265-branch-2.000.patch, HDFS-13265.000.patch, > TestMiniDFSClusterThreads.java > > > MiniDFSCluster takes its defaults from {{DFSConfigKeys}} defaults, but many > of these are not suitable for a unit test environment. For example, the > default handler thread count of 10 is definitely more than necessary for > (almost?) any unit test. We should set reasonable, lower defaults unless a > test specifically requires more.
[jira] [Commented] (HDFS-11600) Refactor TestDFSStripedOutputStreamWithFailure test classes
[ https://issues.apache.org/jira/browse/HDFS-11600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401207#comment-16401207 ] Chris Douglas commented on HDFS-11600: -- Thanks, [~Sammi]! > Refactor TestDFSStripedOutputStreamWithFailure test classes > --- > > Key: HDFS-11600 > URL: https://issues.apache.org/jira/browse/HDFS-11600 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 3.0.0-alpha2 >Reporter: Andrew Wang >Assignee: SammiChen >Priority: Minor > Fix For: 3.1.0, 3.0.2 > > Attachments: HDFS-11600-1.patch, HDFS-11600.002.patch, > HDFS-11600.003.patch, HDFS-11600.004.patch, HDFS-11600.005.patch, > HDFS-11600.006.patch, HDFS-11600.007.patch > > > TestDFSStripedOutputStreamWithFailure has a great number of subclasses. The > tests are parameterized based on the name of these subclasses. > Seems like we could parameterize these tests with JUnit and then not need all > these separate test classes. > Another note, the tests will randomly return instead of running the test. > Using {{Assume}} instead would make it more clear in the test output that > these tests were skipped. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13232) RBF: ConnectionPool should return first usable connection
[ https://issues.apache.org/jira/browse/HDFS-13232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401200#comment-16401200 ] Chris Douglas commented on HDFS-13232: -- bq. I committed HDFS-13230 but I messed up the message and committed it as HDFS-13232, any action here? Sorry for the delay. Other than reverting and recommitting, no. We'd need to file a ticket with INFRA to unlock the branch and rewrite history. Probably not worth it. > RBF: ConnectionPool should return first usable connection > - > > Key: HDFS-13232 > URL: https://issues.apache.org/jira/browse/HDFS-13232 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Wei Yan >Assignee: Ekanth S >Priority: Minor > Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.2, 3.2.0 > > Attachments: HDFS-13232.001.patch, HDFS-13232.002.patch, > HDFS-13232.003.patch > > > In current ConnectionPool.getConnection(), it will return the first active > connection: > {code:java} > for (int i=0; i<size; i++) { > int index = (threadIndex + i) % size; > conn = tmpConnections.get(index); > if (conn != null && !conn.isUsable()) { > return conn; > } > } > {code} > Here "!conn.isUsable()" should be "conn.isUsable()".
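The one-character fix described in the issue above can be sketched outside HDFS (illustrative types only, not the actual RBF {{ConnectionPool}} class): with the corrected predicate, the round-robin scan returns the first usable connection starting at {{threadIndex}}.

```java
import java.util.Arrays;
import java.util.List;

// Minimal sketch of the round-robin scan from HDFS-13232, with the fixed
// predicate conn.isUsable() (the reported bug was !conn.isUsable()).
// The Conn interface stands in for the real connection type.
public class PoolScan {
    interface Conn { boolean isUsable(); }

    static Conn firstUsable(List<Conn> tmpConnections, int threadIndex) {
        int size = tmpConnections.size();
        for (int i = 0; i < size; i++) {
            int index = (threadIndex + i) % size;
            Conn conn = tmpConnections.get(index);
            if (conn != null && conn.isUsable()) { // fixed predicate
                return conn;
            }
        }
        return null; // the real pool falls back to creating a new connection
    }

    public static void main(String[] args) {
        Conn busy = () -> false;
        Conn free = () -> true;
        List<Conn> pool = Arrays.asList(busy, free, busy);
        System.out.println(firstUsable(pool, 0) == free); // prints true
    }
}
```

With the original inverted predicate, the scan would have returned the first *unusable* connection instead.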
[jira] [Commented] (HDFS-12422) Replace DataNode in Pipeline when waiting for Last Packet fails
[ https://issues.apache.org/jira/browse/HDFS-12422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401170#comment-16401170 ] Chris Douglas commented on HDFS-12422: -- bq. do you know anybody fit for reviewing this? [~shv], if he has cycles. > Replace DataNode in Pipeline when waiting for Last Packet fails > --- > > Key: HDFS-12422 > URL: https://issues.apache.org/jira/browse/HDFS-12422 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, hdfs-client >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Major > Labels: hdfs > Attachments: HDFS-12422.001.patch, HDFS-12422.002.patch > > > # Create a file with replicationFactor = 4, minReplicas = 2 > # Fail waiting for the last packet, followed by 2 exceptions when recovering > the leftover pipeline > # The leftover pipeline will only have one DN and NN will never close such > block, resulting in failure to write > The block will stay there forever, unable to be replicated, ultimately going > missing. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12780) Fix spelling mistake in DistCpUtils.java
[ https://issues.apache.org/jira/browse/HDFS-12780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16397468#comment-16397468 ] Chris Douglas commented on HDFS-12780: -- Failure is unrelated. {noformat} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-deploy-plugin:2.8.1:deploy (default-deploy) on project hadoop-cloud-storage-project: Failed to deploy artifacts: Could not transfer artifact org.apache.hadoop:hadoop-yarn-server-timeline-pluginstorage:jar:3.2.0-20180313.183440-186 from/to apache.snapshots.https (https://repository.apache.org/content/repositories/snapshots): repository.apache.org: No address associated with hostname -> [Help 1] {noformat} > Fix spelling mistake in DistCpUtils.java > > > Key: HDFS-12780 > URL: https://issues.apache.org/jira/browse/HDFS-12780 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-beta1 >Reporter: Jianfei Jiang >Assignee: Jianfei Jiang >Priority: Major > Labels: patch > Fix For: 3.2.0 > > Attachments: HDFS-12780.patch > > > We found a spelling mistake in DistCpUtils.java. "* If checksums's can't be > retrieved," should be " * If checksums can't be retrieved," -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12780) Fix spelling mistake in DistCpUtils.java
[ https://issues.apache.org/jira/browse/HDFS-12780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-12780: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.2.0 Status: Resolved (was: Patch Available) +1 I committed this. Thanks, [~jiangjianfei] > Fix spelling mistake in DistCpUtils.java > > > Key: HDFS-12780 > URL: https://issues.apache.org/jira/browse/HDFS-12780 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-beta1 >Reporter: Jianfei Jiang >Assignee: Jianfei Jiang >Priority: Major > Labels: patch > Fix For: 3.2.0 > > Attachments: HDFS-12780.patch > > > We found a spelling mistake in DistCpUtils.java. "* If checksums's can't be > retrieved," should be " * If checksums can't be retrieved," -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13273) Fix compilation issue in trunk
[ https://issues.apache.org/jira/browse/HDFS-13273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16397356#comment-16397356 ] Chris Douglas commented on HDFS-13273: -- Er... sorry, mistook this for HDFS-12677 > Fix compilation issue in trunk > -- > > Key: HDFS-13273 > URL: https://issues.apache.org/jira/browse/HDFS-13273 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Attachments: HDFS-13273.00.patch > > > [ERROR] > /home/jenkins/jenkins-slave/workspace/Hadoop-trunk-Commit/source/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileStatusWithECPolicy.java:[40,8] > class TestFileStatusWithDefaultECPolicy is public, should be declared in a > file named TestFileStatusWithDefaultECPolicy.java -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13273) Fix compilation issue in trunk
[ https://issues.apache.org/jira/browse/HDFS-13273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16397336#comment-16397336 ] Chris Douglas commented on HDFS-13273: -- +1 Looks like my mistake applying the patch. I ran the tests prior to commit... > Fix compilation issue in trunk > -- > > Key: HDFS-13273 > URL: https://issues.apache.org/jira/browse/HDFS-13273 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Attachments: HDFS-13273.00.patch > > > [ERROR] > /home/jenkins/jenkins-slave/workspace/Hadoop-trunk-Commit/source/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileStatusWithECPolicy.java:[40,8] > class TestFileStatusWithDefaultECPolicy is public, should be declared in a > file named TestFileStatusWithDefaultECPolicy.java -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13271) WebHDFS: Add constructor in SnapshottableDirectoryStatus with HdfsFileStatus as argument
[ https://issues.apache.org/jira/browse/HDFS-13271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-13271: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.1.0 Status: Resolved (was: Patch Available) +1 I committed this. Thanks, [~ljain] > WebHDFS: Add constructor in SnapshottableDirectoryStatus with HdfsFileStatus > as argument > > > Key: HDFS-13271 > URL: https://issues.apache.org/jira/browse/HDFS-13271 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Fix For: 3.1.0 > > Attachments: HDFS-13271.001.patch, HDFS-13271.002.patch > > > This jira aims to add a constructor in SnapshottableDirectoryStatus which > takes HdfsFileStatus as a argument. This constructor will be used in > JsonUtilClient#toSnapshottableDirectoryStatus for creating a > SnapshottableDirectoryStatus object. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13141) WebHDFS: Add support for getting snapshottable directory list
[ https://issues.apache.org/jira/browse/HDFS-13141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16396507#comment-16396507 ] Chris Douglas commented on HDFS-13141: -- bq. After check HdfsFileStatus, it seems we don't have a getter for flags for HdfsFileStatus. Is this something we miss from HDFS-12681? It was deliberate. The flags support existing methods on {{HdfsFileStatus}} and {{FileStatus}}. Conversion between the enums directly through the type implies a tighter binding than I hoped we'd maintain going forward. While {{FileStatus}} should expose only metadata common to most implementations, {{HdfsFileStatus}} can specialize on HDFS. Instead of extracting fields from one {{HdfsFileStatus}} to construct another, could {{SnapshottableDirectoryStatus}} just take the original struct, to avoid the {{convert}} method? > WebHDFS: Add support for getting snapshottable directory list > - > > Key: HDFS-13141 > URL: https://issues.apache.org/jira/browse/HDFS-13141 > Project: Hadoop HDFS > Issue Type: Task > Components: webhdfs >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Fix For: 3.1.0, 3.2.0 > > Attachments: HDFS-13141.001.patch, HDFS-13141.002.patch, > HDFS-13141.003.patch > > > This Jira aims to implement the get snapshottable directory list operation > for the webHdfs filesystem.
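The suggestion in the comment above — passing the original status object instead of copying its fields through a {{convert}} method — can be sketched with stand-in types (names are illustrative, not the actual HDFS classes):

```java
// Stand-in types illustrating composition over field-by-field conversion:
// the snapshottable-directory status keeps a reference to the original
// file status rather than rebuilding one from extracted fields.
public class SnapshotStatusSketch {
    static class FileStatusLike {
        final String path;
        final long length;
        FileStatusLike(String path, long length) {
            this.path = path;
            this.length = length;
        }
    }

    static class SnapshottableDirStatusLike {
        final FileStatusLike dirStatus; // wraps the original, no convert()
        final int snapshotQuota;
        SnapshottableDirStatusLike(FileStatusLike dirStatus, int snapshotQuota) {
            this.dirStatus = dirStatus;
            this.snapshotQuota = snapshotQuota;
        }
    }

    public static void main(String[] args) {
        FileStatusLike fs = new FileStatusLike("/snapdir", 0L);
        SnapshottableDirStatusLike s = new SnapshottableDirStatusLike(fs, 65536);
        System.out.println(s.dirStatus == fs); // same object: no copy, no drift
    }
}
```

Wrapping avoids a constructor that must be updated every time a field is added to the inner status type.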
[jira] [Updated] (HDFS-12677) Extend TestReconstructStripedFile with a random EC policy
[ https://issues.apache.org/jira/browse/HDFS-12677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-12677: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.1.0 Status: Resolved (was: Patch Available) I committed this. Thanks [~tasanuma0829] for the patch and [~ekanth] for the revision. [~ekanth] I apologize, the commit message should have cited both of you. > Extend TestReconstructStripedFile with a random EC policy > - > > Key: HDFS-12677 > URL: https://issues.apache.org/jira/browse/HDFS-12677 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding, test >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma >Priority: Major > Fix For: 3.1.0 > > Attachments: HDFS-12677.002.patch, HDFS-12677.1.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10388) DistributedFileSystem#getStatus returns a misleading FsStatus
[ https://issues.apache.org/jira/browse/HDFS-10388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-10388: - Resolution: Invalid Status: Resolved (was: Patch Available) This appears to be working as designed, since HDFS doesn't support partitions. I'd expect this is also correct in RBF scenarios. [~elgoiri], do you want to check? [~linyiqun], please reopen if this remains an issue > DistributedFileSystem#getStatus returns a misleading FsStatus > - > > Key: HDFS-10388 > URL: https://issues.apache.org/jira/browse/HDFS-10388 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs >Affects Versions: 2.7.1 >Reporter: Yiqun Lin >Assignee: Yiqun Lin >Priority: Major > Attachments: HDFS-10388.001.patch > > > The method {{DistributedFileSystem#getStatus}} returns the dfs's disk status. > {code} >public FsStatus getStatus(Path p) throws IOException { > statistics.incrementReadOps(1); > return dfs.getDiskStatus(); >} > {code} > So the param path has no meaning here. And the returned object will mislead > users of this method. I looked into the code; only when the file > system has multiple partitions will the usage and capacity of the partition > pointed to by the specified path be reflected. For example, in the subclass > {{RawLocalFileSystem}}, it will be returned correctly. > We should return a new meaningless FsStatus here (like new FsStatus(0, 0, 0)) > and indicate that the invoked method isn't available in > {{DistributedFileSystem}}.
[jira] [Updated] (HDFS-10923) Make InstrumentedLock require ReentrantLock
[ https://issues.apache.org/jira/browse/HDFS-10923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-10923: - Status: Open (was: Patch Available) This no longer applies. [~arpitagarwal], could you regenerate the patch? > Make InstrumentedLock require ReentrantLock > --- > > Key: HDFS-10923 > URL: https://issues.apache.org/jira/browse/HDFS-10923 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal >Priority: Major > Attachments: HDFS-10923.01.patch, HDFS-10923.02.patch, > HDFS-10923.03.patch > > > Make InstrumentedLock use ReentrantLock instead of Lock, so nested > acquire/release calls can be instrumented correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11168) Bump Netty 4 version
[ https://issues.apache.org/jira/browse/HDFS-11168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-11168: - Resolution: Duplicate Status: Resolved (was: Patch Available) Resolving as duplicate, but the lineage is confusing. HDFS-11498 dropped the version from 4.1.0.Beta5 to 4.0.23.Final, then it was bumped to 4.0.52.Final in HADOOP-14910. If we revisit the version of netty we're shipping, it should probably be in a new JIRA > Bump Netty 4 version > > > Key: HDFS-11168 > URL: https://issues.apache.org/jira/browse/HDFS-11168 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze >Priority: Major > Attachments: h11168_20161122.patch, h11168_20161123.patch, > h11168_20161123b.patch > > > The current Netty 4 version is 4.1.0.Beta5. We should bump it to a non-beta > version.
[jira] [Resolved] (HDFS-13216) HDFS
[ https://issues.apache.org/jira/browse/HDFS-13216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas resolved HDFS-13216. -- Resolution: Invalid > HDFS > > > Key: HDFS-13216 > URL: https://issues.apache.org/jira/browse/HDFS-13216 > Project: Hadoop HDFS > Issue Type: Task >Reporter: mster >Priority: Trivial > Labels: INodesInPath > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Reopened] (HDFS-13216) HDFS
[ https://issues.apache.org/jira/browse/HDFS-13216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas reopened HDFS-13216: -- > HDFS > > > Key: HDFS-13216 > URL: https://issues.apache.org/jira/browse/HDFS-13216 > Project: Hadoop HDFS > Issue Type: Task >Reporter: mster >Priority: Trivial > Labels: INodesInPath > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13215) RBF: Move Router to its own module
[ https://issues.apache.org/jira/browse/HDFS-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16382888#comment-16382888 ] Chris Douglas commented on HDFS-13215: -- Breaking server-side code into multiple modules is less useful. IIRC, RBF is mostly in its own package, and since we upgraded surefire it's easier to run subsets of tests using its regex syntax. But sure, it's mostly stand-alone, and might as well live in its own package. If tools like the rebalancer take a dependency on this package to customize their behavior in federated environments, those may have to move into a separate package to avoid the circular dependency. > RBF: Move Router to its own module > -- > > Key: HDFS-13215 > URL: https://issues.apache.org/jira/browse/HDFS-13215 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Priority: Major > > We are splitting the HDFS client code base and potentially Router-based > Federation is also independent enough to be in its own package. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13119) RBF: Manage unavailable clusters
[ https://issues.apache.org/jira/browse/HDFS-13119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370517#comment-16370517 ] Chris Douglas commented on HDFS-13119: -- bq. As this is technically a bug, I'd like to push it for 2.9.1 and 3.0.1 (or 3.0.2). There's a vote for 3.0.1 in progress, but you can contact the release manager ([~eddyxu]) in case he rolls another RC. bq. Any idea what's the current state with the branches? My guess is branch-2.9 and branch-3.0. AFAIK:
trunk -> 3.2
3.1.0 -> branch-3.1
3.0.2 -> branch-3.0
3.0.1 -> branch-3.0.1
2.10 -> branch-2
2.9.1 -> branch-2.9
> RBF: Manage unavailable clusters > > > Key: HDFS-13119 > URL: https://issues.apache.org/jira/browse/HDFS-13119 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Íñigo Goiri >Assignee: Yiqun Lin >Priority: Major > Labels: RBF > Fix For: 3.1.0, 2.10.0, 3.2.0 > > Attachments: HDFS-13119.001.patch, HDFS-13119.002.patch, > HDFS-13119.003.patch, HDFS-13119.004.patch, HDFS-13119.005.patch, > HDFS-13119.006.patch > > > When a federated cluster has one of the subcluster down, operations that run > in every subcluster ({{RouterRpcClient#invokeAll()}}) may take all the RPC > connections.
[jira] [Commented] (HDFS-13123) RBF: Add a balancer tool to move data across subclusters
[ https://issues.apache.org/jira/browse/HDFS-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359032#comment-16359032 ] Chris Douglas commented on HDFS-13123: -- bq. hard linking across block pools as one option and even tiered storage Yes, [~virajith] and I wrote a prototype of this with an intern for a similar project. Making it a proper transaction is complicated, but architecturally RBF is in the right place to coordinate this cleanly. We added APIs to generate and attach an FSImage for a NN subtree. Attaching an image required reallocating not only the inodeIds but also the blockIds, which were hardlinked into a contiguous range in the destination blockId space. We didn't solve all the failover and edit log cases, but these seem tractable as long as the subtree is immutable. Without that assumption (which RBF can enforce/detect) thar be dragons. > RBF: Add a balancer tool to move data across subclusters > --- > > Key: HDFS-13123 > URL: https://issues.apache.org/jira/browse/HDFS-13123 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Wei Yan >Assignee: Wei Yan >Priority: Major > Attachments: HDFS Router-Based Federation Rebalancer.pdf > > > Follow the discussion in HDFS-12615. This Jira is to track effort for > building a rebalancer tool, used by router-based federation to move data > among subclusters.
[jira] [Commented] (HDFS-13099) RBF: Refactor RBF relevant settings
[ https://issues.apache.org/jira/browse/HDFS-13099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351201#comment-16351201 ] Chris Douglas commented on HDFS-13099: -- It's not my favorite pattern. That said, 1) it's a project convention, 2) it's a consistent way to handle deprecation, and 3) unit tests aid consistency with {{-default.xml}} config files. If this is purely for documentation then it could assign the fields in {{DFSConfigKeys}}/{{DFSRouterKeys}} (or whatever) using the fields scoped to each class. That said, IIRC javadoc reorders the fields, so its virtues as documentation are pretty limited unless your field names follow a schema. I'm ambivalent. The documentation for RBF and the impl docs are probably the better references, so I'd be tempted to leave it as-is. > RBF: Refactor RBF relevant settings > --- > > Key: HDFS-13099 > URL: https://issues.apache.org/jira/browse/HDFS-13099 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: documentation >Affects Versions: 3.0.0 >Reporter: Yiqun Lin >Assignee: Yiqun Lin >Priority: Minor > > Currently the State Store Driver relevant settings are written only in its > implementing classes. > {noformat} > public class StateStoreZooKeeperImpl extends StateStoreSerializableImpl { > ... > /** Configuration keys. */ > public static final String FEDERATION_STORE_ZK_DRIVER_PREFIX = > DFSConfigKeys.FEDERATION_STORE_PREFIX + "driver.zk."; > public static final String FEDERATION_STORE_ZK_PARENT_PATH = > FEDERATION_STORE_ZK_DRIVER_PREFIX + "parent-path"; > public static final String FEDERATION_STORE_ZK_PARENT_PATH_DEFAULT = > "/hdfs-federation"; > .. > {noformat} > Actually, they should be moved into class {{DFSConfigKeys}} and documented in > file {{hdfs-default.xml}}. This will help more users know these settings and > know how to use them.
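One way to get both properties discussed in the comment above — keys declared next to the driver that uses them, yet still discoverable from the central class — is to have the central keys class alias the impl-scoped constants. A minimal sketch; the class names mirror the quoted snippet, but this is not actual HDFS code, and the prefix value is an assumption:

```java
// Sketch: the impl-scoped class owns the key definitions; the central keys
// class aliases them, so there is a single source of truth but the key is
// still visible in one documented place. The prefix value is an assumption.
public class ConfigKeySketch {
    static final class StateStoreZooKeeperKeys {
        static final String FEDERATION_STORE_PREFIX =
            "dfs.federation.router.store.";            // assumed value
        static final String FEDERATION_STORE_ZK_DRIVER_PREFIX =
            FEDERATION_STORE_PREFIX + "driver.zk.";
        static final String FEDERATION_STORE_ZK_PARENT_PATH =
            FEDERATION_STORE_ZK_DRIVER_PREFIX + "parent-path";
        static final String FEDERATION_STORE_ZK_PARENT_PATH_DEFAULT =
            "/hdfs-federation";
    }

    static final class CentralKeys {
        // Alias, not a copy: renaming the key in one place updates both.
        static final String FEDERATION_STORE_ZK_PARENT_PATH =
            StateStoreZooKeeperKeys.FEDERATION_STORE_ZK_PARENT_PATH;
    }

    public static void main(String[] args) {
        System.out.println(CentralKeys.FEDERATION_STORE_ZK_PARENT_PATH);
    }
}
```

Since the constants are compile-time strings, the alias costs nothing at runtime and keeps deprecation handling in one spot.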
[jira] [Commented] (HDFS-12990) Change default NameNode RPC port back to 8020
[ https://issues.apache.org/jira/browse/HDFS-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329625#comment-16329625 ] Chris Douglas commented on HDFS-12990: -- bq. I believe a vote would make people think about this problem and respond according to their preferences That would involve creating a 3.0.1 release with this change and voting on it. > Change default NameNode RPC port back to 8020 > - > > Key: HDFS-12990 > URL: https://issues.apache.org/jira/browse/HDFS-12990 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 3.0.0 >Reporter: Xiao Chen >Assignee: Xiao Chen >Priority: Critical > Attachments: HDFS-12990.01.patch > > > In HDFS-9427 (HDFS should not default to ephemeral ports), we changed all > default ports to ephemeral ports, which is very appreciated by admin. As part > of that change, we also modified the NN RPC port from the famous 8020 to > 9820, to be closer to other ports changed there. > With more integration going on, it appears that all the other ephemeral port > changes are fine, but the NN RPC port change is painful for downstream on > migrating to Hadoop 3. Some examples include: > # Hive table locations pointing to hdfs://nn:port/dir > # Downstream minicluster unit tests that assumed 8020 > # Oozie workflows / downstream scripts that used 8020 > This isn't a problem for HA URLs, since that does not include the port > number. But considering the downstream impact, instead of requiring all of > them change their stuff, it would be a way better experience to leave the NN > port unchanged. This will benefit Hadoop 3 adoption and ease unnecessary > upgrade burdens. > It is of course incompatible, but giving 3.0.0 is just out, IMO it worths to > switch the port back. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12990) Change default NameNode RPC port back to 8020
[ https://issues.apache.org/jira/browse/HDFS-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328025#comment-16328025 ] Chris Douglas commented on HDFS-12990: -- [~eyang], that paragraph begs the question. Paraphrasing: "we must retain the change, because we can only work around the consequences." In every example you cite, the project and users got some benefit. Reverting this change _is_ an admissible solution to fix the bugs it caused. Our releases solve problems for our users. We're not here to curate a version control system and a set of compatibility rules. > Change default NameNode RPC port back to 8020 > - > > Key: HDFS-12990 > URL: https://issues.apache.org/jira/browse/HDFS-12990 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 3.0.0 >Reporter: Xiao Chen >Assignee: Xiao Chen >Priority: Critical > Attachments: HDFS-12990.01.patch > > > In HDFS-9427 (HDFS should not default to ephemeral ports), we changed all > default ports to ephemeral ports, which is very appreciated by admin. As part > of that change, we also modified the NN RPC port from the famous 8020 to > 9820, to be closer to other ports changed there. > With more integration going on, it appears that all the other ephemeral port > changes are fine, but the NN RPC port change is painful for downstream on > migrating to Hadoop 3. Some examples include: > # Hive table locations pointing to hdfs://nn:port/dir > # Downstream minicluster unit tests that assumed 8020 > # Oozie workflows / downstream scripts that used 8020 > This isn't a problem for HA URLs, since that does not include the port > number. But considering the downstream impact, instead of requiring all of > them change their stuff, it would be a way better experience to leave the NN > port unchanged. This will benefit Hadoop 3 adoption and ease unnecessary > upgrade burdens. > It is of course incompatible, but giving 3.0.0 is just out, IMO it worths to > switch the port back. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12990) Change default NameNode RPC port back to 8020
[ https://issues.apache.org/jira/browse/HDFS-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325458#comment-16325458 ] Chris Douglas commented on HDFS-12990: -- We wrote the compatibility guidelines _to avoid breaking users_. If there was any benefit to this change, then we could discuss tradeoffs, but there are none. By retaining this change, we choose to add a bunch of tedious, rote repairs to sites trying to upgrade from 2.x. It's not challenging to work around this, but it's annoying and avoidable. bq. Maybe we should look at NN RPC port as a new feature to ensure the down stream projects really tested with Hadoop 3 instead of gambling on compatibility. That's a creative way to look at it, but changing the NN port doesn't achieve that in any meaningful sense. Even if this does require changes, they are superficial. More to the point, we have *no* interest in second-guessing other projects' testing practices, or challenging whether they "really certified with Hadoop 3". That's not merely outside our charter, it is antithetical to it. bq. This is greatest challenge to Hadoop policy. bq. I know that forever compatibility is not sustainable and it only cost more in the long run. We should move on and do something better for Hadoop. Are we looking at the same JIRA? This is not like moving to YARN, which required a lot of work to shake out its incompatible changes, but we won a better architecture by it. This upends a 10+ year convention, the project and its users gain _nothing_ by it, and it is trivial to fix. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12990) Change default NameNode RPC port back to 8020
[ https://issues.apache.org/jira/browse/HDFS-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324589#comment-16324589 ] Chris Douglas commented on HDFS-12990: -- bq. Can we retract Hadoop 3.0.0 release and restore NN port to 8020 that we made a bad release? Can this be an alternate approach to produce Hadoop 3.0.1? No, that's not available. bq. why not fixing it in 4.0.0 and declare 3.x as dead? bq. Are there bugs in our current release procedure? bq. This is a good lesson to avoid time driven release for major version change. Nah. Even if 3.x were "time driven", [~vinodkv] [predicted|https://s.apache.org/KnCL] that incompatible changes would require incompatible fixes, and of course it happened. It _always_ happens, because we don't control when others try out the release. We don't know how people are using and extending Hadoop, so we curate compatibility guidelines and specifications to signal what's intended so applications avoid relying on incidental behavior. But our guidelines are no more a contract than they are a suicide pact. bq. Are we going to have other incompatible changes in the future minor and dot releases? What is the criteria to decide which incompatible changes are allowed? I'm sure there are more of these, and we'll remember this issue fondly for having such a straightforward fix. When an incompatible change is a mistake, it's a sunk cost. From there, we find the least harmful way to address it and move on. We created compatibility guidelines to prioritize users' needs over development expedience. Reverting the NN port change is consistent with that intent. We failed to follow our own policy when this change was committed, and absent a technical justification for this incompatible change, it is a bug. The current patch restores compatibility. This needs to converge. Absent a technical justification for the original change or the veto, and without any alternative approach, the current patch is the best remedy we have. 
+1 -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12990) Change default NameNode RPC port back to 8020
[ https://issues.apache.org/jira/browse/HDFS-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322838#comment-16322838 ] Chris Douglas commented on HDFS-12990: -- bq. No incompatible changes are allowed between 3.0.0 and 3.0.1. Dot releases only allow bug fixes. Won't you agree? Not to get too bogged down in semantics, but changing the NN port from 8020 to 9820 breaks downstream users for no benefit. We require some technical justification for breaking compatibility. If the new port were part of the contiguous range, or if the original port was in the ephemeral range, then arguments in favor of retaining this change would have technical merit. As this change lacks any justification, the original change was not a valid, incompatible change; it *is* a bug. Semantics aside, as [~daryn] pointed out on the dev list, this broke rolling upgrades. It broke applications. It broke deployments. We don't have to burn a major version to fix it, so let's just... fix it. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12802) RBF: Control MountTableResolver cache size
[ https://issues.apache.org/jira/browse/HDFS-12802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319336#comment-16319336 ] Chris Douglas commented on HDFS-12802: -- +1 thanks [~elgoiri] > RBF: Control MountTableResolver cache size > -- > > Key: HDFS-12802 > URL: https://issues.apache.org/jira/browse/HDFS-12802 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri > Attachments: HDFS-12802.000.patch, HDFS-12802.001.patch, > HDFS-12802.002.patch, HDFS-12802.003.patch, HDFS-12802.004.patch, > HDFS-12802.005.patch > > > Currently, the {{MountTableResolver}} caches the resolutions for the > {{PathLocation}}. However, this cache can grow with no limits if there are a > lot of unique paths. Some of these cached resolutions might not be used at > all. > The {{MountTableResolver}} should clean the {{locationCache}} periodically. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12802) RBF: Control MountTableResolver cache size
[ https://issues.apache.org/jira/browse/HDFS-12802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16317200#comment-16317200 ] Chris Douglas commented on HDFS-12802: -- bq. We have a read/write lock for syncing the tree and the locationCache so we are not taking much advantage of it. bq. The issue is actually the amount of proxied operations for unique files. It sounds like the performance isn't likely to be critical and concurrency is limited, so let's use whatever's simplest to implement and maintain. bq. Another thing, is that we may not even need the cache time limit, the size covers the use case. +1 this could use soft references in a {{ReferenceMap}}, which will purge entries in proportion to memory pressure. This could be simpler than adding a config knob for the size of the table. On the other hand, limiting the size of the table will also bound the cost of invalidations, and shrinking the heap to effect this (or resorting to esoteric gc knobs) is clearly worse. Again: probably not going to come up in practice, so whatever's simplest to maintain. bq. I don't think I can use Collection::removeIf on top of the guava cache; I would go for iterating over and remove the keys with the prefix I think it'd work, since modifications to the map returned by {{Cache::asMap}} directly affect the cache [[1|https://google.github.io/guava/releases/16.0/api/docs/com/google/common/cache/Cache.html#asMap()]], but it's just syntactic sugar. Anything that completes in one pass is fine. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
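For illustration, the one-pass prefix invalidation discussed above could look like the sketch below. A plain {{ConcurrentHashMap}} stands in for the view returned by Guava's {{Cache::asMap}} (whose key set also supports {{removeIf}}); the class and field names are invented for the example, not taken from the router code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: invalidate all cached path resolutions under a mount point in one
// pass. Guava's Cache exposes the same idea through Cache::asMap, whose key
// set is a live view, so removeIf on it evicts the matching cache entries.
public class PrefixInvalidation {
  static final Map<String, String> locationCache = new ConcurrentHashMap<>();

  // Remove every entry whose path starts with the given prefix. A real
  // implementation would also check the component boundary, so that
  // invalidating "/a" does not sweep away "/ab/...".
  static void invalidate(String prefix) {
    locationCache.keySet().removeIf(path -> path.startsWith(prefix));
  }

  public static void main(String[] args) {
    locationCache.put("/a/file1", "ns1");
    locationCache.put("/a/file2", "ns1");
    locationCache.put("/b/file3", "ns2");
    invalidate("/a");
    System.out.println(locationCache.size()); // prints 1: only /b/file3 remains
  }
}
```

Since the key-set view writes through to the cache, this completes in a single scan, which matches the "anything that completes in one pass" criterion above.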
[jira] [Commented] (HDFS-12992) Ozone: Add support for Larger key sizes in Corona
[ https://issues.apache.org/jira/browse/HDFS-12992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16317116#comment-16317116 ] Chris Douglas commented on HDFS-12992: -- Corona is a benchmark for Ozone? Could we pick a different name for this? It collides with an [existing project|https://www.facebook.com/notes/facebook-engineering/under-the-hood-scheduling-mapreduce-jobs-more-efficiently-with-corona/10151142560538920/]. > Ozone: Add support for Larger key sizes in Corona > - > > Key: HDFS-12992 > URL: https://issues.apache.org/jira/browse/HDFS-12992 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh > Fix For: HDFS-7240 > > > Corona provides an option to specify the key size for generating the data for > the key. This data is allocated in a single buffer. Allocating larger key > sizes can cause OOM as there is not sufficient memory to allocate the buffer. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12802) RBF: Control MountTableResolver cache size
[ https://issues.apache.org/jira/browse/HDFS-12802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313902#comment-16313902 ] Chris Douglas commented on HDFS-12802: -- How frequently are entries invalidated and how large is the map? If the cleanup is only necessary after running for several weeks but invalidation (including successors) is relatively common, then the Guava implementation may not match the workload. Apache Commons has a {{ReferenceMap}} implementation that can use soft references; since router memory utilization grows slowly, that would fit this case. Unfortunately, it's not sorted (IIRC) so invalidation would still be expensive. If this does use the Guava {{Cache}}, sorting the keyset to extract the submap is more expensive than removing entries by a full scan. It could use {{Collection::removeIf}} to match a prefix of the keyset. We've been burned by Guava changing incompatibly in the past, but the {{Cache}} library has been relatively stable(?). The Guava impl is also threadsafe, so concurrent invalidations would not form a convoy. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
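A minimal sketch of the soft-reference alternative raised in this thread: values held through {{SoftReference}} may be reclaimed by the GC in proportion to memory pressure, so no size knob is needed. Apache Commons' {{ReferenceMap}} packages the same idea; this version uses only the JDK, and its names are illustrative rather than taken from the router code.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a cache whose values are softly reachable: the JVM clears
// SoftReferences before throwing OutOfMemoryError, so the cache shrinks
// under memory pressure instead of honoring a fixed capacity.
public class SoftCache<K, V> {
  private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();

  public void put(K key, V value) {
    map.put(key, new SoftReference<>(value));
  }

  // Returns null if the entry was never cached or was reclaimed by the GC.
  public V get(K key) {
    SoftReference<V> ref = map.get(key);
    if (ref == null) {
      return null;
    }
    V value = ref.get();
    if (value == null) {
      map.remove(key); // value was collected; drop the stale reference
    }
    return value;
  }

  public static void main(String[] args) {
    SoftCache<String, String> cache = new SoftCache<>();
    cache.put("/a/file", "ns1");
    System.out.println(cache.get("/a/file")); // prints ns1 (still reachable)
  }
}
```

As noted above, the trade-off is that eviction timing is now a GC policy decision, so bounding invalidation cost by table size is lost.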
[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285
[ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308948#comment-16308948 ] Chris Douglas commented on HDFS-12911: -- A few questions about the Context API in [^HDFS-12911-HDFS-10285-02.patch]. In general, a few methods might not be cleanly implementable outside the NN: * {{readLockNS}}/{{readUnlockNS}} are sensitive APIs to expose to an external service. If the SPS dies, stalls, or loses connectivity to the NN without releasing the lock: what happens? Instead, would it be possible to break up {{analyseBlocksStorageMovementsAndAssignToDN}} into operations that- if any fail due to concurrent modification- could abort and retry? Alternatively (and as in the naive [^HDFS-12911.00.patch]), the server side of the RPC could be "heavier", by taking the lock and running this code server-side. This would counter many of the advantages to running it externally, unfortunately. * {{getSPSEngine}} may leak a heavier abstraction than is necessary. As used, it appears to check for liveness of the remote service, which can be a single call (i.e., instead of checking that both the SPS and namesystem are running, create a call that checks both in the context object), or it's used to get a reference to another component. The context could return those components directly, rather than returning the SPS. It would be cleaner still if the underlying action could be behind the context API, so the caller isn't managing multiple abstractions. * The {{BlockMoveTaskHandler}} and {{FileIDCollector}} abstract the "write" (scheduling moves) and "read" (scan namespace) interfaces? * {{addSPSHint}} has no callers in this patch? Is this future functionality, or only here for symmetry with {{removeSPSHint}}? * Are these calls idempotent? Or would an external provider need to resync with the NN on RPC timeouts? * Since they're stubs in this patch, could we move the external implementation to a separate JIRA? 
Just to keep the patch smaller. > [SPS]: Fix review comments from discussions in HDFS-10285 > - > > Key: HDFS-12911 > URL: https://issues.apache.org/jira/browse/HDFS-12911 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Reporter: Uma Maheswara Rao G >Assignee: Rakesh R > Attachments: HDFS-12911-HDFS-10285-01.patch, > HDFS-12911-HDFS-10285-02.patch, HDFS-12911.00.patch > > > This is the JIRA for tracking the possible improvements or issues discussed > in the main JIRA > So far, the comments to handle: > Daryn: > # The lock should not be kept while executing the placement policy. > # While starting up the NN, SPS Xattr checks happen even if the feature is > disabled. This could potentially impact the startup speed. > UMA: > # I am adding one more possible improvement to reduce Xattr objects > significantly. > The SPS Xattr is a constant object. So, we create one Xattr deduplication object > once statically and use the same object reference whenever we need to add the SPS > Xattr to an Inode. The additional bytes required for storing the SPS Xattr > would then be the same as a single object ref (i.e. 4 bytes on 32-bit). So the Xattr > overhead should come down significantly IMO. Let's explore the feasibility of > this option. > The Xattr list Feature will not be specially created for SPS; that list would have > been created by SetStoragePolicy already on the same directory. So, no extra > Feature creation because of SPS alone. > # Currently SPS puts Long id objects in a Q for tracking SPS-called Inodes. > So, an additional object is created, and its size would be (obj ref + value) = (8 + > 8) bytes [ignoring alignment for the time being]. > The possible improvement here is, instead of creating a new Long obj, we > can keep the existing inode object for tracking. The advantage is that the Inode object is > already maintained in the NN, so no new object creation is needed. We just > need to maintain one obj ref. The above two points should significantly reduce > the memory requirements of SPS. 
So, for an SPS call: 8 bytes for called-inode > tracking + 8 bytes for the Xattr ref. > # Use LightWeightLinkedSet instead of LinkedList for the Q. This will > reduce unnecessary Node creations inside LinkedList. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
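One hypothetical shape for the abort-and-retry alternative suggested in the review above (instead of holding the namesystem read lock across the whole analysis): compute the movement plan against a snapshot, then retry if the namespace changed underneath. Every name here is invented for the sketch; the actual SPS patch exposes different primitives.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of optimistic retry: no lock is held while the plan is computed.
// A version counter stands in for the namespace; real code would detect
// concurrent modification through whatever the Context API exposes.
public class OptimisticRetry {
  static final AtomicLong nsVersion = new AtomicLong(); // bumped on mutation

  interface Analysis {
    String run();
  }

  // Retry until the namespace version is unchanged across one analysis pass,
  // or give up after maxAttempts so the caller can re-queue the file.
  static String analyzeWithRetry(Analysis analysis, int maxAttempts) {
    for (int i = 0; i < maxAttempts; i++) {
      long before = nsVersion.get();
      String plan = analysis.run();   // runs without the namesystem lock
      if (nsVersion.get() == before) {
        return plan;                  // snapshot was consistent; use the plan
      }
    }
    return null;                      // persistent churn: caller retries later
  }

  public static void main(String[] args) {
    System.out.println(analyzeWithRetry(() -> "move blk_1 to ARCHIVE", 3));
  }
}
```

This also sidesteps the failure mode raised above: an external SPS that dies mid-analysis leaves no lock held in the NN.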
[jira] [Commented] (HDFS-12970) HdfsFileStatus#getPath returning null.
[ https://issues.apache.org/jira/browse/HDFS-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308907#comment-16308907 ] Chris Douglas commented on HDFS-12970: -- No worries, thanks for bringing it up. The historical baggage around these APIs has created some unintuitive, and occasionally bizarre, patterns. > HdfsFileStatus#getPath returning null. > -- > > Key: HDFS-12970 > URL: https://issues.apache.org/jira/browse/HDFS-12970 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.1.0 >Reporter: Rushabh S Shah >Priority: Critical > > After HDFS-12681, HdfsFileStatus#getPath() returns null. > I don't think this is expected. > Both implementations of {{HdfsFileStatus}} set the path to null. > {code:title=HdfsNamedFileStatus.java|borderStyle=solid} > HdfsNamedFileStatus(long length, boolean isdir, int replication, > long blocksize, long mtime, long atime, > FsPermission permission, Set flags, > String owner, String group, > byte[] symlink, byte[] path, long fileId, > int childrenNum, FileEncryptionInfo feInfo, > byte storagePolicy, ErasureCodingPolicy ecPolicy) { > super(length, isdir, replication, blocksize, mtime, atime, > HdfsFileStatus.convert(isdir, symlink != null, permission, flags), > owner, group, null, null, -- The last null is for path. 
> HdfsFileStatus.convert(flags)); > {code} > {code:title=HdfsLocatedFileStatus.java|borderStyle=solid} > HdfsLocatedFileStatus(long length, boolean isdir, int replication, > long blocksize, long mtime, long atime, > FsPermission permission, EnumSet flags, > String owner, String group, > byte[] symlink, byte[] path, long fileId, > int childrenNum, FileEncryptionInfo feInfo, > byte storagePolicy, ErasureCodingPolicy ecPolicy, > LocatedBlocks hdfsloc) { > super(length, isdir, replication, blocksize, mtime, atime, > HdfsFileStatus.convert(isdir, symlink != null, permission, flags), > owner, group, null, null, HdfsFileStatus.convert(flags), -- The last > null on this line is for path. > null); > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-12970) HdfsFileStatus#getPath returning null.
[ https://issues.apache.org/jira/browse/HDFS-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas resolved HDFS-12970. -- Resolution: Invalid -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12970) HdfsFileStatus#getPath returning null.
[ https://issues.apache.org/jira/browse/HDFS-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308752#comment-16308752 ] Chris Douglas commented on HDFS-12970: -- bq. While working on HDFS-12811, I was issuing FSNamesystem#getFileInfo call from NamenodeFsck and trying to extract path from HdfsFileStatus I see. This would be {{null}} even before HDFS-12681. bq. If I understand your previous comment correctly, I have to cast the returned file status to either HdfsNamedFileStatus or HdfsLocatedFileStatus explicitly ? The caller needs to invoke {{makeQualified(URI defaultUri, Path parent)}} on the result to set the path. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12970) HdfsFileStatus#getPath returning null.
[ https://issues.apache.org/jira/browse/HDFS-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307033#comment-16307033 ] Chris Douglas commented on HDFS-12970: -- This is by design. {{HdfsFileStatus}} needs to be qualified before it can be returned to the user, since the NameNode doesn't return fully qualified URIs. Prior to HDFS-12681, it was qualified as it was copied into a {{FileStatus}} object. Since {{HdfsFileStatus}} is now a subtype of {{FileStatus}}, the path is initialized as null in the constructor and set later by the client. Are you observing the path is {{null}} when it's returned to the user? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
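To illustrate what qualifying the result means: the client resolves the NameNode's relative path against the filesystem's default URI, roughly as below. This is a simplified stand-in for the shape of {{FileStatus#makeQualified}}, not the actual Hadoop implementation, and the helper names are invented.

```java
import java.net.URI;

// Sketch: the NN returns an unqualified path, and the client joins it with
// the parent directory and resolves against the filesystem's default URI.
public class QualifyExample {
  static URI makeQualified(URI defaultUri, String parent, String name) {
    // An empty name denotes the parent itself (e.g. a getFileInfo on a dir).
    String path = name.isEmpty() ? parent : parent + "/" + name;
    return defaultUri.resolve(path);
  }

  public static void main(String[] args) {
    URI fs = URI.create("hdfs://nn:8020");
    System.out.println(makeQualified(fs, "/user/alice", "data.txt"));
    // prints hdfs://nn:8020/user/alice/data.txt
  }
}
```

This is why {{getPath()}} on a raw, server-side {{HdfsFileStatus}} is {{null}}: only the client knows the default URI to qualify against.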
[jira] [Commented] (HDFS-12915) Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy
[ https://issues.apache.org/jira/browse/HDFS-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306519#comment-16306519 ] Chris Douglas commented on HDFS-12915: -- Thanks, [~eddyxu] > Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy > --- > > Key: HDFS-12915 > URL: https://issues.apache.org/jira/browse/HDFS-12915 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.0.0 >Reporter: Wei-Chiu Chuang >Assignee: Chris Douglas > Fix For: 3.1.0, 3.0.1 > > Attachments: HDFS-12915.00.patch, HDFS-12915.01.patch, > HDFS-12915.02.patch > > > It seems HDFS-12840 creates a new findbugs warning. > Possible null pointer dereference of replication in > org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat.getBlockLayoutRedundancy(BlockType, > Short, Byte) > Bug type NP_NULL_ON_SOME_PATH (click for details) > In class org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat > In method > org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat.getBlockLayoutRedundancy(BlockType, > Short, Byte) > Value loaded from replication > Dereferenced at INodeFile.java:[line 210] > Known null at INodeFile.java:[line 207] > From a quick look at the patch, it seems bogus though. [~eddyxu][~Sammi] > would you please double check? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
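For readers unfamiliar with NP_NULL_ON_SOME_PATH: the warning fires when a boxed value is known to be null on one branch yet unboxed on another. A stripped-down illustration of the pattern and its fix follows; the names only loosely mirror the report and are not the actual INodeFile code.

```java
// Sketch of the NP_NULL_ON_SOME_PATH shape: a boxed Short that is required
// to be null for one block type must be proven non-null before unboxing on
// the other path, otherwise findbugs flags a possible NPE.
public class NullOnSomePath {
  static long layoutRedundancy(boolean striped, Short replication) {
    if (striped) {
      if (replication != null) {
        throw new IllegalArgumentException(
            "replication must be null for striped blocks");
      }
      return 1L << 11; // hypothetical striped flag; no replication bits
    }
    // Fix: exclude null on this path before unboxing, so the dereference
    // below is provably safe.
    if (replication == null) {
      throw new IllegalArgumentException(
          "replication required for contiguous blocks");
    }
    return replication.longValue();
  }

  public static void main(String[] args) {
    System.out.println(layoutRedundancy(false, (short) 3)); // prints 3
  }
}
```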
[jira] [Commented] (HDFS-12915) Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy
[ https://issues.apache.org/jira/browse/HDFS-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306491#comment-16306491 ]

Chris Douglas commented on HDFS-12915:
--------------------------------------

[~eddyxu], [~Sammi], could you take a look at the patch? The findbugs warning has been flagged in every patch build for a few weeks. Please feel free to take this over, but we should commit a solution soon.
[jira] [Commented] (HDFS-12915) Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy
[ https://issues.apache.org/jira/browse/HDFS-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305833#comment-16305833 ]

Chris Douglas commented on HDFS-12915:
--------------------------------------

Please don't hesitate to effect a different refactoring (e.g., removing {{blockType}}) if you prefer it to this patch. There's no reason to rewrite the same utility function multiple times.
[jira] [Updated] (HDFS-12915) Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy
[ https://issues.apache.org/jira/browse/HDFS-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HDFS-12915:
---------------------------------
    Attachment: HDFS-12915.02.patch

Added a check for a replication policy being set on {{STRIPED}} files.
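The null-check pattern behind this fix can be sketched in a few lines. This is a hypothetical simplification, not the actual {{INodeFile$HeaderFormat}} source: the class name, the bit layout, and the exception messages below are invented for illustration. The point is that every path through the method validates its boxed {{Short}} before unboxing it, so findbugs' NP_NULL_ON_SOME_PATH has no path on which a known-null value is dereferenced.

```java
// Hypothetical sketch of the NP_NULL_ON_SOME_PATH fix; names and bit
// layout are illustrative, not Hadoop's actual HeaderFormat encoding.
public class HeaderFormatSketch {
  public enum BlockType { CONTIGUOUS, STRIPED }

  // Illustrative flag bit marking a striped layout in the packed header.
  static final long BLOCK_TYPE_MASK_STRIPED = 1L << 11;

  static long getBlockLayoutRedundancy(BlockType blockType,
                                       Short replication, Byte ecPolicyID) {
    if (blockType == BlockType.STRIPED) {
      // Striped files carry an erasure coding policy, not a replication
      // factor; reject a caller that supplies both.
      if (replication != null) {
        throw new IllegalArgumentException(
            "Replication factor is illegal for STRIPED block type");
      }
      if (ecPolicyID == null) {
        throw new IllegalArgumentException(
            "STRIPED block type requires an erasure coding policy");
      }
      return BLOCK_TYPE_MASK_STRIPED | (ecPolicyID & 0xFF);
    }
    // Contiguous path: prove replication is non-null before unboxing.
    if (replication == null) {
      throw new IllegalArgumentException(
          "CONTIGUOUS block type requires a replication factor");
    }
    return replication; // auto-unboxing is now safe on every path
  }
}
```

Guarding both branches up front is what distinguishes the fix from the code findbugs flagged, where {{replication}} was unboxed on a path the analyzer could show it was null.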