[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode

2018-06-21 Thread Chris Douglas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519841#comment-16519841
 ] 

Chris Douglas commented on HDFS-10285:
--

+1 on [~umamaheswararao]'s proposal. We can refine the code in trunk.

> Storage Policy Satisfier in Namenode
> ------------------------------------
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch, 
> HDFS-10285-consolidated-merge-patch-01.patch, 
> HDFS-10285-consolidated-merge-patch-02.patch, 
> HDFS-10285-consolidated-merge-patch-03.patch, 
> HDFS-10285-consolidated-merge-patch-04.patch, 
> HDFS-10285-consolidated-merge-patch-05.patch, 
> HDFS-SPS-TestReport-20170708.pdf, SPS Modularization.pdf, 
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, 
> Storage-Policy-Satisfier-in-HDFS-May10.pdf, 
> Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. These 
> policies can be set on a directory or file to specify the user's preference for 
> where the physical blocks should be stored. When the user sets the storage 
> policy before writing data, the blocks can take advantage of the policy 
> preferences and the physical blocks are stored accordingly.
> If the user sets the storage policy after the file is written and completed, 
> the blocks will already have been written under the default storage policy 
> (i.e. DISK). The user then has to run the 'Mover tool' explicitly, specifying 
> all such file names as a list. In some distributed-system scenarios (e.g. 
> HBase) it would be difficult to collect all the files and run the tool, since 
> different nodes can write files independently and the files can have different 
> paths.
> Another scenario: when the user renames a file from a directory with one 
> effective storage policy (inherited from the parent directory) into a 
> directory with a different effective policy, the inherited policy is not 
> copied from the source; the file takes its effective policy from the 
> destination's parent. The rename operation is just a metadata change in the 
> Namenode, so the physical blocks still remain under the source storage policy.
> Tracking all such files across distributed nodes (e.g. region servers) and 
> running the Mover tool is therefore difficult for admins. The proposal here is 
> to provide an API in the Namenode itself to trigger storage policy 
> satisfaction. A daemon thread inside the Namenode would track such calls and 
> issue block-movement commands to the DNs.
> Will post the detailed design thoughts document soon.
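The flow proposed above (a client call marks a path; a Namenode-side daemon turns queued paths into block-movement commands) can be caricatured in a few lines. Everything here is invented for illustration (the class, method names, and String-based storage types); it is not the SPS implementation:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Toy model of the satisfier loop: for each queued path, compare the
// storage type its policy asks for with where each block currently
// lives, and emit a movement command for every mismatch.
public class SatisfierSketch {
    private final Deque<String> pending = new ArrayDeque<>();

    // Stand-in for the proposed NN API call: queue a path for satisfaction
    public void satisfyStoragePolicy(String path) {
        pending.add(path);
    }

    // One pass of the daemon thread. blockLocations maps a path to the
    // current storage type of each of its blocks; desired maps a path to
    // the storage type its policy requires. Returns movement commands as
    // "path#blockIndex:from->to" strings.
    public List<String> runOnce(Map<String, List<String>> blockLocations,
                                Map<String, String> desired) {
        List<String> moves = new ArrayList<>();
        while (!pending.isEmpty()) {
            String path = pending.poll();
            String want = desired.get(path);
            List<String> locs = blockLocations.get(path);
            for (int i = 0; i < locs.size(); i++) {
                if (!locs.get(i).equals(want)) {
                    moves.add(path + "#" + i + ":" + locs.get(i) + "->" + want);
                }
            }
        }
        return moves;
    }

    public static void main(String[] args) {
        SatisfierSketch sps = new SatisfierSketch();
        sps.satisfyStoragePolicy("/hbase/wal");  // hypothetical path
        List<String> moves = sps.runOnce(
            Map.of("/hbase/wal", List.of("DISK", "SSD")),
            Map.of("/hbase/wal", "SSD"));
        // Block 0 sits on DISK while the policy wants SSD; block 1 is fine
        System.out.println(moves);  // prints [/hbase/wal#0:DISK->SSD]
    }
}
```

In the actual design the daemon would presumably deliver these as DN commands over heartbeats rather than returning strings; the sketch only shows the queue-and-diff shape of the loop.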



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13186) [PROVIDED Phase 2] Multipart Uploader API

2018-06-17 Thread Chris Douglas (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-13186:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.2.0
   Status: Resolved  (was: Patch Available)

I committed this. Thanks, [~ehiggs]

> [PROVIDED Phase 2] Multipart Uploader API
> -
>
> Key: HDFS-13186
> URL: https://issues.apache.org/jira/browse/HDFS-13186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, 
> HDFS-13186.003.patch, HDFS-13186.004.patch, HDFS-13186.005.patch, 
> HDFS-13186.006.patch, HDFS-13186.007.patch, HDFS-13186.008.patch, 
> HDFS-13186.009.patch, HDFS-13186.010.patch
>
>
> To write files in parallel to an external storage system, as in HDFS-12090, 
> there are two approaches:
>  # Naive approach: use a single datanode per file that copies blocks locally 
> as it streams data to the external service. This requires a copy for each 
> block inside the HDFS system and then another copy to send the block to the 
> external system.
>  # Better approach: a single point (e.g. the Namenode, or an SPS-style 
> external client) and the Datanodes coordinate in a multipart, multinode 
> upload.
> This system needs to work with multiple back ends and needs to coordinate 
> across the network. So we propose an API that resembles the following:
> {code:java}
> public UploadHandle multipartInit(Path filePath) throws IOException;
> public PartHandle multipartPutPart(InputStream inputStream,
> int partNumber, UploadHandle uploadId) throws IOException;
> public void multipartComplete(Path filePath,
> List<Pair<Integer, PartHandle>> handles,
> UploadHandle multipartUploadId) throws IOException;{code}
> Here, UploadHandle and PartHandle are opaque handles in the vein of 
> PathHandle, so they can be serialized and deserialized in the hadoop-hdfs 
> project without knowledge of how to deserialize e.g. S3A's version of an 
> UploadHandle and PartHandle.
> In an object store such as S3A, the implementation is straightforward. In 
> the case of writing multipart/multinode to HDFS, we can write each block as a 
> file part. The complete call will perform a concat on the blocks.
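The init/put-part/complete flow described above can be exercised end to end with a toy in-memory uploader. This is not the Hadoop MultipartUploader API; the class name, the int upload id, and the byte-array "parts" are all invented for the sketch, and the complete step plays the role of the concat on blocks:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy in-memory model of the multipart flow: parts are buffered under
// their part number and concatenated in order on complete.
public class InMemoryMultipartSketch {
    private static class Part {
        final int partNumber;
        final byte[] data;
        Part(int partNumber, byte[] data) { this.partNumber = partNumber; this.data = data; }
    }

    private final List<Part> parts = new ArrayList<>();

    // multipartInit: a real implementation returns an opaque UploadHandle
    public int multipartInit() {
        return 42;  // made-up upload id
    }

    // multipartPutPart: buffer one part under its part number; the return
    // value stands in for a PartHandle
    public int multipartPutPart(InputStream in, int partNumber, int uploadId) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n; (n = in.read(chunk)) != -1; ) buf.write(chunk, 0, n);
            parts.add(new Part(partNumber, buf.toByteArray()));
            return partNumber;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // multipartComplete: concatenate parts in part-number order, the
    // analogue of the concat-on-blocks step described for HDFS
    public byte[] multipartComplete(int uploadId) {
        parts.sort(Comparator.comparingInt(p -> p.partNumber));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Part p : parts) out.write(p.data, 0, p.data.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        InMemoryMultipartSketch up = new InMemoryMultipartSketch();
        int id = up.multipartInit();
        // Parts may arrive out of order, e.g. from different datanodes
        up.multipartPutPart(new ByteArrayInputStream("world".getBytes()), 2, id);
        up.multipartPutPart(new ByteArrayInputStream("hello ".getBytes()), 1, id);
        System.out.println(new String(up.multipartComplete(id)));  // prints "hello world"
    }
}
```

The out-of-order puts in `main` illustrate why complete must order by part number rather than by arrival.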






[jira] [Commented] (HDFS-13186) [PROVIDED Phase 2] Multipart Uploader API

2018-06-15 Thread Chris Douglas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514298#comment-16514298
 ] 

Chris Douglas commented on HDFS-13186:
--

v010 just fixes javadoc errors and adds more param javadoc for 
{{MultipartUploader}}. I'm +1 on this and will commit if Jenkins comes back 
clean.

> [PROVIDED Phase 2] Multipart Uploader API
> -
>
> Key: HDFS-13186
> URL: https://issues.apache.org/jira/browse/HDFS-13186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, 
> HDFS-13186.003.patch, HDFS-13186.004.patch, HDFS-13186.005.patch, 
> HDFS-13186.006.patch, HDFS-13186.007.patch, HDFS-13186.008.patch, 
> HDFS-13186.009.patch, HDFS-13186.010.patch
>
>






[jira] [Updated] (HDFS-13186) [PROVIDED Phase 2] Multipart Uploader API

2018-06-15 Thread Chris Douglas (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-13186:
-
Attachment: HDFS-13186.010.patch

> [PROVIDED Phase 2] Multipart Uploader API
> -
>
> Key: HDFS-13186
> URL: https://issues.apache.org/jira/browse/HDFS-13186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, 
> HDFS-13186.003.patch, HDFS-13186.004.patch, HDFS-13186.005.patch, 
> HDFS-13186.006.patch, HDFS-13186.007.patch, HDFS-13186.008.patch, 
> HDFS-13186.009.patch, HDFS-13186.010.patch
>
>






[jira] [Updated] (HDFS-13186) [PROVIDED Phase 2] Multipart Uploader API

2018-06-15 Thread Chris Douglas (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-13186:
-
Summary: [PROVIDED Phase 2] Multipart Uploader API  (was: [PROVIDED Phase 
2] Multipart Multinode uploader API + Implementations)

> [PROVIDED Phase 2] Multipart Uploader API
> -
>
> Key: HDFS-13186
> URL: https://issues.apache.org/jira/browse/HDFS-13186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, 
> HDFS-13186.003.patch, HDFS-13186.004.patch, HDFS-13186.005.patch, 
> HDFS-13186.006.patch, HDFS-13186.007.patch, HDFS-13186.008.patch, 
> HDFS-13186.009.patch
>
>






[jira] [Updated] (HDFS-13186) [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations

2018-06-14 Thread Chris Douglas (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-13186:
-
Status: Patch Available  (was: Open)

> [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations
> -
>
> Key: HDFS-13186
> URL: https://issues.apache.org/jira/browse/HDFS-13186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, 
> HDFS-13186.003.patch, HDFS-13186.004.patch, HDFS-13186.005.patch, 
> HDFS-13186.006.patch, HDFS-13186.007.patch, HDFS-13186.008.patch, 
> HDFS-13186.009.patch
>
>






[jira] [Commented] (HDFS-13186) [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations

2018-06-11 Thread Chris Douglas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508610#comment-16508610
 ] 

Chris Douglas commented on HDFS-13186:
--

bq. Does the contract test for the local FS pass, after adding support for 
PathHandle?
Sorry, I meant the {{AbstractContractPathHandleTest}}, which verifies that a FS 
implements {{PathHandle}} correctly. v008 moves the {{createPathHandle}} code 
from {{LocalFileSystem}} to {{RawLocalFileSystem}}, since the contract test 
relies on {{append}} support, and adds the contract test. Unfortunately, the 
impl (at least without native support) only reports timestamps with second 
precision, so the contract test grew a {{sleep(1000)}}. v008 fixes the 
javac/checkstyle complaints on v007, but it likely adds others. It also 
slightly refactors the unit test so we can run it against both HDFS and the 
local FS.

If this looks OK to you (and Jenkins), I think it's good to commit.
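The second-precision point above is easy to see with made-up timestamps. This small demo (not code from the patch) shows why two writes within the same second are indistinguishable to a filesystem that truncates mtimes to seconds, and why sleeping past the next second boundary fixes the test:

```java
// Illustrates, with hypothetical timestamp values, why a contract test
// against a second-precision filesystem needs a sleep(1000): two writes
// inside the same second collapse to the same modification time.
public class SecondPrecisionDemo {
    // Model of an FS that only reports mtimes at second granularity
    static long toSecondPrecision(long millis) {
        return (millis / 1000) * 1000;  // drop the sub-second component
    }

    public static void main(String[] args) {
        long firstWrite = 1_528_000_000_123L;     // arbitrary epoch millis
        long secondWrite = firstWrite + 500;      // 500 ms later
        // At millisecond precision the writes are distinguishable...
        assert firstWrite != secondWrite;
        // ...but at second precision they collapse, so a handle keyed on
        // mtime cannot detect that the file changed.
        assert toSecondPrecision(firstWrite) == toSecondPrecision(secondWrite);
        // Sleeping a full second crosses the boundary and restores ordering.
        long afterSleep = firstWrite + 1000;
        assert toSecondPrecision(firstWrite) != toSecondPrecision(afterSleep);
        System.out.println("ok");
    }
}
```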

> [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations
> -
>
> Key: HDFS-13186
> URL: https://issues.apache.org/jira/browse/HDFS-13186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, 
> HDFS-13186.003.patch, HDFS-13186.004.patch, HDFS-13186.005.patch, 
> HDFS-13186.006.patch, HDFS-13186.007.patch, HDFS-13186.008.patch
>
>






[jira] [Updated] (HDFS-13186) [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations

2018-06-11 Thread Chris Douglas (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-13186:
-
Attachment: HDFS-13186.008.patch

> [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations
> -
>
> Key: HDFS-13186
> URL: https://issues.apache.org/jira/browse/HDFS-13186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, 
> HDFS-13186.003.patch, HDFS-13186.004.patch, HDFS-13186.005.patch, 
> HDFS-13186.006.patch, HDFS-13186.007.patch, HDFS-13186.008.patch
>
>






[jira] [Commented] (HDFS-13265) MiniDFSCluster should set reasonable defaults to reduce resource consumption

2018-06-11 Thread Chris Douglas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508406#comment-16508406
 ] 

Chris Douglas commented on HDFS-13265:
--

bq. src/test/resources should only be for unit tests right?
Yes; sorry, I missed test vs main

> MiniDFSCluster should set reasonable defaults to reduce resource consumption
> 
>
> Key: HDFS-13265
> URL: https://issues.apache.org/jira/browse/HDFS-13265
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, test
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-13265-branch-2.000.patch, 
> HDFS-13265-branch-2.000.patch, HDFS-13265.000.patch, HDFS-13265.001.patch, 
> TestMiniDFSClusterThreads.java
>
>
> MiniDFSCluster takes its defaults from {{DFSConfigKeys}}, but many of these 
> are not suitable for a unit-test environment. For example, the default 
> handler thread count of 10 is definitely more than necessary for (almost?) 
> any unit test. We should set reasonable, lower defaults unless a test 
> specifically requires more.






[jira] [Commented] (HDFS-13265) MiniDFSCluster should set reasonable defaults to reduce resource consumption

2018-06-08 Thread Chris Douglas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506759#comment-16506759
 ] 

Chris Douglas commented on HDFS-13265:
--

bq. I'm thinking that we put the MiniDFSCluster into a minimal resource 
configuration by default, and then go through the broken tests and fix them
+1 Sounds good to me.

bq. Would it be possible to add those in 
hadoop-hdfs-project/hadoop-hdfs/src/test/resources/hdfs-default.xml?
They're a little aggressive, aren't they? It might also be disruptive (and 
difficult to debug) for sites using the current defaults.
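For concreteness, the kind of test-resource override discussed here would look something like the fragment below. The property names are standard HDFS configuration keys, but the values are illustrative guesses, not numbers proposed in this thread:

```xml
<!-- Hypothetical src/test/resources override; values are illustrative -->
<configuration>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>2</value> <!-- default is 10; tests rarely need that many -->
  </property>
  <property>
    <name>dfs.datanode.handler.count</name>
    <value>1</value>
  </property>
</configuration>
```

Keeping such a file under src/test/resources (rather than main) avoids the concern above about disrupting sites that rely on the current defaults.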

> MiniDFSCluster should set reasonable defaults to reduce resource consumption
> 
>
> Key: HDFS-13265
> URL: https://issues.apache.org/jira/browse/HDFS-13265
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, test
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-13265-branch-2.000.patch, 
> HDFS-13265-branch-2.000.patch, HDFS-13265.000.patch, HDFS-13265.001.patch, 
> TestMiniDFSClusterThreads.java
>
>






[jira] [Commented] (HDFS-13265) MiniDFSCluster should set reasonable defaults to reduce resource consumption

2018-06-05 Thread Chris Douglas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16502361#comment-16502361
 ] 

Chris Douglas commented on HDFS-13265:
--

Sorry for the intermittent attention to the prerequisites, I'm not sure I've 
paged in all the context. With HDFS-13493 and HDFS-13272 committed, is the 
remaining work in this JIRA only to use the config knobs for 
{{MiniDFSCluster}}, in branch-2?

> MiniDFSCluster should set reasonable defaults to reduce resource consumption
> 
>
> Key: HDFS-13265
> URL: https://issues.apache.org/jira/browse/HDFS-13265
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, test
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-13265-branch-2.000.patch, 
> HDFS-13265-branch-2.000.patch, HDFS-13265.000.patch, 
> TestMiniDFSClusterThreads.java
>
>






[jira] [Commented] (HDFS-13186) [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations

2018-06-05 Thread Chris Douglas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16502249#comment-16502249
 ] 

Chris Douglas commented on HDFS-13186:
--

Only a few minor points:
* {{LocalFileSystemPathHandle}} should be in the {{org.apache.hadoop.fs}} 
package, rather than {{org.apache.hadoop}}
* The changes to {{FileSystem}} no longer appear necessary
* The filesystem field in {{FileSystemMultipartUploader}}, 
{{S3AMultipartUploader}} can be final
* In {{FileSystemMultipartUploader}}, is it an error to have entries with the 
same key in the list?
* Does the contract test for the local FS pass, after adding support for 
{{PathHandle}}?
* {{MultipartUploader}} could use some more javadoc to guide implementers

Otherwise this lgtm.

> [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations
> -
>
> Key: HDFS-13186
> URL: https://issues.apache.org/jira/browse/HDFS-13186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, 
> HDFS-13186.003.patch, HDFS-13186.004.patch, HDFS-13186.005.patch
>
>






[jira] [Commented] (HDFS-13469) RBF: Support InodeID in the Router

2018-06-04 Thread Chris Douglas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501090#comment-16501090
 ] 

Chris Douglas commented on HDFS-13469:
--

bq. If it's only for debugging purposes, I wonder if it's ok we just document 
this isn't supported and one has to do it directly to the NameNode when 
debugging.
It's not a debugging feature. The path-oriented APIs cannot enforce consistent 
references, so recursive traversal, retrying most operations, and most 
concurrent modifications are prohibitively difficult to implement correctly. By 
exposing a reference API, clients can refer to the same object after it's 
moved/renamed, avoid opening a file renamed on top of the intended referent, 
etc.

The current implementation exposes references as inode IDs via a magic path. 
This has the advantage of working with all {{FileSystem}} operations, because 
they all operate on paths. On the other hand, it is brittle and bespoke to 
HDFS. For HDFS utilities, that's not really a problem. For applications, it's 
a semi-public specialization for HDFS.

For RBF, mapping an inode ID to one of several namespaces (when the referent 
may move across namenodes) may not be strictly compatible with every case the 
NN supports today. References may become spuriously invalid (through 
rebalancing), or the range of inode IDs may be restricted by partitioning. An 
alternative approach adds a separate API for references. These would not work 
automatically with all {{FileSystem}} operations; each operation would need to 
be added explicitly, as in HDFS-7878. Its implementation currently uses the 
same metadata as the magic path, but because it is an opaque layer of 
indirection, we could supplement the reference with other metadata.

bq. The only compatible approach is conceptually partitioning the inode id 
ranges to disambiguate the location of the inode id.  Simplest implementation 
is namesystems use distinct non-overlapping ranges but that only works for new 
namespaces.
It's literally partitioning the inode ID range, which has the same tradeoffs as 
HDFS-10867. As you pointed out there, adding or removing NNs from an RBF 
cluster will consume part of the range.

bq. Please elaborate on how this might work?  Ie.  What is and how does the 
opaque thing map to the underlying namesystem?  How will a compatible and 
simple traversal work?
OK, start with semantics equivalent to inode ID partitioning. Instead of 
mapping inode IDs to a partition of the range, store the target namesystem in a 
separate field on the {{PathHandle}} reference. RBF can rewrite the reference 
with its own payload. Instead of a magic path, clients will make similar RPC 
calls that include the reference. The NN will extract and use the inode ID; in 
an RBF deployment, the router can use the metadata it added to route the 
request.
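A minimal sketch of that shape (all names and fields here are hypothetical, 
not the actual {{PathHandle}} wire format): the reference carries the inode ID 
plus a field the router can rewrite to name the target namesystem.

```java
// Hypothetical sketch of an opaque reference: an inode ID plus a field the
// router can rewrite to name the target namesystem. All names are
// illustrative; this is not the actual PathHandle wire format.
public class NsReference {
  final long inodeId;
  final String namesystem; // routing metadata added/rewritten by the router

  NsReference(long inodeId, String namesystem) {
    this.inodeId = inodeId;
    this.namesystem = namesystem;
  }

  // RBF rewrites the reference with its own payload; the NN extracts and
  // uses only the inode ID, while the router uses the namesystem field to
  // route the RPC.
  NsReference withNamesystem(String ns) {
    return new NsReference(inodeId, ns);
  }

  public static void main(String[] args) {
    NsReference ref = new NsReference(16385L, "ns0");
    NsReference moved = ref.withNamesystem("ns1"); // referent rebalanced to ns1
    System.out.println(moved.inodeId + "@" + moved.namesystem); // 16385@ns1
  }
}
```

Because the payload is opaque to clients, rebalancing a referent to another 
namesystem only requires rewriting this field, not repartitioning the ID space.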

Again, partitioning the inode ID space has a much lower implementation cost, 
but there are tradeoffs. If we're accumulating use cases for references, then 
adding a separate API to implement them would (a) work for {{FileSystem}} 
implementations other than HDFS and (b) offer explicit support, rather than an 
undocumented feature. To be clear, I'm not advocating for explicit references 
over magic paths, but if RBF isn't remapping magic paths then it should fail 
instead of blindly forwarding them.

> RBF: Support InodeID in the Router
> --
>
> Key: HDFS-13469
> URL: https://issues.apache.org/jira/browse/HDFS-13469
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Priority: Major
>
> The Namenode supports identifying files through inode identifiers.
> Currently the Router does not handle this properly, we need to add this 
> functionality.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13587) TestQuorumJournalManager fails on Windows

2018-05-23 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487801#comment-16487801
 ] 

Chris Douglas commented on HDFS-13587:
--

+1 lgtm

> TestQuorumJournalManager fails on Windows
> -
>
> Key: HDFS-13587
> URL: https://issues.apache.org/jira/browse/HDFS-13587
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Anbang Hu
>Assignee: Anbang Hu
>Priority: Major
>  Labels: Windows
> Attachments: HDFS-13587.000.patch, HDFS-13587.001.patch, 
> HDFS-13587.002.patch
>
>
> There are 12 test failures in TestQuorumJournalManager on Windows. Local run 
> shows:
> {color:#d04437}[INFO] Running 
> org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager
> [ERROR] Tests run: 21, Failures: 0, Errors: 12, Skipped: 0, Time elapsed: 
> 106.81 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager
> [ERROR] 
> testCrashBetweenSyncLogAndPersistPaxosData(org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager)
>   Time elapsed: 1.93 s  <<< ERROR!
> org.apache.hadoop.hdfs.qjournal.client.QuorumException:
> Could not format one or more JournalNodes. 2 successful responses:
> 127.0.0.1:27044: null [success]
> 127.0.0.1:27064: null [success]
> 1 exceptions thrown:
> 127.0.0.1:27054: Directory 
> E:\OSS\hadoop-branch-2\hadoop-hdfs-project\hadoop-hdfs\target\test\data\dfs\journalnode-1\test-journal
>  is in an inconsistent state: Can't format the storage directory because the 
> current directory is not empty.
> at 
> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.checkEmptyCurrent(Storage.java:498)
> at 
> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:574)
> at 
> org.apache.hadoop.hdfs.qjournal.server.JNStorage.format(JNStorage.java:185)
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.format(Journal.java:221)
> at 
> org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.format(JournalNodeRpcServer.java:157)
> at 
> org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.format(QJournalProtocolServerSideTranslatorPB.java:145)
> at 
> org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25419)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:871)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:817)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1889)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2606)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:286)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.format(QuorumJournalManager.java:212)
> at 
> org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager.setup(TestQuorumJournalManager.java:109)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at 

[jira] [Commented] (HDFS-13587) TestQuorumJournalManager fails on Windows

2018-05-21 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16482852#comment-16482852
 ] 

Chris Douglas commented on HDFS-13587:
--

bq. TestQuorumJournalManager originally uses default getBaseDirectory from 
MiniDFSCluster, which sets the variable to true in MiniDFSCluster
Specifically, it causes {{MiniDFSCluster}} to be loaded, which triggers the 
[static 
block|https://github.com/apache/hadoop/blob/132a547/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java#L161]
 that sets this mode. Since the design relies on static initialization 
dependencies, we're unlikely to find a clean solution. Adding a similar, static 
block to {{MiniJournalCluster}} looks correct, and it would avoid fixing the 
same failure in each QJM test. Do you see a problem with this approach?
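The failure mode can be sketched generically (class and flag names below are 
invented for illustration; they are not the actual Hadoop symbols): a mode 
flag set only in a DFS-side static block is never set when a QJM test loads 
only the journal-side classes, and giving the journal-side class its own 
static block removes the ordering dependency.

```java
// Generic illustration of the static-init dependency; all names invented.
public class StaticInitDemo {
  static class TestMode {
    static volatile boolean miniClusterMode = false; // invented flag
  }
  static class FakeMiniDFSCluster {
    static { TestMode.miniClusterMode = true; } // analogous to MiniDFSCluster's static block
  }
  static class FakeMiniJournalCluster {
    // The proposed fix: set the mode here too, so tests that never load
    // the DFS-side class still run in the right mode.
    static { TestMode.miniClusterMode = true; }
  }

  // Load only the journal-side class; its own static block suffices.
  static boolean loadJournalOnly() throws Exception {
    Class.forName(StaticInitDemo.class.getName() + "$FakeMiniJournalCluster");
    return TestMode.miniClusterMode;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(loadJournalOnly()); // true
  }
}
```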


[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-16 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478432#comment-16478432
 ] 

Chris Douglas commented on HDFS-12615:
--

bq. Any Jira which has not been released, we should certainly revert. 
A blanket veto over a clerical error is not on the table. A set of unrelated 
features and fixes was unhelpfully linked from a single JIRA. Let's clean it 
up and move significant features, like security, to design doc + branch, 
leaving bug fixes and the like to normal dev on trunk, as we would with any 
other module. Particularly for code that only affects RBF, isolating it on a 
branch doesn't do anything for the stability of trunk.

bq. I don't quite agree with the patch count logic, for I have seen patches 
that are more than 200 KB at times. Let us just say "we know a big feature when 
we see it"
I was referring to the 5/9 patches implementing these features. Legalese 
defining when a branch is required is not a good use of anyone's time. Let's 
not belabor it. However, you requested _all_ development move to a branch, 
based on an impression that you explicitly have not verified. Without seeking 
objective criteria that apply to all situations, you need to research this 
particular patchset to ground your claims about _it_.

Let's be concrete. You have the impression that (a) changes committed to trunk 
affect non-RBF deployments and (b) RBF features miss critical cases. Those are 
demonstrable.

bq. I *feel* that a JIRA that has these following features [...] *Sounds like* 
a major undertaking in my mind and *feels like* these do need a branch and 
these are significant features.
RBF has a significant _roadmap_. A flat list of tasks, with mixed granularity, 
is a poor way to organize it.

bq. [...] since they are not very keen on communicating details, I am proposing 
that you move all this work to a branch and bring it back when the whole idea 
is baked
Not everyone participating in RBF development has worked in OSS projects 
before. It's fine to explore ideas in code, collaboratively, in JIRA. Failing 
to signal which JIRAs are variations on a theme (e.g., protobuf boilerplate), 
prototypes, or features affecting non-RBF: that's not OK. Reviewers can't 
follow every JIRA, they need help finding the relevant signal.

Your confidence that people working on RBF are applying reasonable filters and 
*soliciting* others' opinion is extremely important. From a random sampling, 
that seems to be happening. Reviewing the code in issues [~arpitagarwal] 
cited... they may "look non-trivial at a glance", but after a slightly longer 
glance, they look pretty straightforward to me. Or at least, they follow from 
the design of the router.

bq. Perhaps there is a communication problem here. I am not sure where your 
assumption comes from; reading the comments on the security patch, I am not 
able to come to that conclusion. Please take a look at HDFS-12284. [...] If we 
both are reading the same comments, I am at a loss on how you came to the 
conclusion that it was a proposal and not a patch.
Committing a patch that claims to add security, without asking for broader 
review, would be absurd. Reading a discussion about a lack of clarity on 
_delegation tokens_ and concluding that group believes its implementation is 
ready to merge... that requires more assumptions about those developers' intent 
and competence than to conclude "prototype". _However_ if it were being 
developed in a branch, that signal would be unambiguous.

bq. It will benefit the RBF feature as well as future maintainers to have a 
design notes or a detailed change description beyond a 1-line summary because 
most are large patches
From the samples I read, this is a recurring problem. Many RBF JIRAs should 
have more prose around the design tradeoffs, not just comments on the code. 
Taking HDFS-13224 as an example, one would need to read the patch to 
understand what was implemented, and how. Again, most of these follow from the 
RBF design and the larger patch size often comes from PB boilerplate, but 
raising the salient details both for review and for future maintainers (I'm 
glad you brought this up) is not optional.

> Router-based HDFS federation phase 2
> 
>
> Key: HDFS-12615
> URL: https://issues.apache.org/jira/browse/HDFS-12615
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
>  Labels: RBF
>
> This umbrella JIRA tracks set of improvements over the Router-based HDFS 
> federation (HDFS-10467).




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-15 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475338#comment-16475338
 ] 

Chris Douglas commented on HDFS-12615:
--

bq. Could you please do me a favor and move all future work and all the JIRAs 
not released into a branch?
"All future work" is too broad, but developing complex features in branches is 
a fair ask. Would you mind looking through the JIRAs implementing quotas and 
multi-subcluster mounts, as an exercise? Reasonable people can disagree on 
minimum criteria requiring a feature branch, but calling a merge vote for 5-9 
patches is probably excessive.

It is important to distinguish between routine maintenance and significant 
features. For the latter, design docs should be written, experts' opinion 
solicited, and implementation should be in a branch if it requires more than a 
handful of patches. The project can't claim that RBF is secure, then issue a 
flurry of CVEs that could have been caught in design review. In that particular 
case, I assume the patch was shared for discussion, not commit.

bq. As I said, I feel bad the way this project is handled, that is heaping lots 
of critical features without even design documents and all these features going 
directly into the trunk.
[~anu], you should look at the code/JIRAs if you're going to make sweeping 
statements like this. Many features did have design documents. Most JIRAs were 
repairs/improvements to released functionality or straightforward extensions. 
The only data point you've cited is the number of subtasks in this JIRA, which 
is a clerical problem, not a stop-the-world emergency.

Since some subset of these JIRAs are already in a release, even that criterion 
doesn't tie these JIRAs together. Your suggestion, moving JIRAs not part of a 
release out of this umbrella, makes sense. Let's wrap up this issue as whatever 
was released, then transition to (a) significant features in branches with 
subtasks and (b) normal maintenance in individual JIRAs.







[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-14 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475098#comment-16475098
 ] 

Chris Douglas commented on HDFS-12615:
--

bq. Right now, this JIRA has become too big, and it has interwoven JIRAs (small 
and large) – like you mentioned the Quotas or Multi-subcluster mount points in 
the trunk.
Agreed, the number of subtasks for this JIRA belies any coherent theme for 
"Phase II" items. The bug fixes, perf improvements, docs, unit tests, and API 
cleanup can continue on trunk. As with other modules, they can use tags for 
tracking.

bq. I still think at this stage ( perhaps this should have been done much 
earlier) that we should open a new branch and revert the changes in trunk and 
2.9.
Going through the subtasks, this looks like normal development, just 
over-indexed. If a large feature (like security) relies on a web of JIRAs 
and/or impacts HDFS significantly, then as you say, that's easier to manage on 
a branch.







[jira] [Comment Edited] (HDFS-13284) Adjust criteria for LowRedundancyBlocks.QUEUE_VERY_LOW_REDUNDANCY

2018-05-14 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474990#comment-16474990
 ] 

Chris Douglas edited comment on HDFS-13284 at 5/14/18 10:49 PM:


Can you be more explicit about the conditions where these heuristics are 
(in)correct? The description documents the current behavior and the proposed 
change, but not what it resolves. If the problem is that blocks with only two 
replicas should be assigned higher priority than those with three (e.g., 
because you're seeing avoidable high-priority replications), would 
{{((curReplicas * 3) < expectedReplicas || curReplicas == 2)}} resolve this? 
Items added to the distributed cache with high replication may be 
over-prioritized, if we change the heuristic from 1:3 to 1:2.

The patch should also update the javadoc, which documents the priority 
assignment.


was (Author: chris.douglas):
Can you be more explicit about the conditions where these heuristics are 
(in)correct? The description documents the current behavior and the proposed 
change, but not what it resolves. If the problem is that files with only two 
replicas should be assigned higher priority than those with three (e.g., 
because you're seeing avoidable high-priority replications), would 
{{((curReplicas * 3) < expectedReplicas || curReplicas == 2)}} resolve this? 
Items added to the distributed cache with high replication may be 
over-prioritized, if we change the heuristic from 1:3 to 1:2.

The patch should also update the javadoc, which documents the priority 
assignment.

> Adjust criteria for LowRedundancyBlocks.QUEUE_VERY_LOW_REDUNDANCY
> -
>
> Key: HDFS-13284
> URL: https://issues.apache.org/jira/browse/HDFS-13284
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Attachments: HDFS-13284.000.patch, HDFS-13284.001.patch
>
>
> LowRedundancyBlocks currently has 5 priority queues:
> QUEUE_HIGHEST_PRIORITY = 0                         - reserved for last 
> replica blocks
>  QUEUE_VERY_LOW_REDUNDANCY = 1             - *if ((curReplicas * 3) < 
> expectedReplicas)*
>  QUEUE_LOW_REDUNDANCY = 2                         - the rest
>  QUEUE_REPLICAS_BADLY_DISTRIBUTED = 3
>  QUEUE_WITH_CORRUPT_BLOCKS = 4
> The problem lies in  QUEUE_VERY_LOW_REDUNDANCY. Currently, a block that has 
> curReplicas=2 and expectedReplicas=4 is treated the same as a block with 
> curReplicas=3 and expectedReplicas=4. A block with 2/3 replicas is also put 
> into QUEUE_LOW_REDUNDANCY. 
> The proposal is to change the *{{if ((curReplicas * 3) < expectedReplicas)}}* 
> check to *{{if ((curReplicas * 2) <= expectedReplicas || curReplicas == 2)}}*
>  






[jira] [Commented] (HDFS-13284) Adjust criteria for LowRedundancyBlocks.QUEUE_VERY_LOW_REDUNDANCY

2018-05-14 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474990#comment-16474990
 ] 

Chris Douglas commented on HDFS-13284:
--

Can you be more explicit about the conditions where these heuristics are 
(in)correct? The description documents the current behavior and the proposed 
change, but not what it resolves. If the problem is that files with only two 
replicas should be assigned higher priority than those with three (e.g., 
because you're seeing avoidable high-priority replications), would 
{{((curReplicas * 3) < expectedReplicas || curReplicas == 2)}} resolve this? 
Items added to the distributed cache with high replication may be 
over-prioritized, if we change the heuristic from 1:3 to 1:2.

The patch should also update the javadoc, which documents the priority 
assignment.
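To make the cases concrete, here are the current check, the proposed check, 
and the alternative raised above, as bare predicates (a sketch of the 
conditions only, not the actual {{LowRedundancyBlocks}} code):

```java
// Sketch of the predicates under discussion for QUEUE_VERY_LOW_REDUNDANCY;
// an illustration of the tradeoff, not the LowRedundancyBlocks code.
public class RedundancyCheck {
  // current 1:3 heuristic
  static boolean veryLowCurrent(int cur, int expected) {
    return cur * 3 < expected;
  }
  // proposed 1:2 heuristic
  static boolean veryLowProposed(int cur, int expected) {
    return cur * 2 <= expected || cur == 2;
  }
  // alternative raised in the comment: keep 1:3, special-case two replicas
  static boolean veryLowAlternative(int cur, int expected) {
    return cur * 3 < expected || cur == 2;
  }

  public static void main(String[] args) {
    // 2 of 4 replicas: missed today, caught by both alternatives.
    System.out.println(veryLowCurrent(2, 4));      // false
    System.out.println(veryLowProposed(2, 4));     // true
    System.out.println(veryLowAlternative(2, 4));  // true
    // 4 of 10 (e.g., a high-replication distributed-cache item): the 1:2
    // rule promotes this too, which is the over-prioritization concern;
    // the alternative does not.
    System.out.println(veryLowProposed(4, 10));    // true
    System.out.println(veryLowAlternative(4, 10)); // false
  }
}
```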







[jira] [Updated] (HDFS-13272) DataNodeHttpServer to have configurable HttpServer2 threads

2018-05-10 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-13272:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.9.2
   Status: Resolved  (was: Patch Available)

> DataNodeHttpServer to have configurable HttpServer2 threads
> ---
>
> Key: HDFS-13272
> URL: https://issues.apache.org/jira/browse/HDFS-13272
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Fix For: 2.9.2
>
> Attachments: HDFS-13272-branch-2.000.patch, 
> HDFS-13272-branch-2.001.patch, testout-HDFS-13272.001, testout-branch-2
>
>
> In HDFS-7279, the Jetty server on the DataNode was hard-coded to use 10 
> threads. In addition to the possibility of this being too few threads, it is 
> much higher than necessary in resource constrained environments such as 
> MiniDFSCluster. To avoid compatibility issues, rather than using 
> {{HttpServer2#HTTP_MAX_THREADS}} directly, we can introduce a new 
> configuration for the DataNode's thread pool size.






[jira] [Commented] (HDFS-13272) DataNodeHttpServer to have configurable HttpServer2 threads

2018-05-10 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16471243#comment-16471243
 ] 

Chris Douglas commented on HDFS-13272:
--

Thanks [~xkrogen], sorry for the delay.

bq. There were no new failures with the patch applied and it took about the 
same time to run all of them
{{TestClientProtocolForPipelineRecovery}} also fails on branch-2, the other 
tests... often pass. This doesn't seem to introduce any regressions.

+1 I committed this







[jira] [Commented] (HDFS-13186) [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations

2018-05-02 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461912#comment-16461912
 ] 

Chris Douglas commented on HDFS-13186:
--

Thanks for the updated patch. Minor comments still apply, but let's work out 
the high-level first.

bq. MPU is distinct from serial copy since we can upload the data out of order. 
I don't see the use case for a version that breaks this. In this case, I think 
the boilerplate is correct: try to make an MPU; if we can't then fall back to 
FileSystem::write

Fair enough. If the caller controls the partitioning, then it can also decide 
whether the copy should be serial or parallel (e.g., for small files). However, 
this also requires the caller to implement recovery paths for both cases. So a 
user of this library would need to persist their {{UploadHandle}} to abort the 
call for an MPU, and bespoke metadata to recover failed, serial copies. That's 
not unreasonable, but it means that the client will probably fall back to 
copy-in-place or renaming to promote completed output, which (as we've seen 
with S3/WASB and HDFS) doesn't have a good, default answer. If the MPU were 
just a U, then this library could also handle promotion in a FS-efficient way.
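The fallback shape under discussion might look like the following (the 
{{Uploader}} interface and method names are stubs invented for illustration, 
not Hadoop's MultipartUploader API; only the control flow is the point):

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Optional;

// Control-flow sketch only: obtain a multipart uploader for the target
// FileSystem if one exists, otherwise fall back to a serial write. The
// Uploader interface is an invented stub, not Hadoop's API.
public class UploadFallback {
  interface Uploader {
    void upload(InputStream in, String path) throws IOException;
  }

  // Stand-in for a FileSystem that may or may not support MPU.
  static Optional<Uploader> multipartUploader(boolean supported) {
    return supported
        ? Optional.<Uploader>of((in, path) -> { /* parallel multipart upload */ })
        : Optional.empty();
  }

  static String copy(InputStream in, String path, boolean mpuSupported) {
    Optional<Uploader> mpu = multipartUploader(mpuSupported);
    if (mpu.isPresent()) {
      try {
        // Caller must persist the UploadHandle to abort/recover a failed MPU.
        mpu.get().upload(in, path);
        return "multipart";
      } catch (IOException e) {
        // Per the discussion: if the MPU attempt fails, fall back.
      }
    }
    // Serial FileSystem::write path; caller needs bespoke recovery metadata.
    return "serial";
  }
}
```

As the comment notes, the caller carries both recovery paths: the upload 
handle for the multipart branch and its own metadata for the serial branch.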

All that said, if this is not intended as a small framework, but rather a 
common API to parallel upload machinery across S3/WASB and HDFS, then this is 
perfectly fine. To be a general framework, this would also need a partitioning 
strategy (with overrides) to generate splits, which resolve to {{PartHandle}} 
objects. Probably not worth it.

bq. So the numbering of the parts would be internal to the part handle when 
downcasting in the different types. This means we could no longer rely on the 
fact that it's just a String and would then have to introduce all the protobuf 
machinery. That's possible; just wanted to highlight it before I go away and 
implement that.
bq. the filePath needs to follow the upload around because various nodes that 
are calling put part will need to know what type of MultipartUploader to make
bq. If we need to serialize anything, I think we should have it be explicitly 
done through protobuf

As with {{PathHandle}} implementations, PB might be useful, but not required. 
But your point is well taken. Since this would be passed across multiple nodes 
that may include different versions of the impl's MPU on the classpath, that 
could create some unnecessarily complicated logic for the MPU implementer. 
Let's stick with the approach in v004.

> [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations
> -
>
> Key: HDFS-13186
> URL: https://issues.apache.org/jira/browse/HDFS-13186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, 
> HDFS-13186.003.patch, HDFS-13186.004.patch
>
>
> To write files in parallel to an external storage system as in HDFS-12090, 
> there are two approaches:
>  # Naive approach: use a single datanode per file that copies blocks locally 
> as it streams data to the external service. This requires a copy for each 
> block inside the HDFS system and then a copy for the block to be sent to the 
> external system.
>  # Better approach: Single point (e.g. Namenode or SPS style external client) 
> and Datanodes coordinate in a multipart - multinode upload.
> This system needs to work with multiple back ends and needs to coordinate 
> across the network. So we propose an API that resembles the following:
> {code:java}
> public UploadHandle multipartInit(Path filePath) throws IOException;
> public PartHandle multipartPutPart(InputStream inputStream,
> int partNumber, UploadHandle uploadId) throws IOException;
> public void multipartComplete(Path filePath,
> List<Pair<Integer, PartHandle>> handles, 
> UploadHandle multipartUploadId) throws IOException;{code}
> Here, UploadHandle and PartHandle are opaque handles in the vein of 
> PathHandle, so they can be serialized and deserialized in the hadoop-hdfs 
> project without knowledge of how to deserialize e.g. S3A's version of an 
> UploadHandle and PartHandle.
> In an object store such as S3A, the implementation is straightforward. In 
> the case of writing multipart/multinode to HDFS, we can write each block as a 
> file part. The complete call will perform a concat on the blocks.
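The proposed flow can be sketched with a toy in-memory model of the three calls. All types below are simplified stand-ins invented for this sketch, not the real Hadoop interfaces; {{complete}} concatenates parts in part-number order, analogous to the concat-on-blocks approach described above:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Toy in-memory model of the proposed calls. UploadHandle and PartHandle
// stand in for the real opaque, serializable handles.
public class InMemoryUploader {
    static class UploadHandle { final String id = UUID.randomUUID().toString(); }
    static class PartHandle {
        final int partNumber; final byte[] data;
        PartHandle(int n, byte[] d) { partNumber = n; data = d; }
    }

    private final Map<String, List<PartHandle>> uploads = new HashMap<>();
    private final Map<String, byte[]> store = new HashMap<>();

    public UploadHandle multipartInit(String filePath) {
        UploadHandle h = new UploadHandle();
        uploads.put(h.id, new ArrayList<PartHandle>());
        return h;
    }

    public PartHandle multipartPutPart(InputStream in, int partNumber,
                                       UploadHandle uploadId) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (int b = in.read(); b >= 0; b = in.read()) buf.write(b);
        PartHandle p = new PartHandle(partNumber, buf.toByteArray());
        uploads.get(uploadId.id).add(p);
        return p;
    }

    // complete() concatenates parts in part-number order, analogous to the
    // concat performed on blocks in the HDFS case.
    public void multipartComplete(String filePath, List<PartHandle> handles,
                                  UploadHandle uploadId) {
        handles.sort(Comparator.comparingInt(p -> p.partNumber));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (PartHandle p : handles) out.write(p.data, 0, p.data.length);
        store.put(filePath, out.toByteArray());
        uploads.remove(uploadId.id);
    }

    public byte[] read(String filePath) { return store.get(filePath); }

    public static void main(String[] args) throws IOException {
        InMemoryUploader fs = new InMemoryUploader();
        UploadHandle u = fs.multipartInit("/out");
        List<PartHandle> parts = new ArrayList<>();
        // Parts may arrive from different nodes, out of order.
        parts.add(fs.multipartPutPart(new ByteArrayInputStream("world".getBytes()), 2, u));
        parts.add(fs.multipartPutPart(new ByteArrayInputStream("hello ".getBytes()), 1, u));
        fs.multipartComplete("/out", parts, u);
        System.out.println(new String(fs.read("/out"))); // hello world
    }
}
```

The out-of-order puts in {{main}} are the point of the design: any node holding the upload handle can contribute a part, and ordering is resolved only at {{complete}}.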



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13469) RBF: Support InodeID in the Router

2018-05-01 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459854#comment-16459854
 ] 

Chris Douglas commented on HDFS-13469:
--

bq. We can't close a serious oversight/regression in the router by essentially 
throwing TooHardToSupportException.
Exposing and addressing inodes through a magic path is not just hard to extend, 
it's impossible without compromising compatibility. Option one, we bypass the 
restricted ID space by changing the abstraction: add explicit, opaque handles 
to files/directories and use those for directory enumeration/traversal, 
fetching block locations, etc. Option two, we hack it into inodeid and continue 
using the magic path. If there are other options then let's explore them, but 
the constraints are understood and don't need to be belabored.

HDFS-7878 was attempting to introduce option one. It's only in 3.1.0, so if it 
requires extension/modification to serve that purpose then let's do it. Option 
two has tradeoffs, as in HDFS-10867. How would you like to address this for RBF?

> RBF: Support InodeID in the Router
> --
>
> Key: HDFS-13469
> URL: https://issues.apache.org/jira/browse/HDFS-13469
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Priority: Major
>
> The Namenode supports identifying files through inode identifiers.
> Currently the Router does not handle this properly, we need to add this 
> functionality.






[jira] [Commented] (HDFS-13381) [SPS]: Use DFSUtilClient#makePathFromFileId() to prepare satisfier file path

2018-04-30 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459263#comment-16459263
 ] 

Chris Douglas commented on HDFS-13381:
--

bq. We need one unified implementation that uses minimal external calls that 
can be abstracted in the context to either make rpc or fsn calls
Do you have a particular refactoring in mind? In this case, the separate 
implementations batch their requests differently, because the internal 
implementation uses {{FSTreeTraverser}}. Looking at the patch, one could make 
the two implementations more similar, but not significantly more modular or 
easier to maintain. IIRC, each context implements "queue up the next batch of 
work".

Your point on maintenance is well taken, but {{FileCollector}} looks like a 
vestigial abstraction. It's not actually pluggable; the internal/external 
contexts put their scan logic into a class, to match the EC pattern. We are 
only talking about two implementations of an explicitly internal (i.e., 
{{Unstable}}) API...

Is the problem with the {{FSTreeTraverser}} pattern, generally? You've stated 
the goal clearly, but if the traversal is generalized then the internal context 
will do no batching under the lock. Did anyone, including the EC folks, publish 
any benchmarks/testing that justify this abstraction? If the impl benefits from 
it, then as a specialization it has merit, but it should be significant. How 
should we test that?

> [SPS]: Use DFSUtilClient#makePathFromFileId() to prepare satisfier file path
> 
>
> Key: HDFS-13381
> URL: https://issues.apache.org/jira/browse/HDFS-13381
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Major
> Attachments: HDFS-13381-HDFS-10285-00.patch, 
> HDFS-13381-HDFS-10285-01.patch
>
>
> This Jira task will address the following comments:
>  # Use DFSUtilClient::makePathFromFileId, instead of generics(one for string 
> path and another for inodeId) like today.
>  # Only the context impl differs for external/internal sps. Here, it can 
> simply move FileCollector and BlockMoveTaskHandler to Context interface.






[jira] [Commented] (HDFS-13495) RBF: Support Router Admin REST API

2018-04-25 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453285#comment-16453285
 ] 

Chris Douglas commented on HDFS-13495:
--

bq. do you have any insight on why there is no REST API for admin?
Not sure. Often REST APIs are exposed through a gateway, like Knox. I don't 
know if these expose admin operations. IIRC there's also a servlet filter that 
rejects impersonation of admin/system accounts. [~lmc...@apache.org]?

> RBF: Support Router Admin REST API
> --
>
> Key: HDFS-13495
> URL: https://issues.apache.org/jira/browse/HDFS-13495
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Mohammad Arshad
>Priority: Major
>  Labels: RBF
>
> This JIRA intends to add REST API support for all admin commands. Router 
> Admin REST APIs can be useful in managing the Routers from a central 
> management layer tool. 






[jira] [Commented] (HDFS-13186) [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations

2018-04-24 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449420#comment-16449420
 ] 

Chris Douglas commented on HDFS-13186:
--

I like this design. The {{MultiPartUploader}} avoids adding new hooks to 
{{FileSystem}}, without losing generality or pluggability.

High level:
* The current impl doesn't define a default for {{FileSystem}} implementations, 
which could be a serial copy. Instead, it throws an exception. Utilities (like 
{{FsShell}} or YARN) need to implement some boilerplate for both paths, rather 
than using a single path that falls back to a serial upload.
* Some implementations might benefit from an explicit 
{{MultipartUploader::abort}}, which may clean up the partial upload. Clearly it 
can't be guaranteed, but we'd like the property that an {{UploadHandle}} 
persisted to a WAL could be used for cleanup.
* The {{PartHandle}} could retain its ID, rather than providing a 
{{Pair<Integer, PartHandle>}} to {{commit}}. This might make repartitioning 
difficult, i.e., splitting a slow {{PartHandle}}, but implementations could 
provide custom handling if that's important. It would be sufficient for the 
{{PartHandle}} to be {{Comparable}}, though equality should be treated either 
as a duplicate or an error at {{complete}} by the {{MultipartUploader}}.
* Does the {{UploadHandle}} init ever vary, depending on the src? Intra-FS 
copies?
* Right now, the utility doesn't offer an API to partition a file, or to create 
(bounded) {{InputStream}} args to {{putPart}}.
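On the last point, a caller today has to hand-roll a bounded wrapper to carve fixed-size parts out of a source stream. A minimal sketch of what such a utility could look like ({{BoundedStream}} is a name invented here, not an existing class):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Exposes at most `limit` bytes of an underlying stream, so a caller can
// carve one fixed-size part out of a larger source for putPart().
public class BoundedStream extends InputStream {
    private final InputStream in;
    private long remaining;

    public BoundedStream(InputStream in, long limit) {
        this.in = in;
        this.remaining = limit;
    }

    @Override
    public int read() throws IOException {
        if (remaining <= 0) return -1;   // end of this part
        int b = in.read();
        if (b >= 0) remaining--;
        return b;
    }

    public static void main(String[] args) throws IOException {
        InputStream src = new ByteArrayInputStream("0123456789".getBytes());
        // Carve the source into sequential parts of at most 4 bytes each.
        for (int part = 1; src.available() > 0; part++) {
            InputStream slice = new BoundedStream(src, 4);
            StringBuilder sb = new StringBuilder();
            for (int b = slice.read(); b >= 0; b = slice.read()) sb.append((char) b);
            System.out.println("part " + part + ": " + sb); // parts: 0123, 4567, 89
        }
    }
}
```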

Minor:
{{MultipartUploader}}
* IIRC {{ClassUtil::findContainingJar}} is expensive; guard using {{if 
(LOG.isDebugEnabled)}}
* Might be worth warning if two {{MultipartUploader}} impls report the same 
scheme
* The typical {{ServiceLoader}} pattern returns a factory object that produces 
instances, rather than matching the class and producing instances by 
reflection. This way, the factory instance can make some additional checks 
and/or handle scheme collisions by chaining. It also avoids the {{initialize}} 
pattern, since the factory can invoke the cstr. The lookup is slower (a scan of 
loaded uploaders vs. a map lookup), but acceptable for a small number of 
instances.
* Do {{putPart}} and {{complete}} need the {{filePath}} parameter?
* Rename {{MultipartUploader}} to {{Uploader}}?
{{BBPartHandle}}
* While it's unlikely to be an issue, direct {{ByteBuffer}}s don't support 
{{array()}}. Is this to support {{Serializable}}?
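To illustrate the factory suggestion above: each factory claims or declines a scheme, so collisions resolve by chaining (first match wins), and the factory calls the constructor itself instead of relying on an {{initialize}} step. A real deployment would discover factories via {{java.util.ServiceLoader}} (META-INF/services registration is elided); a fixed list stands in for it here, and all names are hypothetical:

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

// Sketch of the factory variant of the ServiceLoader pattern: each factory
// either serves a scheme or returns null, so collisions resolve by chaining,
// and the factory invokes the constructor directly.
public class UploaderFactories {
    interface Uploader { String name(); }
    interface UploaderFactory {
        Uploader createFor(URI uri);  // null means "not my scheme"
    }

    static class HdfsFactory implements UploaderFactory {
        public Uploader createFor(URI uri) {
            if (!"hdfs".equals(uri.getScheme())) return null;
            return () -> "hdfs-uploader";
        }
    }

    static class S3Factory implements UploaderFactory {
        public Uploader createFor(URI uri) {
            if (!"s3a".equals(uri.getScheme())) return null;
            return () -> "s3a-uploader";
        }
    }

    // Linear scan of loaded factories; slower than a map lookup, but fine
    // for a handful of implementations.
    static Uploader lookup(URI uri, List<UploaderFactory> factories) {
        for (UploaderFactory f : factories) {
            Uploader u = f.createFor(uri);
            if (u != null) return u;
        }
        throw new IllegalArgumentException("no uploader for " + uri);
    }

    public static void main(String[] args) {
        List<UploaderFactory> fs = Arrays.asList(new HdfsFactory(), new S3Factory());
        System.out.println(lookup(URI.create("s3a://bucket/key"), fs).name()); // s3a-uploader
    }
}
```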

> [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations
> -
>
> Key: HDFS-13186
> URL: https://issues.apache.org/jira/browse/HDFS-13186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, 
> HDFS-13186.003.patch
>
>
> To write files in parallel to an external storage system as in HDFS-12090, 
> there are two approaches:
>  # Naive approach: use a single datanode per file that copies blocks locally 
> as it streams data to the external service. This requires a copy for each 
> block inside the HDFS system and then a copy for the block to be sent to the 
> external system.
>  # Better approach: Single point (e.g. Namenode or SPS style external client) 
> and Datanodes coordinate in a multipart - multinode upload.
> This system needs to work with multiple back ends and needs to coordinate 
> across the network. So we propose an API that resembles the following:
> {code:java}
> public UploadHandle multipartInit(Path filePath) throws IOException;
> public PartHandle multipartPutPart(InputStream inputStream,
> int partNumber, UploadHandle uploadId) throws IOException;
> public void multipartComplete(Path filePath,
> List<Pair<Integer, PartHandle>> handles, 
> UploadHandle multipartUploadId) throws IOException;{code}
> Here, UploadHandle and PartHandle are opaque handles in the vein of 
> PathHandle, so they can be serialized and deserialized in the hadoop-hdfs 
> project without knowledge of how to deserialize e.g. S3A's version of an 
> UploadHandle and PartHandle.
> In an object store such as S3A, the implementation is straightforward. In 
> the case of writing multipart/multinode to HDFS, we can write each block as a 
> file part. The complete call will perform a concat on the blocks.






[jira] [Commented] (HDFS-13283) Percentage based Reserved Space Calculation for DataNode

2018-04-23 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448701#comment-16448701
 ] 

Chris Douglas commented on HDFS-13283:
--

[~lukmajercak], could you regenerate the patch? +1 on the changes

> Percentage based Reserved Space Calculation for DataNode
> 
>
> Key: HDFS-13283
> URL: https://issues.apache.org/jira/browse/HDFS-13283
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Attachments: HDFS-13283.000.patch, HDFS-13283.001.patch, 
> HDFS-13283.002.patch, HDFS-13283.003.patch, HDFS-13283.004.patch, 
> HDFS-13283.005.patch, HDFS-13283.006.patch
>
>
> Currently, the only way to configure reserved disk space for non-HDFS data on 
> a DataNode is a constant value via {{dfs.datanode.du.reserved}}. This can be 
> an issue in heterogeneous clusters, where the size of DNs can differ. The 
> proposed solution is to allow percentage based configuration (and their 
> combination):
>  # ABSOLUTE
>  ** based on absolute number of reserved space
>  # PERCENTAGE
>  ** based on percentage of total capacity in the storage
>  # CONSERVATIVE
>  ** calculates both of the above and takes the one that will yield more 
> reserved space
>  # AGGRESSIVE
>  ** calculates both of the above and takes the one that will yield less 
> reserved space
>  
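Numerically, the four modes above reduce to a max/min over the two base calculations. A sketch (method names and figures are illustrative, not the patch's actual API):

```java
// Illustrative computation of the four proposed reservation modes.
// Method names and figures are made up for this sketch.
public class ReservedSpaceModes {
    static long absolute(long reservedBytes) { return reservedBytes; }
    static long percentage(long capacity, int pct) { return capacity * pct / 100; }
    // CONSERVATIVE: reserve the larger of the two results.
    static long conservative(long capacity, long reservedBytes, int pct) {
        return Math.max(absolute(reservedBytes), percentage(capacity, pct));
    }
    // AGGRESSIVE: reserve the smaller of the two results.
    static long aggressive(long capacity, long reservedBytes, int pct) {
        return Math.min(absolute(reservedBytes), percentage(capacity, pct));
    }

    public static void main(String[] args) {
        long capacity = 10L << 30;       // a 10 GiB storage
        long reserved = 1L << 30;        // 1 GiB absolute reservation
        int pct = 20;                    // or 20% of capacity = 2 GiB
        System.out.println(conservative(capacity, reserved, pct)); // 2147483648 (2 GiB)
        System.out.println(aggressive(capacity, reserved, pct));   // 1073741824 (1 GiB)
    }
}
```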



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory

2018-04-23 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-13408:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.3
   2.9.2
   3.1.1
   3.2.0
   2.10.0
   Status: Resolved  (was: Patch Available)

I committed this. Thanks [~surmountian]

> MiniDFSCluster to support being built on randomized base directory
> --
>
> Key: HDFS-13408
> URL: https://issues.apache.org/jira/browse/HDFS-13408
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Xiao Liang
>Assignee: Xiao Liang
>Priority: Major
>  Labels: windows
> Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.3
>
> Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch, 
> HDFS-13408.002.patch, HDFS-13408.003.patch, HDFS-13408.004.patch
>
>
> Generated files of MiniDFSCluster during test are not properly cleaned in 
> Windows, which fails all subsequent test cases using the same default 
> directory (Windows does not allow other processes to delete them). By 
> migrating to randomized base directories, conflicts between the test paths of 
> different test cases will be avoided, even if they are running at the same time.






[jira] [Commented] (HDFS-10867) [PROVIDED Phase 2] Block Bit Field Allocation of Provided Storage

2018-04-20 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446483#comment-16446483
 ] 

Chris Douglas commented on HDFS-10867:
--

Looking through the patch for HDFS-13350, it looks like it's handling false 
positives where legacy blocks were treated as EC blocks, not the other way 
around. If we're already identifying legacy blocks correctly, then we need to 
avoid the same pitfall, but the rest of the code should be OK.

> [PROVIDED Phase 2] Block Bit Field Allocation of Provided Storage
> -
>
> Key: HDFS-10867
> URL: https://issues.apache.org/jira/browse/HDFS-10867
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Ewan Higgs
>Priority: Major
> Attachments: Block Bit Field Allocation of Provided Storage.pdf
>
>
> We wish to design and implement the following related features for provided 
> storage:
> # Dynamic mounting of provided storage within a Namenode (mount, unmount)
> # Mount multiple provided storage systems on a single Namenode.
> # Support updates to the provided storage system without having to regenerate 
> an fsimg.
> A mount in the namespace addresses a corresponding set of block data. When 
> unmounted, any block data associated with the mount becomes invalid and 
> (eventually) unaddressable in HDFS. As with erasure-coded blocks, efficient 
> unmounting requires that all blocks with that attribute be identifiable by 
> the block management layer
> In this subtask, we focus on changes and conventions to the block management 
> layer. Namespace operations are covered in a separate subtask.






[jira] [Commented] (HDFS-13272) DataNodeHttpServer to have configurable HttpServer2 threads

2018-04-20 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446469#comment-16446469
 ] 

Chris Douglas commented on HDFS-13272:
--

Thanks, [~kihwal]. [~xkrogen], does this unblock you on HDFS-13265?

> DataNodeHttpServer to have configurable HttpServer2 threads
> ---
>
> Key: HDFS-13272
> URL: https://issues.apache.org/jira/browse/HDFS-13272
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
>
> In HDFS-7279, the Jetty server on the DataNode was hard-coded to use 10 
> threads. In addition to the possibility of this being too few threads, it is 
> much higher than necessary in resource constrained environments such as 
> MiniDFSCluster. To avoid compatibility issues, rather than using 
> {{HttpServer2#HTTP_MAX_THREADS}} directly, we can introduce a new 
> configuration for the DataNode's thread pool size.






[jira] [Commented] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory

2018-04-20 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446457#comment-16446457
 ] 

Chris Douglas commented on HDFS-13408:
--

For completeness, added v004, which treats a {{null}} base directory passed to 
the new cstr as an error.

> MiniDFSCluster to support being built on randomized base directory
> --
>
> Key: HDFS-13408
> URL: https://issues.apache.org/jira/browse/HDFS-13408
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Xiao Liang
>Assignee: Xiao Liang
>Priority: Major
>  Labels: windows
> Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch, 
> HDFS-13408.002.patch, HDFS-13408.003.patch, HDFS-13408.004.patch
>
>
> Generated files of MiniDFSCluster during test are not properly cleaned in 
> Windows, which fails all subsequent test cases using the same default 
> directory (Windows does not allow other processes to delete them). By 
> migrating to randomized base directories, conflicts between the test paths of 
> different test cases will be avoided, even if they are running at the same time.






[jira] [Updated] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory

2018-04-20 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-13408:
-
Attachment: HDFS-13408.004.patch

> MiniDFSCluster to support being built on randomized base directory
> --
>
> Key: HDFS-13408
> URL: https://issues.apache.org/jira/browse/HDFS-13408
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Xiao Liang
>Assignee: Xiao Liang
>Priority: Major
>  Labels: windows
> Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch, 
> HDFS-13408.002.patch, HDFS-13408.003.patch, HDFS-13408.004.patch
>
>
> Generated files of MiniDFSCluster during test are not properly cleaned in 
> Windows, which fails all subsequent test cases using the same default 
> directory (Windows does not allow other processes to delete them). By 
> migrating to randomized base directories, conflicts between the test paths of 
> different test cases will be avoided, even if they are running at the same time.






[jira] [Commented] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory

2018-04-20 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446454#comment-16446454
 ] 

Chris Douglas commented on HDFS-13408:
--

bq. there should be file read/write conflicts between test cases using the same 
base dir and test data file names, if running in parallel.
OK. Perhaps we could [use the ID from 
surefire|https://maven.apache.org/surefire/maven-surefire-plugin/examples/fork-options-and-parallel-execution.html]
 to avoid these collisions. We still need a solution for tests run 
sequentially, and this is a good workaround.
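To sketch the surefire idea: the plugin can substitute a per-fork number into a system property, which the test base directory then incorporates so parallel forks never collide. The property name {{test.fork.id}} and the helper below are assumptions for this sketch; the pom wiring of {{surefire.forkNumber}} is elided:

```java
// Sketch: derive a per-fork base directory from a system property that
// surefire would populate with its fork number (pom wiring elided).
// The property name "test.fork.id" is an assumption for this sketch.
public class ForkScopedBaseDir {
    static String baseDir(String root, String forkId) {
        // Fall back to a single well-known dir when not run under surefire.
        String id = (forkId == null || forkId.isEmpty()) ? "1" : forkId;
        return root + "/fork-" + id;
    }

    public static void main(String[] args) {
        System.out.println(baseDir("target/test/data",
            System.getProperty("test.fork.id")));
    }
}
```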

bq. Regarding the warning for setting both approaches, I'm fine with it and I 
would even vote for an exception to avoid having inconsistencies
Sounds right. Updated patch, which also rolls back the slf4j change.

> MiniDFSCluster to support being built on randomized base directory
> --
>
> Key: HDFS-13408
> URL: https://issues.apache.org/jira/browse/HDFS-13408
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Xiao Liang
>Assignee: Xiao Liang
>Priority: Major
>  Labels: windows
> Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch, 
> HDFS-13408.002.patch, HDFS-13408.003.patch
>
>
> Generated files of MiniDFSCluster during test are not properly cleaned in 
> Windows, which fails all subsequent test cases using the same default 
> directory (Windows does not allow other processes to delete them). By 
> migrating to randomized base directories, conflicts between the test paths of 
> different test cases will be avoided, even if they are running at the same time.






[jira] [Updated] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory

2018-04-20 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-13408:
-
Attachment: HDFS-13408.003.patch

> MiniDFSCluster to support being built on randomized base directory
> --
>
> Key: HDFS-13408
> URL: https://issues.apache.org/jira/browse/HDFS-13408
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Xiao Liang
>Assignee: Xiao Liang
>Priority: Major
>  Labels: windows
> Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch, 
> HDFS-13408.002.patch, HDFS-13408.003.patch
>
>
> Generated files of MiniDFSCluster during test are not properly cleaned in 
> Windows, which fails all subsequent test cases using the same default 
> directory (Windows does not allow other processes to delete them). By 
> migrating to randomized base directories, conflicts between the test paths of 
> different test cases will be avoided, even if they are running at the same time.






[jira] [Commented] (HDFS-13381) [SPS]: Use DFSUtilClient#makePathFromFileId() to prepare satisfier file path

2018-04-19 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16445356#comment-16445356
 ] 

Chris Douglas commented on HDFS-13381:
--

[~daryn]: to over-simplify [~rakeshr]'s description, the 
{{IntraSPSNameNodeFileIdCollector}} adds additional logic for batching, i.e., 
implementing {{FSTreeTraverser}} hooks, borrowing the pattern from encryption 
zones. The external/internal separation seems intact, as both {{FileCollector}} 
implementations are only accessible through their respective {{Context}}. This 
might be emphasized by removing the {{FileCollector}} interface, since it 
serves no particular purpose. Would that be sufficient?

> [SPS]: Use DFSUtilClient#makePathFromFileId() to prepare satisfier file path
> 
>
> Key: HDFS-13381
> URL: https://issues.apache.org/jira/browse/HDFS-13381
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Major
> Attachments: HDFS-13381-HDFS-10285-00.patch, 
> HDFS-13381-HDFS-10285-01.patch
>
>
> This Jira task will address the following comments:
>  # Use DFSUtilClient::makePathFromFileId, instead of generics(one for string 
> path and another for inodeId) like today.
>  # Only the context impl differs for external/internal sps. Here, it can 
> simply move FileCollector and BlockMoveTaskHandler to Context interface.






[jira] [Commented] (HDFS-13283) Percentage based Reserved Space Calculation for DataNode

2018-04-19 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16445289#comment-16445289
 ] 

Chris Douglas commented on HDFS-13283:
--

Wrong syntax for the inner class in {{hdfs-default.xml}}. Fixed, trying again.

> Percentage based Reserved Space Calculation for DataNode
> 
>
> Key: HDFS-13283
> URL: https://issues.apache.org/jira/browse/HDFS-13283
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Attachments: HDFS-13283.000.patch, HDFS-13283.001.patch, 
> HDFS-13283.002.patch, HDFS-13283.003.patch, HDFS-13283.004.patch, 
> HDFS-13283.005.patch
>
>
> Currently, the only way to configure reserved disk space for non-HDFS data on 
> a DataNode is a constant value via {{dfs.datanode.du.reserved}}. This can be 
> an issue in heterogeneous clusters, where the size of DNs can differ. The 
> proposed solution is to allow percentage based configuration (and their 
> combination):
>  # ABSOLUTE
>  ** based on absolute number of reserved space
>  # PERCENTAGE
>  ** based on percentage of total capacity in the storage
>  # CONSERVATIVE
>  ** calculates both of the above and takes the one that will yield more 
> reserved space
>  # AGGRESSIVE
>  ** calculates both of the above and takes the one that will yield less 
> reserved space
>  






[jira] [Updated] (HDFS-13283) Percentage based Reserved Space Calculation for DataNode

2018-04-19 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-13283:
-
Attachment: HDFS-13283.005.patch

> Percentage based Reserved Space Calculation for DataNode
> 
>
> Key: HDFS-13283
> URL: https://issues.apache.org/jira/browse/HDFS-13283
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Attachments: HDFS-13283.000.patch, HDFS-13283.001.patch, 
> HDFS-13283.002.patch, HDFS-13283.003.patch, HDFS-13283.004.patch, 
> HDFS-13283.005.patch
>
>
> Currently, the only way to configure reserved disk space for non-HDFS data on 
> a DataNode is a constant value via {{dfs.datanode.du.reserved}}. This can be 
> an issue in heterogeneous clusters, where the size of DNs can differ. The 
> proposed solution is to allow percentage based configuration (and their 
> combination):
>  # ABSOLUTE
>  ** based on absolute number of reserved space
>  # PERCENTAGE
>  ** based on percentage of total capacity in the storage
>  # CONSERVATIVE
>  ** calculates both of the above and takes the one that will yield more 
> reserved space
>  # AGGRESSIVE
>  ** calculates both of the above and takes the one that will yield less 
> reserved space
>  






[jira] [Updated] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory

2018-04-19 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-13408:
-
Attachment: HDFS-13408.002.patch

> MiniDFSCluster to support being built on randomized base directory
> --
>
> Key: HDFS-13408
> URL: https://issues.apache.org/jira/browse/HDFS-13408
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Xiao Liang
>Assignee: Xiao Liang
>Priority: Major
>  Labels: windows
> Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch, 
> HDFS-13408.002.patch
>
>
> Generated files of MiniDFSCluster during test are not properly cleaned in 
> Windows, which fails all subsequent test cases using the same default 
> directory (Windows does not allow other processes to delete them). By 
> migrating to randomized base directories, conflicts between the test paths of 
> different test cases will be avoided, even if they are running at the same time.






[jira] [Updated] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory

2018-04-19 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-13408:
-
Attachment: HDFS-13408.002.patch

> MiniDFSCluster to support being built on randomized base directory
> --
>
> Key: HDFS-13408
> URL: https://issues.apache.org/jira/browse/HDFS-13408
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Xiao Liang
>Assignee: Xiao Liang
>Priority: Major
>  Labels: windows
> Attachments: HDFS-13408.000.patch, HDFS-13408.001.patch
>
>
> Generated files of MiniDFSCluster during test are not properly cleaned in 
> Windows, which fails all subsequent test cases using the same default 
> directory (Windows does not allow other processes to delete them). By 
> migrating to randomized base directories, conflicts between the test paths of 
> different test cases will be avoided, even if they are running at the same time.






[jira] [Updated] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory

2018-04-19 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-13408:
-
Attachment: (was: HDFS-13408.002.patch)







[jira] [Commented] (HDFS-13283) Percentage based Reserved Space Calculation for DataNode

2018-04-19 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16445046#comment-16445046
 ] 

Chris Douglas commented on HDFS-13283:
--

Rebased patch. Minor changes:
* Renamed the static inner class {{ReservedSpaceCalculatorBuilder}} to {{Builder}}
* Changed an invalid spec to throw {{IllegalArgumentException}} instead of 
falling back to absolute
* Changed {{ReservedSpaceCalculatorAggressive}} to extend 
{{ReservedSpaceCalculator}} rather than another, similar option
* Moved creation of {{DF}} into the {{Builder}}

Missed [~virajith]'s comments on {{hdfs-default}}, +1 after those are addressed.

> Percentage based Reserved Space Calculation for DataNode
> 
>
> Key: HDFS-13283
> URL: https://issues.apache.org/jira/browse/HDFS-13283
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Attachments: HDFS-13283.000.patch, HDFS-13283.001.patch, 
> HDFS-13283.002.patch, HDFS-13283.003.patch, HDFS-13283.004.patch
>
>
> Currently, the only way to configure reserved disk space for non-HDFS data on 
> a DataNode is a constant value via {{dfs.datanode.du.reserved}}. This can be 
> an issue in heterogeneous clusters where the sizes of DNs differ. The 
> proposed solution is to allow percentage-based configuration (and their 
> combination):
>  # ABSOLUTE
>  ** based on an absolute amount of reserved space
>  # PERCENTAGE
>  ** based on a percentage of the total capacity of the storage
>  # CONSERVATIVE
>  ** calculates both of the above and takes the one that yields more 
> reserved space
>  # AGGRESSIVE
>  ** calculates 1. and 2. and takes the one that yields less reserved space
>  
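The four policies enumerated above reduce to a pair of min/max comparisons. The sketch below is illustrative only; the class name, method names, and example values are stand-ins, not the actual {{ReservedSpaceCalculator}} API from the patch:

```java
// Hypothetical sketch of the four reserved-space policies; names and
// signatures are illustrative, not the patch's actual API.
public class ReservedSpaceSketch {
    // ABSOLUTE: a fixed number of reserved bytes
    static long absolute(long reservedBytes) {
        return reservedBytes;
    }

    // PERCENTAGE: a fraction of the storage's total capacity
    static long percentage(long capacityBytes, double pct) {
        return (long) (capacityBytes * pct / 100.0);
    }

    // CONSERVATIVE: takes whichever of the two reserves MORE space
    static long conservative(long capacityBytes, long reservedBytes, double pct) {
        return Math.max(absolute(reservedBytes), percentage(capacityBytes, pct));
    }

    // AGGRESSIVE: takes whichever of the two reserves LESS space
    static long aggressive(long capacityBytes, long reservedBytes, double pct) {
        return Math.min(absolute(reservedBytes), percentage(capacityBytes, pct));
    }

    public static void main(String[] args) {
        long capacity = 1000L * 1024 * 1024 * 1024; // example: 1000 GiB volume
        long fixed = 50L * 1024 * 1024 * 1024;      // example: 50 GiB absolute
        double pct = 10.0;                          // example: 10% of capacity
        System.out.println(conservative(capacity, fixed, pct)); // 10% (100 GiB) wins
        System.out.println(aggressive(capacity, fixed, pct));   // 50 GiB wins
    }
}
```

On a large disk the percentage dominates and CONSERVATIVE picks it; on a small disk the relationship flips, which is the point of offering both combinations.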






[jira] [Updated] (HDFS-13283) Percentage based Reserved Space Calculation for DataNode

2018-04-19 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-13283:
-
Attachment: HDFS-13283.004.patch







[jira] [Commented] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory

2018-04-19 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444963#comment-16444963
 ] 

Chris Douglas commented on HDFS-13408:
--

bq. we are changing to logger here; is it fine to do it here given we want to 
commit all the way to branch-2.9?
Oh, if other projects/tools write to {{MiniDFSCluster.LOG}}? The field is 
{{private}} and we use the slf4j bindings in branch-2 e.g., HADOOP-15394... but 
it's not necessary, I just changed it by habit. Should I change it back?







[jira] [Comment Edited] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory

2018-04-19 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444963#comment-16444963
 ] 

Chris Douglas edited comment on HDFS-13408 at 4/19/18 11:10 PM:


bq. we are changing to logger here; is it fine to do it here given we want to 
commit all the way to branch-2.9?
Oh, if other projects/tools write to {{MiniDFSCluster.LOG}}? The field is 
{{private}} and we use the slf4j bindings in branch-2 e.g., HADOOP-15394... but 
it's not necessary, I just changed it by habit. Should I change it back?


was (Author: chris.douglas):
bq. we are changing to logger here; is it fine to do it here given we want to 
commit all the way to branch-2.9?
Oh, if other projects/tools write to {{MiniDFSCluster.LOG}}? The field is 
{{private}} and we use the slf4j bindings in branch-2 e.g., HDFS-15394... but 
it's not necessary, I just changed it by habit. Should I change it back?







[jira] [Updated] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory

2018-04-19 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-13408:
-
Attachment: HDFS-13408.001.patch







[jira] [Commented] (HDFS-13408) MiniDFSCluster to support being built on randomized base directory

2018-04-19 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444921#comment-16444921
 ] 

Chris Douglas commented on HDFS-13408:
--

Sorry for the delay. Would the failing tests also fail to run in parallel, even 
on non-Windows platforms? Is it clear why cleanup fails on Windows?

[~surmountian]: just to clarify/document the problem this is solving, many 
tests rely on the default base directory by hard-coding the static 
{{MiniDFSCluster::getBaseDirectory}}, assuming it will match the base directory 
of the {{MiniDFSCluster}}. To avoid breaking tests outside the project that use 
this pattern, this JIRA proposes to add optional randomization to the 
{{Builder}}, then fix failing tests by enabling this option (e.g., HDFS-13336). 
Is that correct?

Adding {{getRandomizedDfsBaseDir}} to the {{Builder}} requires the creator to 
retain a reference to the {{Builder}} or the {{Configuration}}. It might be 
cleaner to pass an optional base directory into the {{Builder}} constructor, 
which defaults to {{getBaseDirectory}}, then explicitly use 
{{GenericTestUtils::getRandomizedTempPath}} from tests. Attaching a sketch as 
v001, but +1 on v000 if you prefer it.
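The alternative could look roughly like the following; {{MiniDFSCluster}}'s real {{Builder}} has many more options, and the names and paths below are simplified stand-ins for illustration only:

```java
// Illustrative sketch (not the actual MiniDFSCluster API) of a Builder that
// takes an optional base directory, defaulting to the fixed base path.
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

public class BuilderSketch {
    // stand-in for the fixed default base directory
    static Path getBaseDirectory() {
        return Paths.get("target", "test", "data");
    }

    // stand-in for GenericTestUtils::getRandomizedTempPath
    static Path getRandomizedTempPath() {
        return Paths.get("target", "test", "data-" + UUID.randomUUID());
    }

    static class Builder {
        private final Path baseDir;

        Builder() {
            this(getBaseDirectory()); // default: the fixed base path
        }

        Builder(Path baseDir) {
            this.baseDir = baseDir;   // caller-supplied (e.g., randomized) path
        }

        Path baseDir() {
            return baseDir;
        }
    }

    public static void main(String[] args) {
        // Tests that need isolation opt in explicitly at construction time:
        Builder randomized = new Builder(getRandomizedTempPath());
        Builder fixed = new Builder();
        System.out.println(fixed.baseDir());
        System.out.println(randomized.baseDir());
    }
}
```

The caller no longer needs to hold the {{Builder}} or {{Configuration}} afterward; the randomized path is decided once, up front.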

bq. After creating the random DFS Base Dir you set that value to 
HDFS_MINIDFS_BASEDIR. What if we have already this value set?
Should this be an error/warning? If the caller asks that the base 
directory be both randomized and set explicitly, those directives contradict 
each other.







[jira] [Commented] (HDFS-13469) RBF: Support InodeID in the Router

2018-04-19 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1637#comment-1637
 ] 

Chris Douglas commented on HDFS-13469:
--

{quote}
It's really as simple as:  the path /.reserved/.inodes/123 will access inode 
123 which may really be /user/daryn/dir/mystuff.  The nfs implementation relies 
on inode paths.

HDFS-7878 cannot solve the problem because it's an abstraction for a different 
purpose.
{quote}
Agreed, but HDFS-7878 is implemented using the inode path. If the router 
doesn't encode the namespace in the INodeID, then it could be a payload in a 
{{PathHandle}}. Right now, that covers only one case for the inode path i.e., 
{{open}}. [~elgoiri] likely raised it because we could add handling for other 
operations. That might recommend moving the logic from the client to the NN.

The router could remap the INodeIDs to encode the namespace, but if the 
namespace is rebalanced, then INodeIDs could become invalid even when the 
destination still "exists". Clients' interpretation of the error as a deleted 
file would be incompatible. The router would need to partition the INodeID 
space among all the NNs. If a client persists an INodeID and expects it to 
remain valid (because that works for a single NN), that partitioning would need 
to account for every NN that had ever registered with the router. This might 
still be the better approach, since it has a much lower implementation cost and 
covers more cases, but it has tradeoffs.
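One way to "encode the namespace in the INodeID" is to reserve the high bits of the 64-bit ID for a namespace index. This is an illustrative sketch of the idea, not the actual router design; the 8-bit split is an arbitrary assumption:

```java
// Illustrative only: partition a 64-bit inode ID space among namenodes by
// reserving the high 8 bits for a namespace index. The bit split is an
// assumption for demonstration, not a proposed layout.
public class InodeIdPartition {
    static final int NS_BITS = 8; // high 8 bits identify the namespace

    // Assumes localInodeId fits in the remaining 56 bits.
    static long encode(int nsIndex, long localInodeId) {
        return ((long) nsIndex << (64 - NS_BITS)) | localInodeId;
    }

    static int namespaceOf(long globalId) {
        return (int) (globalId >>> (64 - NS_BITS));
    }

    static long localIdOf(long globalId) {
        return globalId & ((1L << (64 - NS_BITS)) - 1);
    }

    public static void main(String[] args) {
        long g = encode(3, 123L);                // inode 123 in namespace 3
        System.out.println(namespaceOf(g));      // 3
        System.out.println(localIdOf(g));        // 123
    }
}
```

The tradeoff described above shows up directly here: once an ID carries a namespace index, rebalancing a path to another namespace invalidates every persisted ID for it, even though the file still exists.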

bq. For now, it might be good to throw an unsupported exception if we get 
accesses to an inode path.
+1

> RBF: Support InodeID in the Router
> --
>
> Key: HDFS-13469
> URL: https://issues.apache.org/jira/browse/HDFS-13469
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Priority: Major
>
> The Namenode supports identifying files through inode identifiers.
> Currently the Router does not handle this properly, we need to add this 
> functionality.






[jira] [Commented] (HDFS-13311) RBF: TestRouterAdminCLI#testCreateInvalidEntry fails on Windows

2018-04-17 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441670#comment-16441670
 ] 

Chris Douglas commented on HDFS-13311:
--

bq. Chris Douglas, should I move this to HADOOP?
Sorry, missed this. In the future, renaming the JIRA to describe the change 
(and citing the failure it's fixing) would probably be easier to work with, but 
it's not critical.

> RBF: TestRouterAdminCLI#testCreateInvalidEntry fails on Windows
> ---
>
> Key: HDFS-13311
> URL: https://issues.apache.org/jira/browse/HDFS-13311
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
>  Labels: RBF, windows
> Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.3
>
> Attachments: HDFS-13311.000.patch
>
>
> The Windows runs show that TestRouterAdminCLI#testCreateInvalidEntry fail 
> with NPE:
> {code}
> [ERROR] 
> testCreateInvalidEntry(org.apache.hadoop.hdfs.server.federation.router.TestRouterAdminCLI)
>   Time elapsed: 0.008 s  <<< ERROR!
> java.lang.NullPointerException
> at 
> org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(GenericOptionsParser.java:529)
> at 
> org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:568)
> at 
> org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:174)
> at 
> org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:156)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouterAdminCLI.testCreateInvalidEntry(TestRouterAdminCLI.java:444)
> {code}






[jira] [Commented] (HDFS-13272) DataNodeHttpServer to have configurable HttpServer2 threads

2018-04-17 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441657#comment-16441657
 ] 

Chris Douglas commented on HDFS-13272:
--

Ping [~kihwal], [~xkrogen]

> DataNodeHttpServer to have configurable HttpServer2 threads
> ---
>
> Key: HDFS-13272
> URL: https://issues.apache.org/jira/browse/HDFS-13272
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
>
> In HDFS-7279, the Jetty server on the DataNode was hard-coded to use 10 
> threads. In addition to the possibility of this being too few threads, it is 
> much higher than necessary in resource constrained environments such as 
> MiniDFSCluster. To avoid compatibility issues, rather than using 
> {{HttpServer2#HTTP_MAX_THREADS}} directly, we can introduce a new 
> configuration for the DataNode's thread pool size.
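The shape of the proposed change is a standard config-with-default lookup. In this sketch the key name and the use of {{java.util.Properties}} are hypothetical placeholders standing in for Hadoop's {{Configuration}}; they are not the names chosen in HDFS-13272:

```java
// Hedged sketch: read the DataNode HTTP server thread count from a new,
// DataNode-specific key instead of hard-coding 10. The key name below is
// hypothetical, and Properties stands in for Hadoop's Configuration class.
import java.util.Properties;

public class DataNodeHttpThreadsSketch {
    static final String DFS_DATANODE_HTTP_MAX_THREADS_KEY =
        "dfs.datanode.http.max-threads";              // hypothetical key name
    static final int DFS_DATANODE_HTTP_MAX_THREADS_DEFAULT = 10; // current hard-coded value

    static int getMaxThreads(Properties conf) {
        String v = conf.getProperty(DFS_DATANODE_HTTP_MAX_THREADS_KEY);
        return v == null ? DFS_DATANODE_HTTP_MAX_THREADS_DEFAULT
                         : Integer.parseInt(v);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(getMaxThreads(conf)); // 10: unset, falls back to default

        // A resource-constrained environment (e.g., MiniDFSCluster) dials it down:
        conf.setProperty(DFS_DATANODE_HTTP_MAX_THREADS_KEY, "2");
        System.out.println(getMaxThreads(conf)); // 2
    }
}
```

Keeping the old value as the default is what preserves compatibility: existing deployments see no behavior change unless they set the new key.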






[jira] [Commented] (HDFS-13469) RBF: Support InodeID in the Router

2018-04-17 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441564#comment-16441564
 ] 

Chris Douglas commented on HDFS-13469:
--

HDFS-7878 didn't add an identifier to the payload that could be used for this, 
though [~jingzhao] 
[suggested|https://issues.apache.org/jira/browse/HDFS-7878?focusedCommentId=15468143=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15468143]
 the name service ID. It would be a good sanity check, even in clusters that 
don't run a router.







[jira] [Commented] (HDFS-10867) [PROVIDED Phase 2] Block Bit Field Allocation of Provided Storage

2018-03-29 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419934#comment-16419934
 ] 

Chris Douglas commented on HDFS-10867:
--

bq. Don't forget about legacy negative block ids that already aren't compatible 
with EC assumptions.  You can't rob the upper bits of a block id.

*nod* we saw HDFS-13350. Each partition increases the chances of a collision 
with a legacy ID (HDFS-898, particularly [~szetszwo]'s analysis). I'd stop 
short of considering partitioning as "robbing" the block ID namespace, though. 
We need a fix for EC blocks, and if our solution doesn't also cover this case, 
then it's probably incomplete. Since these features are only in 3.x, I would be 
comfortable adding a step to the upgrade guide that instructs users to run fsck 
and re-copy files with legacy block IDs (or never use these features). Of 
course, a tool or comprehensive solution would be better.

When crossing a major version, we can stop supporting some cases. Requiring 
work from users to eliminate legacy block IDs as an ongoing concern... seems 
like an OK tradeoff, to me.

> [PROVIDED Phase 2] Block Bit Field Allocation of Provided Storage
> -
>
> Key: HDFS-10867
> URL: https://issues.apache.org/jira/browse/HDFS-10867
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Ewan Higgs
>Priority: Major
> Attachments: Block Bit Field Allocation of Provided Storage.pdf
>
>
> We wish to design and implement the following related features for provided 
> storage:
> # Dynamic mounting of provided storage within a Namenode (mount, unmount)
> # Mount multiple provided storage systems on a single Namenode.
> # Support updates to the provided storage system without having to regenerate 
> an fsimg.
> A mount in the namespace addresses a corresponding set of block data. When 
> unmounted, any block data associated with the mount becomes invalid and 
> (eventually) unaddressable in HDFS. As with erasure-coded blocks, efficient 
> unmounting requires that all blocks with that attribute be identifiable by 
> the block management layer
> In this subtask, we focus on changes and conventions to the block management 
> layer. Namespace operations are covered in a separate subtask.






[jira] [Commented] (HDFS-13330) ShortCircuitCache#fetchOrCreate never retries

2018-03-28 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418209#comment-16418209
 ] 

Chris Douglas commented on HDFS-13330:
--

bq. Could you also fix the findbugs error?
In this case, the findbugs warning is an artifact of the infinite loop. It 
infers (correctly) that the only way to exit the loop is for {{info}} to be 
assigned a non-null value. As you point out, it should also exit the loop 
before assigning {{info}} when {{replicaInfoMap.get(key)}} returns null, as 
[above|https://issues.apache.org/jira/browse/HDFS-13330?focusedCommentId=16412164=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16412164].

Since this retry code has never been enabled, this functionality is either 
unused, reliable "enough", or some other layer implements the retry logic. 
Changing this to a {{for}} loop with a low, fixed number of retries (e.g., 2 or 3 
at the most) is probably sufficient. We'd like to know if/how it's used in 
applications, but absent that analysis, fixing it to work "as designed" is 
probably the best we're going to do.
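A bounded-retry rewrite along those lines could look like the following sketch. The types here are simplified stand-ins for the real {{ShortCircuitCache}} internals (no {{Waitable}}, no locking), so this shows only the loop structure, not the actual patch:

```java
// Sketch of the bounded-retry fix: replace do { ... } while (false) with a
// for loop capped at a small retry count. Types are simplified stand-ins
// for ShortCircuitCache internals, for illustration only.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RetrySketch {
    static class RetriableException extends Exception {}

    static final int MAX_RETRIES = 3;
    static final Map<String, String> replicaInfoMap = new ConcurrentHashMap<>();
    static int attempts = 0;

    // Simulated fetch: fails once with RetriableException, then succeeds.
    static String fetch(String key) throws RetriableException {
        if (attempts++ == 0) {
            throw new RetriableException();
        }
        return replicaInfoMap.get(key);
    }

    static String fetchOrCreate(String key) {
        for (int i = 0; i < MAX_RETRIES; i++) {
            String waitable = replicaInfoMap.get(key);
            if (waitable == null) {
                return null; // nothing cached: exit instead of retrying
            }
            try {
                return fetch(key); // success ends the loop
            } catch (RetriableException e) {
                // fall through and retry, up to MAX_RETRIES attempts
            }
        }
        return null; // retries exhausted
    }

    public static void main(String[] args) {
        replicaInfoMap.put("block-1", "replica-info");
        System.out.println(fetchOrCreate("block-1")); // recovers after one retry
    }
}
```

Unlike the original, both exit conditions are explicit: a null map entry returns immediately, and {{RetriableException}} loops at most {{MAX_RETRIES}} times, which also resolves the findbugs complaint about the only loop exit being a non-null assignment.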

bq. In addition, a test case for this method is greatly appreciated.
+1 The retry semantics of the committed code were clearly untested.

> ShortCircuitCache#fetchOrCreate never retries
> -
>
> Key: HDFS-13330
> URL: https://issues.apache.org/jira/browse/HDFS-13330
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Major
> Attachments: HDFS-13330.001.patch, HDFS-13330.002.patch
>
>
> The following do .. while (false) loop seems useless to me. The code intended 
> to retry, but it never worked. Let's fix it.
> {code:java:title=ShortCircuitCache#fetchOrCreate}
> ShortCircuitReplicaInfo info = null;
> do {
>   if (closed) {
> LOG.trace("{}: can't fethchOrCreate {} because the cache is closed.",
> this, key);
> return null;
>   }
> Waitable<ShortCircuitReplicaInfo> waitable = replicaInfoMap.get(key);
>   if (waitable != null) {
> try {
>   info = fetch(key, waitable);
> } catch (RetriableException e) {
>   LOG.debug("{}: retrying {}", this, e.getMessage());
> }
>   }
> } while (false);{code}






[jira] [Commented] (HDFS-13272) DataNodeHttpServer to have configurable HttpServer2 threads

2018-03-27 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416513#comment-16416513
 ] 

Chris Douglas commented on HDFS-13272:
--

bq. Rather than making it configurable, it might be better to simply reduce it. 
This makes sense to branch-3.x. For older branches, there still might be hftp 
users
Reducing the default makes sense for both 3.x and branch-2. [~kihwal], to be 
clear, should it be configurable on branch-2 so it can be raised for hftp 
users, but dropped to the minimum for 3.x?







[jira] [Commented] (HDFS-13336) Test cases of TestWriteToReplica failed in windows

2018-03-27 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416372#comment-16416372
 ] 

Chris Douglas commented on HDFS-13336:
--

bq. a generic fix could be made for 
org.apache.hadoop.hdfs.MiniDFSCluster#determineDfsBaseDir, if by default a 
randomized temp path is returned

This sounds more sustainable to me, not just for Windows compatibility, but to 
avoid tests that assume a fixed, default base dir.

> Test cases of TestWriteToReplica failed in windows
> --
>
> Key: HDFS-13336
> URL: https://issues.apache.org/jira/browse/HDFS-13336
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiao Liang
>Assignee: Xiao Liang
>Priority: Major
>  Labels: windows
> Attachments: HDFS-13336.000.patch, HDFS-13336.001.patch, 
> HDFS-13336.002.patch
>
>
> Test cases of TestWriteToReplica failed in windows with errors like:
> h4. Error Details
> Could not fully delete 
> F:\short\hadoop-trunk-win\s\hadoop-hdfs-project\hadoop-hdfs\target\test\data\1\dfs\name-0-1
> h4. Stack Trace
> java.io.IOException: Could not fully delete 
> F:\short\hadoop-trunk-win\s\hadoop-hdfs-project\hadoop-hdfs\target\test\data\1\dfs\name-0-1
>  at 
> org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1011)
>  at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:932)
>  at 
> org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:864)
>  at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:497) at 
> org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:456) 
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestWriteToReplica.testAppend(TestWriteToReplica.java:89)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:309) at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369)
>  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275)
>  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239)
>  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160)
>  at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373)
>  at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334)
>  at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119) 
> at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407)






[jira] [Commented] (HDFS-8979) Clean up checkstyle warnings in hadoop-hdfs-client module

2018-03-26 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414562#comment-16414562
 ] 

Chris Douglas commented on HDFS-8979:
-

bq. The patch did not break HDFS-13330, if you understand the problem of that 
retry logic correctly
It did not break the retry loop, but the fix for the checkstyle warning missed 
the actual problem. Checkstyle warnings aren't only cosmetic. By resolving 
HDFS-8979 and HDFS-13330 this way, we would remove the checkstyle signal where 
we had a real bug. Bulk fixes make it difficult for reviewers to find cases like 
HDFS-13330 that deserve closer attention.

bq. Unless someone volunteers to review such huge patch again, I am inclined to 
revert this patch from all branches.
We're more likely to introduce new bugs by reverting this. It's a large patch, 
covering a large swath of the codebase, committed more than two years ago. Most 
of the fixes are correct.

If we're going to take a lesson from this, then we should probably avoid 
backporting bulk fixes from static analysis tools. While it can make subsequent 
backports smoother by keeping the code similar, the analysis only applies to 
the branch on which the tool is run; e.g., if it can prove something dead in 
trunk, that doesn't mean it's dead in the backported code.

> Clean up checkstyle warnings in hadoop-hdfs-client module
> -
>
> Key: HDFS-8979
> URL: https://issues.apache.org/jira/browse/HDFS-8979
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Major
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: HDFS-8979.000.patch, HDFS-8979.001.patch, 
> HDFS-8979.002.patch
>
>
> This jira tracks the effort of cleaning up checkstyle warnings in 
> {{hadoop-hdfs-client}} module.






[jira] [Commented] (HDFS-13330) Clean up dead code

2018-03-26 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414493#comment-16414493
 ] 

Chris Douglas commented on HDFS-13330:
--

bq. Instead of fixing this jira, I am more inclined to revert HDFS-8979 which 
already broke 2 other functionalities.
To be fair to HDFS-8979, it didn't break the retry functionality. The retry 
loop never worked. Static analysis tools are identifying dead code, and these 
JIRAs are removing it instead of asking why it's dead. In this case, the code 
assumed that in do/while loops, the loop expression is not evaluated after a 
{{continue}}, which is [not the 
case|https://docs.oracle.com/javase/specs/jls/se8/html/jls-14.html#jls-14.13.1].
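The JLS reference above is easy to check empirically: in a do/while loop, {{continue}} transfers control to the loop-condition evaluation, so with {{while (false)}} a {{continue}} exits the loop instead of retrying. A minimal, self-contained sketch (the counter and threshold are illustrative, not from the HDFS code):

```java
public class ContinueInDoWhile {
    public static void main(String[] args) {
        int attempts = 0;
        do {
            attempts++;
            if (attempts < 3) {
                // Per JLS 14.13.1, continue jumps to the evaluation of the
                // loop condition, not back to the top of the body. Since the
                // condition is the constant false, the loop exits here rather
                // than retrying as the author apparently intended.
                continue;
            }
        } while (false);
        System.out.println("attempts = " + attempts); // prints "attempts = 1"
    }
}
```

Running this prints {{attempts = 1}}, confirming the body executes exactly once no matter how many times {{continue}} is reached.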

Whether we fix the retry logic in this JIRA or we open a new one, I think we 
can agree that simply removing the dead code isn't making meaningful progress.

> Clean up dead code
> --
>
> Key: HDFS-13330
> URL: https://issues.apache.org/jira/browse/HDFS-13330
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HDFS-13330.001.patch
>
>
> The following do .. while(false) loop seems useless to me.
> {code:java:title=ShortCircuitCache#fetchOrCreate}
> ShortCircuitReplicaInfo info = null;
> do {
>   if (closed) {
> LOG.trace("{}: can't fethchOrCreate {} because the cache is closed.",
> this, key);
> return null;
>   }
>   Waitable waitable = replicaInfoMap.get(key);
>   if (waitable != null) {
> try {
>   info = fetch(key, waitable);
> } catch (RetriableException e) {
>   LOG.debug("{}: retrying {}", this, e.getMessage());
> }
>   }
> } while (false);{code}






[jira] [Commented] (HDFS-13330) Clean up dead code

2018-03-23 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412165#comment-16412165
 ] 

Chris Douglas commented on HDFS-13330:
--

Which is to say: we should either remove the retry logic entirely or fix it. 
The dead code is a symptom of broken functionality.

> Clean up dead code
> --
>
> Key: HDFS-13330
> URL: https://issues.apache.org/jira/browse/HDFS-13330
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Trivial
>  Labels: newbie
> Attachments: HDFS-13330.001.patch
>
>
> The following do .. while(false) loop seems useless to me.
> {code:java}
> ShortCircuitReplicaInfo info = null;
> do {
>   if (closed) {
> LOG.trace("{}: can't fethchOrCreate {} because the cache is closed.",
> this, key);
> return null;
>   }
>   Waitable waitable = replicaInfoMap.get(key);
>   if (waitable != null) {
> try {
>   info = fetch(key, waitable);
> } catch (RetriableException e) {
>   LOG.debug("{}: retrying {}", this, e.getMessage());
> }
>   }
> } while (false);{code}






[jira] [Commented] (HDFS-13330) Clean up dead code

2018-03-23 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412164#comment-16412164
 ] 

Chris Douglas commented on HDFS-13330:
--

It looks like the original intent was a retry loop, to repeat operations that 
throw a {{RetriableException}}. The {{continue}} was removed in HDFS-8979, 
which makes sense, because it [doesn't work as 
intended|https://docs.oracle.com/javase/specs/jls/se8/html/jls-14.html#jls-14.13.1].
 The intended logic appears to be something like:
{code:java}
  while (true) {
if (closed) {
  LOG.trace("{}: can't fethchOrCreate {} because the cache is closed.",
  this, key);
  return null;
}
Waitable waitable = replicaInfoMap.get(key);
if (waitable == null) {
  break;
}
try {
  info = fetch(key, waitable);
} catch (RetriableException e) {
  LOG.debug("{}: retrying {}", this, e.getMessage());
  continue;
}
break;
  }
{code}

> Clean up dead code
> --
>
> Key: HDFS-13330
> URL: https://issues.apache.org/jira/browse/HDFS-13330
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Trivial
>  Labels: newbie
> Attachments: HDFS-13330.001.patch
>
>
> The following do .. while(false) loop seems useless to me.
> {code:java}
> ShortCircuitReplicaInfo info = null;
> do {
>   if (closed) {
> LOG.trace("{}: can't fethchOrCreate {} because the cache is closed.",
> this, key);
> return null;
>   }
>   Waitable waitable = replicaInfoMap.get(key);
>   if (waitable != null) {
> try {
>   info = fetch(key, waitable);
> } catch (RetriableException e) {
>   LOG.debug("{}: retrying {}", this, e.getMessage());
> }
>   }
> } while (false);{code}






[jira] [Commented] (HDFS-13330) Clean up dead code

2018-03-22 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410304#comment-16410304
 ] 

Chris Douglas commented on HDFS-13330:
--

Haven't looked at the context, but is there a missing {{continue}} in the catch?

> Clean up dead code
> --
>
> Key: HDFS-13330
> URL: https://issues.apache.org/jira/browse/HDFS-13330
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Priority: Trivial
>  Labels: newbie
>
> The following do .. while(false) loop seems useless to me.
> {code:java}
> ShortCircuitReplicaInfo info = null;
> do {
>   if (closed) {
> LOG.trace("{}: can't fethchOrCreate {} because the cache is closed.",
> this, key);
> return null;
>   }
>   Waitable waitable = replicaInfoMap.get(key);
>   if (waitable != null) {
> try {
>   info = fetch(key, waitable);
> } catch (RetriableException e) {
>   LOG.debug("{}: retrying {}", this, e.getMessage());
> }
>   }
> } while (false);{code}






[jira] [Commented] (HDFS-13265) MiniDFSCluster should set reasonable defaults to reduce resource consumption

2018-03-19 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1640#comment-1640
 ] 

Chris Douglas commented on HDFS-13265:
--

I'm +1 on the patch, FWIW. If HADOOP-15311 will be in branch-2, then we should 
try to apply these in a consistent order and backport it first. If not, then 
I'll commit these as they are.

> MiniDFSCluster should set reasonable defaults to reduce resource consumption
> 
>
> Key: HDFS-13265
> URL: https://issues.apache.org/jira/browse/HDFS-13265
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, test
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-13265-branch-2.000.patch, 
> HDFS-13265-branch-2.000.patch, HDFS-13265.000.patch, 
> TestMiniDFSClusterThreads.java
>
>
> MiniDFSCluster takes its defaults from {{DFSConfigKeys}} defaults, but many 
> of these are not suitable for a unit test environment. For example, the 
> default handler thread count of 10 is definitely more than necessary for 
> (almost?) any unit test. We should set reasonable, lower defaults unless a 
> test specifically requires more.






[jira] [Commented] (HDFS-13296) GenericTestUtils#getTempPath and GenericTestUtils#getRandomizedTestDir generate paths with drive letter in Windows, and fail webhdfs related test cases

2018-03-16 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403121#comment-16403121
 ] 

Chris Douglas commented on HDFS-13296:
--

bq. Chris Douglas, Subru Krishnan, any idea on how we can trigger HDFS unit 
tests and not only Commons?
Sorry, not that I know of. Finding unit tests that use this utility class and 
running those manually is probably a sufficient spot check. We can fix 
regressions if they come up.

> GenericTestUtils#getTempPath and GenericTestUtils#getRandomizedTestDir 
> generate paths with drive letter in Windows, and fail webhdfs related test 
> cases
> ---
>
> Key: HDFS-13296
> URL: https://issues.apache.org/jira/browse/HDFS-13296
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiao Liang
>Assignee: Xiao Liang
>Priority: Major
>  Labels: windows
> Attachments: HDFS-13296.000.patch
>
>
> In GenericTestUtils#getRandomizedTestDir, getAbsoluteFile is called and will 
> added drive letter to the path in windows, some test cases use the generated 
> path to send webhdfs request, which will fail due to the drive letter in the 
> URI like: "webhdfs://127.0.0.1:18334/D:/target/test/data/vUqZkOrBZa/test"
> GenericTestUtils#getTempPath has the similar issue in Windows.






[jira] [Commented] (HDFS-13265) MiniDFSCluster should set reasonable defaults to reduce resource consumption

2018-03-15 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401256#comment-16401256
 ] 

Chris Douglas commented on HDFS-13265:
--

Excellent, thanks [~xkrogen]. Skimming the commit log, does HADOOP-13597 mean 
we should not backport HADOOP-15311 to branch-2 before committing this?

> MiniDFSCluster should set reasonable defaults to reduce resource consumption
> 
>
> Key: HDFS-13265
> URL: https://issues.apache.org/jira/browse/HDFS-13265
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, test
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-13265-branch-2.000.patch, 
> HDFS-13265-branch-2.000.patch, HDFS-13265.000.patch, 
> TestMiniDFSClusterThreads.java
>
>
> MiniDFSCluster takes its defaults from {{DFSConfigKeys}} defaults, but many 
> of these are not suitable for a unit test environment. For example, the 
> default handler thread count of 10 is definitely more than necessary for 
> (almost?) any unit test. We should set reasonable, lower defaults unless a 
> test specifically requires more.






[jira] [Commented] (HDFS-11600) Refactor TestDFSStripedOutputStreamWithFailure test classes

2018-03-15 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401207#comment-16401207
 ] 

Chris Douglas commented on HDFS-11600:
--

Thanks, [~Sammi]!

> Refactor TestDFSStripedOutputStreamWithFailure test classes
> ---
>
> Key: HDFS-11600
> URL: https://issues.apache.org/jira/browse/HDFS-11600
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 3.0.0-alpha2
>Reporter: Andrew Wang
>Assignee: SammiChen
>Priority: Minor
> Fix For: 3.1.0, 3.0.2
>
> Attachments: HDFS-11600-1.patch, HDFS-11600.002.patch, 
> HDFS-11600.003.patch, HDFS-11600.004.patch, HDFS-11600.005.patch, 
> HDFS-11600.006.patch, HDFS-11600.007.patch
>
>
> TestDFSStripedOutputStreamWithFailure has a great number of subclasses. The 
> tests are parameterized based on the name of these subclasses.
> Seems like we could parameterize these tests with JUnit and then not need all 
> these separate test classes.
> Another note, the tests will randomly return instead of running the test. 
> Using {{Assume}} instead would make it more clear in the test output that 
> these tests were skipped.






[jira] [Commented] (HDFS-13232) RBF: ConnectionPool should return first usable connection

2018-03-15 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401200#comment-16401200
 ] 

Chris Douglas commented on HDFS-13232:
--

bq. I committed HDFS-13230 but I messed up the message and committed it as 
HDFS-13232, any action here?
Sorry for the delay. Other than reverting and recommitting, no. We'd need to 
file a ticket with INFRA to unlock the branch and rewrite history. Probably not 
worth it.

> RBF: ConnectionPool should return first usable connection
> -
>
> Key: HDFS-13232
> URL: https://issues.apache.org/jira/browse/HDFS-13232
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Wei Yan
>Assignee: Ekanth S
>Priority: Minor
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.2, 3.2.0
>
> Attachments: HDFS-13232.001.patch, HDFS-13232.002.patch, 
> HDFS-13232.003.patch
>
>
> In current ConnectionPool.getConnection(), it will return the first active 
> connection:
> {code:java}
> for (int i = 0; i < size; i++) {
>   int index = (threadIndex + i) % size;
>   conn = tmpConnections.get(index);
>   if (conn != null && !conn.isUsable()) {
>     return conn;
>   }
> }
> {code}
> Here "!conn.isUsable()" should be "conn.isUsable()".
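The one-character fix described above, dropping the negation so the scan returns the first *usable* connection, can be sketched as follows. The {{Conn}} class and method shapes here are illustrative stand-ins, not the actual RBF {{ConnectionPool}} API:

```java
import java.util.Arrays;
import java.util.List;

public class ConnectionScan {
    // Hypothetical stand-in for a pooled connection; only the
    // usability flag matters for this sketch.
    static class Conn {
        final boolean usable;
        Conn(boolean usable) { this.usable = usable; }
        boolean isUsable() { return usable; }
    }

    /** Round-robin scan returning the first usable connection, or null. */
    static Conn getConnection(List<Conn> pool, int threadIndex) {
        int size = pool.size();
        for (int i = 0; i < size; i++) {
            int index = (threadIndex + i) % size;
            Conn conn = pool.get(index);
            // The buggy code tested !conn.isUsable(), returning the first
            // *unusable* connection; dropping the negation is the fix.
            if (conn != null && conn.isUsable()) {
                return conn;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Conn> pool = Arrays.asList(new Conn(false), new Conn(true));
        // Starting at index 0, the scan skips the busy connection at
        // index 0 and returns the usable one at index 1.
        System.out.println(getConnection(pool, 0).isUsable()); // prints "true"
    }
}
```

The {{threadIndex}} offset preserves the round-robin distribution across callers while the modulo wrap-around guarantees every slot is still visited once.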






[jira] [Commented] (HDFS-12422) Replace DataNode in Pipeline when waiting for Last Packet fails

2018-03-15 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401170#comment-16401170
 ] 

Chris Douglas commented on HDFS-12422:
--

bq. do you know anybody fit for reviewing this?
[~shv], if he has cycles.

> Replace DataNode in Pipeline when waiting for Last Packet fails
> ---
>
> Key: HDFS-12422
> URL: https://issues.apache.org/jira/browse/HDFS-12422
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
>  Labels: hdfs
> Attachments: HDFS-12422.001.patch, HDFS-12422.002.patch
>
>
> # Create a file with replicationFactor = 4, minReplicas = 2
> # Fail waiting for the last packet, followed by 2 exceptions when recovering 
> the leftover pipeline
> # The leftover pipeline will only have one DN and NN will never close such 
> block, resulting in failure to write
> The block will stay there forever, unable to be replicated, ultimately going 
> missing.






[jira] [Commented] (HDFS-12780) Fix spelling mistake in DistCpUtils.java

2018-03-13 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16397468#comment-16397468
 ] 

Chris Douglas commented on HDFS-12780:
--

Failure is unrelated.
{noformat}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-deploy-plugin:2.8.1:deploy (default-deploy) on 
project hadoop-cloud-storage-project: Failed to deploy artifacts: Could not 
transfer artifact 
org.apache.hadoop:hadoop-yarn-server-timeline-pluginstorage:jar:3.2.0-20180313.183440-186
 from/to apache.snapshots.https 
(https://repository.apache.org/content/repositories/snapshots): 
repository.apache.org: No address associated with hostname -> [Help 1]
{noformat}

> Fix spelling mistake in DistCpUtils.java
> 
>
> Key: HDFS-12780
> URL: https://issues.apache.org/jira/browse/HDFS-12780
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0-beta1
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
>  Labels: patch
> Fix For: 3.2.0
>
> Attachments: HDFS-12780.patch
>
>
> We found a spelling mistake in DistCpUtils.java.  "* If checksums's can't be 
> retrieved," should be " * If checksums can't be retrieved,"






[jira] [Updated] (HDFS-12780) Fix spelling mistake in DistCpUtils.java

2018-03-13 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-12780:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.2.0
   Status: Resolved  (was: Patch Available)

+1 I committed this. Thanks, [~jiangjianfei]

> Fix spelling mistake in DistCpUtils.java
> 
>
> Key: HDFS-12780
> URL: https://issues.apache.org/jira/browse/HDFS-12780
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0-beta1
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
>  Labels: patch
> Fix For: 3.2.0
>
> Attachments: HDFS-12780.patch
>
>
> We found a spelling mistake in DistCpUtils.java.  "* If checksums's can't be 
> retrieved," should be " * If checksums can't be retrieved,"






[jira] [Commented] (HDFS-13273) Fix compilation issue in trunk

2018-03-13 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16397356#comment-16397356
 ] 

Chris Douglas commented on HDFS-13273:
--

Er... sorry, mistook this for HDFS-12677

> Fix compilation issue in trunk
> --
>
> Key: HDFS-13273
> URL: https://issues.apache.org/jira/browse/HDFS-13273
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDFS-13273.00.patch
>
>
> [ERROR] 
> /home/jenkins/jenkins-slave/workspace/Hadoop-trunk-Commit/source/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileStatusWithECPolicy.java:[40,8]
>  class TestFileStatusWithDefaultECPolicy is public, should be declared in a 
> file named TestFileStatusWithDefaultECPolicy.java






[jira] [Commented] (HDFS-13273) Fix compilation issue in trunk

2018-03-13 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16397336#comment-16397336
 ] 

Chris Douglas commented on HDFS-13273:
--

+1 Looks like my mistake applying the patch. I ran the tests prior to commit...

> Fix compilation issue in trunk
> --
>
> Key: HDFS-13273
> URL: https://issues.apache.org/jira/browse/HDFS-13273
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDFS-13273.00.patch
>
>
> [ERROR] 
> /home/jenkins/jenkins-slave/workspace/Hadoop-trunk-Commit/source/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileStatusWithECPolicy.java:[40,8]
>  class TestFileStatusWithDefaultECPolicy is public, should be declared in a 
> file named TestFileStatusWithDefaultECPolicy.java






[jira] [Updated] (HDFS-13271) WebHDFS: Add constructor in SnapshottableDirectoryStatus with HdfsFileStatus as argument

2018-03-13 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-13271:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.1.0
   Status: Resolved  (was: Patch Available)

+1 I committed this. Thanks, [~ljain]

> WebHDFS: Add constructor in SnapshottableDirectoryStatus with HdfsFileStatus 
> as argument
> 
>
> Key: HDFS-13271
> URL: https://issues.apache.org/jira/browse/HDFS-13271
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HDFS-13271.001.patch, HDFS-13271.002.patch
>
>
> This jira aims to add a constructor in SnapshottableDirectoryStatus which 
> takes HdfsFileStatus as a argument. This constructor will be used in 
> JsonUtilClient#toSnapshottableDirectoryStatus for creating a 
> SnapshottableDirectoryStatus object.






[jira] [Commented] (HDFS-13141) WebHDFS: Add support for getting snasphottable directory list

2018-03-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16396507#comment-16396507
 ] 

Chris Douglas commented on HDFS-13141:
--

bq. After check HdfsFileStatus, it seems we don't have a getter for flags for 
HdfsFileStatus. Is this something we miss from HDFS-12681?
It was deliberate. The flags support existing methods on {{HdfsFileStatus}} and 
{{FileStatus}}.

Conversion between the enums directly through the type implies a tighter 
binding than I hoped we'd maintain going forward. While {{FileStatus}} should 
expose only metadata common to most implementations, {{HdfsFileStatus}} can 
specialize on HDFS.

Instead of extracting fields from one {{HdfsFileStatus}} to construct another, 
could {{SnapshottableDirectoryStatus}} just take the original struct, to avoid 
the {{convert}} method?

> WebHDFS: Add support for getting snasphottable directory list
> -
>
> Key: HDFS-13141
> URL: https://issues.apache.org/jira/browse/HDFS-13141
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: webhdfs
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Fix For: 3.1.0, 3.2.0
>
> Attachments: HDFS-13141.001.patch, HDFS-13141.002.patch, 
> HDFS-13141.003.patch
>
>
> This Jira aims to implement get snapshottable directory list operation for 
> webHdfs filesystem.






[jira] [Updated] (HDFS-12677) Extend TestReconstructStripedFile with a random EC policy

2018-03-12 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-12677:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.1.0
   Status: Resolved  (was: Patch Available)

I committed this. Thanks [~tasanuma0829] for the patch and [~ekanth] for the 
revision.

[~ekanth] I apologize, the commit message should have cited both of you.

> Extend TestReconstructStripedFile with a random EC policy
> -
>
> Key: HDFS-12677
> URL: https://issues.apache.org/jira/browse/HDFS-12677
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding, test
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HDFS-12677.002.patch, HDFS-12677.1.patch
>
>







[jira] [Updated] (HDFS-10388) DistributedFileSystem#getStatus returns a misleading FsStatus

2018-03-12 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-10388:
-
Resolution: Invalid
Status: Resolved  (was: Patch Available)

This appears to be working as designed, since HDFS doesn't support partitions. 
I'd expect this is also correct in RBF scenarios. [~elgoiri], do you want to 
check? [~linyiqun], please reopen if this remains an issue.

> DistributedFileSystem#getStatus returns a misleading FsStatus
> -
>
> Key: HDFS-10388
> URL: https://issues.apache.org/jira/browse/HDFS-10388
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Major
> Attachments: HDFS-10388.001.patch
>
>
> The method {{DistributedFileSystem#getStatus}} returns the dfs's disk status.
> {code}
>public FsStatus getStatus(Path p) throws IOException {
>  statistics.incrementReadOps(1);
> return dfs.getDiskStatus();
>}
> {code}
> So the param path has no meaning here, and the returned object will mislead 
> users of this method. I looked into the code: only when the file system has 
> multiple partitions will the use and capacity of the partition pointed to by 
> the specified path be reflected. For example, in the subclass 
> {{RawLocalFileSystem}}, it is returned correctly.
> We should return a meaningless FsStatus here (like new FsStatus(0, 0, 0)) 
> and indicate that the invoked method isn't available in 
> {{DistributedFileSystem}}.






[jira] [Updated] (HDFS-10923) Make InstrumentedLock require ReentrantLock

2018-03-12 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-10923:
-
Status: Open  (was: Patch Available)

This no longer applies. [~arpitagarwal], could you regenerate the patch?

> Make InstrumentedLock require ReentrantLock
> ---
>
> Key: HDFS-10923
> URL: https://issues.apache.org/jira/browse/HDFS-10923
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
>Priority: Major
> Attachments: HDFS-10923.01.patch, HDFS-10923.02.patch, 
> HDFS-10923.03.patch
>
>
> Make InstrumentedLock use ReentrantLock instead of Lock, so nested 
> acquire/release calls can be instrumented correctly.






[jira] [Updated] (HDFS-11168) Bump Netty 4 version

2018-03-12 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-11168:
-
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Resolving as duplicate, but the lineage is confusing. HDFS-11498 dropped the 
version from 4.1.0.Beta5 to 4.0.23.Final, then it was bumped to 4.0.52.Final in 
HADOOP-14910. If we revisit the version of netty we're shipping, it should 
probably be in a new JIRA.

> Bump Netty 4 version
> 
>
> Key: HDFS-11168
> URL: https://issues.apache.org/jira/browse/HDFS-11168
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Major
> Attachments: h11168_20161122.patch, h11168_20161123.patch, 
> h11168_20161123b.patch
>
>
> The current Netty 4 version is 4.1.0.Beta5.  We should bump it to a non-beta 
> version.






[jira] [Resolved] (HDFS-13216) HDFS

2018-03-02 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas resolved HDFS-13216.
--
Resolution: Invalid

> HDFS
> 
>
> Key: HDFS-13216
> URL: https://issues.apache.org/jira/browse/HDFS-13216
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: mster
>Priority: Trivial
>  Labels: INodesInPath
>







[jira] [Reopened] (HDFS-13216) HDFS

2018-03-02 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas reopened HDFS-13216:
--

> HDFS
> 
>
> Key: HDFS-13216
> URL: https://issues.apache.org/jira/browse/HDFS-13216
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: mster
>Priority: Trivial
>  Labels: INodesInPath
>







[jira] [Commented] (HDFS-13215) RBF: Move Router to its own module

2018-03-01 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16382888#comment-16382888
 ] 

Chris Douglas commented on HDFS-13215:
--

Breaking server-side code into multiple modules is less useful. IIRC, RBF is 
mostly in its own package, and since we upgraded surefire it's easier to run 
subsets of tests using its regex syntax. But sure, it's mostly stand-alone, and 
might as well live in its own package. If tools like the rebalancer take a 
dependency on this package to customize their behavior in federated 
environments, those may have to move into a separate package to avoid the 
circular dependency.

> RBF: Move Router to its own module
> --
>
> Key: HDFS-13215
> URL: https://issues.apache.org/jira/browse/HDFS-13215
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Priority: Major
>
> We are splitting the HDFS client code base and potentially Router-based 
> Federation is also independent enough to be in its own package.






[jira] [Commented] (HDFS-13119) RBF: Manage unavailable clusters

2018-02-20 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370517#comment-16370517
 ] 

Chris Douglas commented on HDFS-13119:
--

bq. As this is technically a bug, I'd like to push it for 2.9.1 and 3.0.1 (or 
3.0.2).
There's a vote for 3.0.1 in progress, but you can contact the release manager 
([~eddyxu]) in case he rolls another RC.

bq. Any idea what's the current state with the branches? My guess is branch-2.9 
and branch-3.0.
AFAIK:
trunk -> 3.2
3.1.0 -> branch-3.1
3.0.2 -> branch-3.0
3.0.1 -> branch-3.0.1
2.10 -> branch-2
2.9.1 -> branch-2.9

> RBF: Manage unavailable clusters
> 
>
> Key: HDFS-13119
> URL: https://issues.apache.org/jira/browse/HDFS-13119
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Íñigo Goiri
>Assignee: Yiqun Lin
>Priority: Major
>  Labels: RBF
> Fix For: 3.1.0, 2.10.0, 3.2.0
>
> Attachments: HDFS-13119.001.patch, HDFS-13119.002.patch, 
> HDFS-13119.003.patch, HDFS-13119.004.patch, HDFS-13119.005.patch, 
> HDFS-13119.006.patch
>
>
> When a federated cluster has one of its subclusters down, operations that run 
> in every subcluster ({{RouterRpcClient#invokeAll()}}) may take all the RPC 
> connections.






[jira] [Commented] (HDFS-13123) RBF: Add a balancer tool to move data across subclusters

2018-02-09 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359032#comment-16359032
 ] 

Chris Douglas commented on HDFS-13123:
--

bq. hard linking across block pools as one option and even tiered storage
Yes, [~virajith] and I wrote a prototype of this with an intern for a similar 
project. Making it a proper transaction is complicated, but architecturally RBF 
is in the right place to coordinate this cleanly.

We added APIs to generate and attach an FSImage for a NN subtree. Attaching an 
image required reallocating not only the inodeIds but also the blockIds, which 
were hardlinked into a contiguous range in the destination blockId space. We 
didn't solve all the failover and edit log cases, but these seem tractable as 
long as the subtree is immutable. Without that assumption (which RBF can 
enforce/detect) thar be dragons.
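For illustration, the blockId reallocation step described above can be sketched as a pure remapping; the method name, the sample ids, and the contiguous-range allocation below are hypothetical (the real attach operation would also rewrite block metadata and hardlink the replica files):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: blocks from the source subtree are remapped into a
// contiguous, newly reserved range in the destination blockId space, so the
// hardlinked replicas stay addressable after the image is attached.
public class BlockIdRemap {
  public static Map<Long, Long> remap(long[] srcBlockIds, long destRangeStart) {
    Map<Long, Long> mapping = new HashMap<>();
    long next = destRangeStart;
    for (long src : srcBlockIds) {
      mapping.put(src, next++); // contiguous allocation in the destination
    }
    return mapping;
  }

  public static void main(String[] args) {
    Map<Long, Long> m = remap(new long[] {1073741825L, 1073741900L}, 2000000000L);
    System.out.println(m.get(1073741825L) + " " + m.get(1073741900L));
  }
}
```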

> RBF: Add a balancer tool to move data across subclusters 
> ---
>
> Key: HDFS-13123
> URL: https://issues.apache.org/jira/browse/HDFS-13123
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
>Priority: Major
> Attachments: HDFS Router-Based Federation Rebalancer.pdf
>
>
> Follow the discussion in HDFS-12615. This Jira is to track effort for 
> building a rebalancer tool, used by router-based federation to move data 
> among subclusters.






[jira] [Commented] (HDFS-13099) RBF: Refactor RBF relevant settings

2018-02-02 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351201#comment-16351201
 ] 

Chris Douglas commented on HDFS-13099:
--

It's not my favorite pattern. That said, 1) it's a project convention, 2) it's 
a consistent way to handle deprecation, and 3) unit tests aid consistency with 
{{-default.xml}} config files. If this is purely for documentation then it 
could assign the fields in {{DFSConfigKeys}}/{{DFSRouterKeys}} (or whatever) 
using the fields scoped to each class. That said, IIRC javadoc reorders the 
fields, so its virtues as documentation are pretty limited unless your field 
names follow a schema.

I'm ambivalent. The documentation for RBF and the impl docs are probably the 
better references, so I'd be tempted to leave it as-is.
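For reference, the project convention being weighed above looks roughly like the sketch below; the class name and the prefix literal are assumptions for illustration (the real constants live in {{DFSConfigKeys}} and the driver classes quoted in the description):

```java
// Illustrative sketch of the keys convention under discussion: driver-specific
// settings hoisted into a central, constants-only keys class, built from a
// shared prefix. The prefix literal here is an assumption, not the real value.
public final class RouterStoreKeys {
  public static final String FEDERATION_STORE_PREFIX =
      "dfs.federation.router.store.";             // assumed literal
  public static final String FEDERATION_STORE_ZK_DRIVER_PREFIX =
      FEDERATION_STORE_PREFIX + "driver.zk.";
  public static final String FEDERATION_STORE_ZK_PARENT_PATH =
      FEDERATION_STORE_ZK_DRIVER_PREFIX + "parent-path";
  public static final String FEDERATION_STORE_ZK_PARENT_PATH_DEFAULT =
      "/hdfs-federation";

  private RouterStoreKeys() {} // constants only, never instantiated
}
```

Centralizing the keys this way is what lets a unit test diff the constants against {{hdfs-default.xml}} for consistency.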

> RBF: Refactor RBF relevant settings
> ---
>
> Key: HDFS-13099
> URL: https://issues.apache.org/jira/browse/HDFS-13099
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 3.0.0
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
>
> Currently the State Store Driver-relevant settings are written only in its 
> implementation classes.
> {noformat}
> public class StateStoreZooKeeperImpl extends StateStoreSerializableImpl {
> ...
>   /** Configuration keys. */
>   public static final String FEDERATION_STORE_ZK_DRIVER_PREFIX =
>   DFSConfigKeys.FEDERATION_STORE_PREFIX + "driver.zk.";
>   public static final String FEDERATION_STORE_ZK_PARENT_PATH =
>   FEDERATION_STORE_ZK_DRIVER_PREFIX + "parent-path";
>   public static final String FEDERATION_STORE_ZK_PARENT_PATH_DEFAULT =
>   "/hdfs-federation";
> ..
> {noformat}
> Actually, they should be moved into the {{DFSConfigKeys}} class and documented 
> in {{hdfs-default.xml}}. This will help more users discover these settings and 
> learn how to use them.






[jira] [Commented] (HDFS-12990) Change default NameNode RPC port back to 8020

2018-01-17 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329625#comment-16329625
 ] 

Chris Douglas commented on HDFS-12990:
--

bq. I believe a vote would make people think about this problem and respond 
according to their preferences
That would involve creating a 3.0.1 release with this change and voting on it.

> Change default NameNode RPC port back to 8020
> -
>
> Key: HDFS-12990
> URL: https://issues.apache.org/jira/browse/HDFS-12990
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Critical
> Attachments: HDFS-12990.01.patch
>
>
> In HDFS-9427 (HDFS should not default to ephemeral ports), we changed all 
> default ports out of the ephemeral range, which is much appreciated by admins. 
> As part of that change, we also modified the NN RPC port from the famous 8020 
> to 9820, to be closer to the other ports changed there.
> With more integration going on, it appears that all the other port changes 
> are fine, but the NN RPC port change is painful for downstream projects 
> migrating to Hadoop 3. Some examples include:
> # Hive table locations pointing to hdfs://nn:port/dir
> # Downstream minicluster unit tests that assumed 8020
> # Oozie workflows / downstream scripts that used 8020
> This isn't a problem for HA URLs, since those do not include the port 
> number. But considering the downstream impact, instead of requiring all of 
> them to change their setups, it would be a far better experience to leave the 
> NN port unchanged. This will benefit Hadoop 3 adoption and ease unnecessary 
> upgrade burdens.
> It is of course incompatible, but given 3.0.0 is just out, IMO it is worth 
> switching the port back.






[jira] [Commented] (HDFS-12990) Change default NameNode RPC port back to 8020

2018-01-16 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328025#comment-16328025
 ] 

Chris Douglas commented on HDFS-12990:
--

[~eyang], that paragraph begs the question. Paraphrasing: "we must retain the 
change, because we can only work around the consequences." In every example you 
cite, the project and users got some benefit. Reverting this change _is_ an 
admissible solution to fix the bugs it caused.

Our releases solve problems for our users. We're not here to curate a version 
control system and a set of compatibility rules.

> Change default NameNode RPC port back to 8020
> -
>
> Key: HDFS-12990
> URL: https://issues.apache.org/jira/browse/HDFS-12990
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Critical
> Attachments: HDFS-12990.01.patch
>
>
> In HDFS-9427 (HDFS should not default to ephemeral ports), we changed all 
> default ports out of the ephemeral range, which is much appreciated by admins. 
> As part of that change, we also modified the NN RPC port from the famous 8020 
> to 9820, to be closer to the other ports changed there.
> With more integration going on, it appears that all the other port changes 
> are fine, but the NN RPC port change is painful for downstream projects 
> migrating to Hadoop 3. Some examples include:
> # Hive table locations pointing to hdfs://nn:port/dir
> # Downstream minicluster unit tests that assumed 8020
> # Oozie workflows / downstream scripts that used 8020
> This isn't a problem for HA URLs, since those do not include the port 
> number. But considering the downstream impact, instead of requiring all of 
> them to change their setups, it would be a far better experience to leave the 
> NN port unchanged. This will benefit Hadoop 3 adoption and ease unnecessary 
> upgrade burdens.
> It is of course incompatible, but given 3.0.0 is just out, IMO it is worth 
> switching the port back.






[jira] [Commented] (HDFS-12990) Change default NameNode RPC port back to 8020

2018-01-13 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325458#comment-16325458
 ] 

Chris Douglas commented on HDFS-12990:
--

We wrote the compatibility guidelines _to avoid breaking users_. If there was 
any benefit to this change, then we could discuss tradeoffs, but there are 
none. By retaining this change, we choose to add a bunch of tedious, rote 
repairs to sites trying to upgrade from 2.x. It's not challenging to work 
around this, but it's annoying and avoidable.

bq. Maybe we should look at NN RPC port as a new feature to ensure the down 
stream projects really tested with Hadoop 3 instead of gambling on 
compatibility.
That's a creative way to look at it, but changing the NN port doesn't achieve 
that in any meaningful sense. Even if this does require changes, they are 
superficial. More to the point, we have *no* interest in second-guessing other 
projects' testing practices, or challenging whether they "really certified with 
Hadoop 3". That's not merely outside our charter, it is antithetical to it.

bq. This is greatest challenge to Hadoop policy.
bq. I know that forever compatibility is not sustainable and it only cost more 
in the long run. We should move on and do something better for Hadoop.
Are we looking at the same JIRA? This is not like moving to YARN, which 
required a lot of work to shake out its incompatible changes, but we won a 
better architecture by it. This upends a 10+ year convention, the project and 
its users gain _nothing_ by it, and it is trivial to fix.

> Change default NameNode RPC port back to 8020
> -
>
> Key: HDFS-12990
> URL: https://issues.apache.org/jira/browse/HDFS-12990
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Critical
> Attachments: HDFS-12990.01.patch
>
>
> In HDFS-9427 (HDFS should not default to ephemeral ports), we changed all 
> default ports out of the ephemeral range, which is much appreciated by admins. 
> As part of that change, we also modified the NN RPC port from the famous 8020 
> to 9820, to be closer to the other ports changed there.
> With more integration going on, it appears that all the other port changes 
> are fine, but the NN RPC port change is painful for downstream projects 
> migrating to Hadoop 3. Some examples include:
> # Hive table locations pointing to hdfs://nn:port/dir
> # Downstream minicluster unit tests that assumed 8020
> # Oozie workflows / downstream scripts that used 8020
> This isn't a problem for HA URLs, since those do not include the port 
> number. But considering the downstream impact, instead of requiring all of 
> them to change their setups, it would be a far better experience to leave the 
> NN port unchanged. This will benefit Hadoop 3 adoption and ease unnecessary 
> upgrade burdens.
> It is of course incompatible, but given 3.0.0 is just out, IMO it is worth 
> switching the port back.






[jira] [Commented] (HDFS-12990) Change default NameNode RPC port back to 8020

2018-01-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324589#comment-16324589
 ] 

Chris Douglas commented on HDFS-12990:
--

bq. Can we retract Hadoop 3.0.0 release and restore NN port to 8020 that we 
made a bad release? Can this be an alternate approach to produce Hadoop 3.0.1?
No, that's not available.

bq. why not fixing it in 4.0.0 and declare 3.x as dead?
bq. Are there bugs in our current release procedure?
bq. This is a good lesson to avoid time driven release for major version change.
Nah. Even if 3.x were "time driven", [~vinodkv] 
[predicted|https://s.apache.org/KnCL] that incompatible changes would require 
incompatible fixes, and of course it happened. It _always_ happens, because we 
don't control when others try out the release. We don't know how people are 
using and extending Hadoop, so we curate compatibility guidelines and 
specifications to signal what's intended so applications avoid relying on 
incidental behavior. But our guidelines are no more a contract than they are a 
suicide pact.

bq. Are we going to have other incompatible changes in the future minor and dot 
releases? What is the criteria to decide which incompatible changes are allowed?
I'm sure there are more of these, and we'll remember this issue fondly for 
having such a straightforward fix. When an incompatible change is a mistake, 
it's a sunk cost. From there, we find the least harmful way to address it and 
move on.

We created compatibility guidelines to prioritize users' needs over development 
expedience. Reverting the NN port change is consistent with that intent. We 
failed to follow our own policy when this change was committed, and absent a 
technical justification for this incompatible change, it is a bug. The current 
patch restores compatibility.

This needs to converge. Absent a technical justification for the original 
change or the veto, and without any alternative approach, the current patch is 
the best remedy we have. +1

> Change default NameNode RPC port back to 8020
> -
>
> Key: HDFS-12990
> URL: https://issues.apache.org/jira/browse/HDFS-12990
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Critical
> Attachments: HDFS-12990.01.patch
>
>
> In HDFS-9427 (HDFS should not default to ephemeral ports), we changed all 
> default ports out of the ephemeral range, which is much appreciated by admins. 
> As part of that change, we also modified the NN RPC port from the famous 8020 
> to 9820, to be closer to the other ports changed there.
> With more integration going on, it appears that all the other port changes 
> are fine, but the NN RPC port change is painful for downstream projects 
> migrating to Hadoop 3. Some examples include:
> # Hive table locations pointing to hdfs://nn:port/dir
> # Downstream minicluster unit tests that assumed 8020
> # Oozie workflows / downstream scripts that used 8020
> This isn't a problem for HA URLs, since those do not include the port 
> number. But considering the downstream impact, instead of requiring all of 
> them to change their setups, it would be a far better experience to leave the 
> NN port unchanged. This will benefit Hadoop 3 adoption and ease unnecessary 
> upgrade burdens.
> It is of course incompatible, but given 3.0.0 is just out, IMO it is worth 
> switching the port back.






[jira] [Commented] (HDFS-12990) Change default NameNode RPC port back to 8020

2018-01-11 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322838#comment-16322838
 ] 

Chris Douglas commented on HDFS-12990:
--

bq. No incompatible changes are allowed between 3.0.0 and 3.0.1. Dot releases 
only allow bug fixes. Won't you agree?

Not to get too bogged down in semantics, but changing the NN port from 8020 to 
9820 breaks downstream users for no benefit. We require some technical 
justification for breaking compatibility. If the new port were part of the 
contiguous range, or if the original port was in the ephemeral range, then 
arguments in favor of retaining this change would have technical merit. As this 
change lacks any justification, the original change was not a valid, 
incompatible change; it *is* a bug.

Semantics aside, as [~daryn] pointed out on the dev list, this broke rolling 
upgrades. It broke applications. It broke deployments. We don't have to burn a 
major version to fix it, so let's just... fix it.

> Change default NameNode RPC port back to 8020
> -
>
> Key: HDFS-12990
> URL: https://issues.apache.org/jira/browse/HDFS-12990
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Critical
> Attachments: HDFS-12990.01.patch
>
>
> In HDFS-9427 (HDFS should not default to ephemeral ports), we changed all 
> default ports out of the ephemeral range, which is much appreciated by admins. 
> As part of that change, we also modified the NN RPC port from the famous 8020 
> to 9820, to be closer to the other ports changed there.
> With more integration going on, it appears that all the other port changes 
> are fine, but the NN RPC port change is painful for downstream projects 
> migrating to Hadoop 3. Some examples include:
> # Hive table locations pointing to hdfs://nn:port/dir
> # Downstream minicluster unit tests that assumed 8020
> # Oozie workflows / downstream scripts that used 8020
> This isn't a problem for HA URLs, since those do not include the port 
> number. But considering the downstream impact, instead of requiring all of 
> them to change their setups, it would be a far better experience to leave the 
> NN port unchanged. This will benefit Hadoop 3 adoption and ease unnecessary 
> upgrade burdens.
> It is of course incompatible, but given 3.0.0 is just out, IMO it is worth 
> switching the port back.






[jira] [Commented] (HDFS-12802) RBF: Control MountTableResolver cache size

2018-01-09 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319336#comment-16319336
 ] 

Chris Douglas commented on HDFS-12802:
--

+1 thanks [~elgoiri]

> RBF: Control MountTableResolver cache size
> --
>
> Key: HDFS-12802
> URL: https://issues.apache.org/jira/browse/HDFS-12802
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
> Attachments: HDFS-12802.000.patch, HDFS-12802.001.patch, 
> HDFS-12802.002.patch, HDFS-12802.003.patch, HDFS-12802.004.patch, 
> HDFS-12802.005.patch
>
>
> Currently, the {{MountTableResolver}} caches the resolutions for the 
> {{PathLocation}}. However, this cache can grow with no limits if there are a 
> lot of unique paths. Some of these cached resolutions might not be used at 
> all.
> The {{MountTableResolver}} should clean the {{locationCache}} periodically.






[jira] [Commented] (HDFS-12802) RBF: Control MountTableResolver cache size

2018-01-08 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16317200#comment-16317200
 ] 

Chris Douglas commented on HDFS-12802:
--

bq. We have a read/write lock for syncing the tree and the locationCache so we 
are not taking much advantage of it.
bq. The issue is actually the amount of proxied operations for unique files.
It sounds like the performance isn't likely to be critical and concurrency is 
limited, so let's use whatever's simplest to implement and maintain.

bq. Another thing, is that we may not even need the cache time limit, the size 
covers the use case.
+1 this could use soft references in a {{ReferenceMap}}, which will purge 
entries in proportion to memory pressure. This could be simpler than adding a 
config knob for the size of the table. On the other hand, limiting the size of 
the table will also bound the cost of invalidations, and shrinking the heap to 
effect this (or resorting to esoteric gc knobs) is clearly worse.

Again: probably not going to come up in practice, so whatever's simplest to 
maintain.

bq. I don't think I can use Collection::removeIf on top of the guava cache; I 
would go for iterating over and remove the keys with the prefix
I think it'd work, since modifications to the map returned by {{Cache::asMap}} 
directly affect the cache 
[[1|https://google.github.io/guava/releases/16.0/api/docs/com/google/common/cache/Cache.html#asMap()]],
 but it's just syntactic sugar. Anything that completes in one pass is fine.
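A minimal sketch of that one-pass prefix invalidation, using a plain {{ConcurrentHashMap}} to stand in for the view Guava's {{Cache::asMap}} returns (writes to that view go through to the cache); the mount-point path handling is illustrative:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PrefixInvalidation {
  // Remove every cached path resolution at or under a mount point in one pass.
  // With Guava, the same removeIf call works on cache.asMap().keySet(), since
  // that view writes through to the cache itself.
  public static void invalidate(Map<String, String> locationCache, String mountPoint) {
    String prefix = mountPoint.endsWith("/") ? mountPoint : mountPoint + "/";
    locationCache.keySet().removeIf(
        path -> path.equals(mountPoint) || path.startsWith(prefix));
  }

  public static void main(String[] args) {
    Map<String, String> cache = new ConcurrentHashMap<>();
    cache.put("/data/a", "ns0->/data/a");
    cache.put("/data/b", "ns0->/data/b");
    cache.put("/database", "ns1->/database"); // not under /data, must survive
    invalidate(cache, "/data");
    System.out.println(cache.keySet());
  }
}
```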

> RBF: Control MountTableResolver cache size
> --
>
> Key: HDFS-12802
> URL: https://issues.apache.org/jira/browse/HDFS-12802
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
> Attachments: HDFS-12802.000.patch, HDFS-12802.001.patch, 
> HDFS-12802.002.patch, HDFS-12802.003.patch
>
>
> Currently, the {{MountTableResolver}} caches the resolutions for the 
> {{PathLocation}}. However, this cache can grow with no limits if there are a 
> lot of unique paths. Some of these cached resolutions might not be used at 
> all.
> The {{MountTableResolver}} should clean the {{locationCache}} periodically.






[jira] [Commented] (HDFS-12992) Ozone: Add support for Larger key sizes in Corona

2018-01-08 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16317116#comment-16317116
 ] 

Chris Douglas commented on HDFS-12992:
--

Corona is a benchmark for Ozone? Could we pick a different name for this? It 
collides with an [existing 
project|https://www.facebook.com/notes/facebook-engineering/under-the-hood-scheduling-mapreduce-jobs-more-efficiently-with-corona/10151142560538920/].

> Ozone: Add support for Larger key sizes in Corona
> -
>
> Key: HDFS-12992
> URL: https://issues.apache.org/jira/browse/HDFS-12992
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Fix For: HDFS-7240
>
>
> Corona provides an option to specify the key size for generating the data for 
> the key. This data is allocated in a single buffer. Allocating larger key 
> sizes can cause OOM as there is not sufficient memory to allocate the buffer.






[jira] [Commented] (HDFS-12802) RBF: Control MountTableResolver cache size

2018-01-05 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313902#comment-16313902
 ] 

Chris Douglas commented on HDFS-12802:
--

How frequently are entries invalidated and how large is the map? If the cleanup 
is only necessary after running for several weeks but invalidation (including 
successors) is relatively common, then the Guava implementation may not match 
the workload. Apache Commons has a {{ReferenceMap}} implementation that can use 
soft references; since router memory utilization grows slowly, that would fit 
this case. Unfortunately, it's not sorted (IIRC) so invalidation would still be 
expensive.

If this does use the Guava {{Cache}}, sorting the keyset to extract the submap 
is more expensive than removing entries by a full scan. It could use 
{{Collection::removeIf}} to match a prefix of the keyset. We've been burned by 
Guava changing incompatibly in the past, but the {{Cache}} library has been 
relatively stable(?). The Guava impl is also threadsafe, so concurrent 
invalidations would not form a convoy.
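A minimal sketch of the soft-reference idea, assuming plain {{java.lang.ref.SoftReference}} values rather than the Commons Collections {{ReferenceMap}} (which packages the same approach plus background reaping):

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Minimal soft-value cache sketch: under memory pressure the GC clears the
// soft references, so the cache shrinks in proportion to heap demand instead
// of growing without bound. Not sorted, so prefix invalidation stays a scan.
public class SoftValueCache<K, V> {
  private final ConcurrentMap<K, SoftReference<V>> map = new ConcurrentHashMap<>();

  public void put(K key, V value) {
    map.put(key, new SoftReference<>(value));
  }

  public V get(K key) {
    SoftReference<V> ref = map.get(key);
    if (ref == null) {
      return null;
    }
    V v = ref.get();
    if (v == null) {
      map.remove(key, ref); // value was collected; drop the stale entry
    }
    return v;
  }
}
```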

> RBF: Control MountTableResolver cache size
> --
>
> Key: HDFS-12802
> URL: https://issues.apache.org/jira/browse/HDFS-12802
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
> Attachments: HDFS-12802.000.patch, HDFS-12802.001.patch, 
> HDFS-12802.002.patch, HDFS-12802.003.patch
>
>
> Currently, the {{MountTableResolver}} caches the resolutions for the 
> {{PathLocation}}. However, this cache can grow with no limits if there are a 
> lot of unique paths. Some of these cached resolutions might not be used at 
> all.
> The {{MountTableResolver}} should clean the {{locationCache}} periodically.






[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2018-01-02 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308948#comment-16308948
 ] 

Chris Douglas commented on HDFS-12911:
--

A few questions about the Context API in [^HDFS-12911-HDFS-10285-02.patch]. In 
general, a few methods might not be cleanly implementable outside the NN:
* {{readLockNS}}/{{readUnlockNS}} are sensitive APIs to expose to an external 
service. If the SPS dies, stalls, or loses connectivity to the NN without 
releasing the lock: what happens? Instead, would it be possible to break up 
{{analyseBlocksStorageMovementsAndAssignToDN}} into operations that, if any 
fail due to concurrent modification, could abort and retry? Alternatively (and 
as in the naive [^HDFS-12911.00.patch]), the server side of the RPC could be 
"heavier", by taking the lock and running this code server-side. This would 
counter many of the advantages to running it externally, unfortunately.
* {{getSPSEngine}} may leak a heavier abstraction than is necessary. As used, 
it appears to check for liveness of the remote service, which can be a single 
call (i.e., instead of checking that both the SPS and namesystem are running, 
create a call that checks both in the context object), or it's used to get a 
reference to another component. The context could return those components 
directly, rather than returning the SPS. It would be cleaner still if the 
underlying action could be behind the context API, so the caller isn't managing 
multiple abstractions.
* The {{BlockMoveTaskHandler}} and {{FileIDCollector}} abstract the "write" 
(scheduling moves) and "read" (scan namespace) interfaces? 
* {{addSPSHint}} has no callers in this patch? Is this future functionality, or 
only here for symmetry with {{removeSPSHint}}?
* Are these calls idempotent? Or would an external provider need to resync with 
the NN on RPC timeouts?
* Since they're stubs in this patch, could we move the external implementation 
to a separate JIRA? Just to keep the patch smaller.
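The abort-and-retry alternative in the first bullet could be sketched as an optimistic loop; everything below (the version stamp, the interface, the method names) is hypothetical and not the Context API from the patch:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: instead of holding the namesystem lock across the whole
// analysis, snapshot a version stamp (think: last applied edit-log txid), do
// the read-only work without the lock, and retry if the namespace moved.
public class OptimisticRetry {
  public static final AtomicLong nsVersion = new AtomicLong();

  public interface Analysis {
    String run(); // read-only scan producing a movement plan
  }

  public static String analyzeWithRetry(Analysis analysis, int maxAttempts) {
    for (int i = 0; i < maxAttempts; i++) {
      long before = nsVersion.get();
      String plan = analysis.run();   // no lock held while scanning
      if (nsVersion.get() == before) {
        return plan;                  // nothing changed underneath us
      }                               // else: result may be stale, retry
    }
    throw new IllegalStateException("namespace too hot; giving up");
  }
}
```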

> [SPS]: Fix review comments from discussions in HDFS-10285
> -
>
> Key: HDFS-12911
> URL: https://issues.apache.org/jira/browse/HDFS-12911
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Rakesh R
> Attachments: HDFS-12911-HDFS-10285-01.patch, 
> HDFS-12911-HDFS-10285-02.patch, HDFS-12911.00.patch
>
>
> This is the JIRA for tracking the possible improvements or issues discussed 
> in the main JIRA.
> Comments to handle so far:
> Daryn:
>  # Lock should not kept while executing placement policy.
>  # While starting up the NN, SPS Xattrs checks happen even if feature 
> disabled. This could potentially impact the startup speed. 
> UMA:
> # I am adding one more possible improvement to reduce Xattr objects 
> significantly.
>  SPS Xattr is constant object. So, we create one Xattr deduplication object 
> once statically and use the same object reference when required to add SPS 
> Xattr to Inode. So, here additional bytes required for storing SPS Xattr 
> would turn to same as single object ref ( i.e 4 bytes in 32 bit). So Xattr 
> overhead should come down significantly IMO. Lets explore the feasibility on 
> this option.
> Xattr list Future will not be specially created for SPS, that list would have 
> been created by SetStoragePolicy already on the same directory. So, no extra 
> Feature creation because of SPS alone.
> # Currently SPS putting long id objects in Q for tracking SPS called Inodes. 
> So, it is additional created and size of it would be (obj ref + value) = (8 + 
> 8) bytes [ ignoring alignment for time being]
> So, the possible improvement here is, instead of creating new Long obj, we 
> can keep existing inode object for tracking. Advantage is, Inode object 
> already maintained in NN, so no new object creation is needed. So, we just 
> need to maintain one obj ref. Above two points should significantly reduce 
> the memory requirements of SPS. So, for SPS call: 8bytes for called inode 
> tracking + 8 bytes for Xattr ref.
> # Use LightWeightLinkedSet instead of using LinkedList for from Q. This will 
> reduce unnecessary Node creations inside LinkedList. 






[jira] [Commented] (HDFS-12970) HdfsFileStatus#getPath returning null.

2018-01-02 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308907#comment-16308907
 ] 

Chris Douglas commented on HDFS-12970:
--

No worries, thanks for bringing it up. The historical baggage around these APIs 
has created some unintuitive, and occasionally bizarre patterns.

> HdfsFileStatus#getPath returning null.
> --
>
> Key: HDFS-12970
> URL: https://issues.apache.org/jira/browse/HDFS-12970
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.0
>Reporter: Rushabh S Shah
>Priority: Critical
>
> After HDFS-12681, HdfsFileStatus#getPath() returns null.
> I don't think this is expected.
> Both implementations of {{HdfsFileStatus}} set the path to null.
> {code:title=HdfsNamedFileStatus.java|borderStyle=solid}
>   HdfsNamedFileStatus(long length, boolean isdir, int replication,
>   long blocksize, long mtime, long atime,
>   FsPermission permission, Set<Flags> flags,
>   String owner, String group,
>   byte[] symlink, byte[] path, long fileId,
>   int childrenNum, FileEncryptionInfo feInfo,
>   byte storagePolicy, ErasureCodingPolicy ecPolicy) {
> super(length, isdir, replication, blocksize, mtime, atime,
> HdfsFileStatus.convert(isdir, symlink != null, permission, flags),
> owner, group, null, null,   // the last null is the path
> HdfsFileStatus.convert(flags));
> {code}
> {code:title=HdfsLocatedFileStatus.java|borderStyle=solid}
>   HdfsLocatedFileStatus(long length, boolean isdir, int replication,
> long blocksize, long mtime, long atime,
> FsPermission permission, EnumSet<Flags> flags,
> String owner, String group,
> byte[] symlink, byte[] path, long fileId,
> int childrenNum, FileEncryptionInfo feInfo,
> byte storagePolicy, ErasureCodingPolicy ecPolicy,
> LocatedBlocks hdfsloc) {
> super(length, isdir, replication, blocksize, mtime, atime,
> HdfsFileStatus.convert(isdir, symlink != null, permission, flags),
> owner, group, null, null, HdfsFileStatus.convert(flags),  // the last null is the path
> null);
> {code}






[jira] [Resolved] (HDFS-12970) HdfsFileStatus#getPath returning null.

2018-01-02 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas resolved HDFS-12970.
--
Resolution: Invalid







[jira] [Commented] (HDFS-12970) HdfsFileStatus#getPath returning null.

2018-01-02 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308752#comment-16308752
 ] 

Chris Douglas commented on HDFS-12970:
--

bq. While working on HDFS-12811, I was issuing an FSNamesystem#getFileInfo call 
from NamenodeFsck and trying to extract the path from the HdfsFileStatus
I see. This would be {{null}} even before HDFS-12681.

bq. If I understand your previous comment correctly, I have to cast the 
returned file status to either HdfsNamedFileStatus or HdfsLocatedFileStatus 
explicitly?
The caller needs to invoke {{makeQualified(URI defaultUri, Path parent)}} on 
the result to set the path.
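As a hedged, self-contained sketch of the pattern described above (the {{Status}} class, its fields, and the URIs below are illustrative stand-ins, not the real Hadoop API): the server-side status object carries only a path fragment, and getPath() stays null until the caller qualifies it against a default URI and a parent path.

```java
import java.net.URI;

// Miniature analog of the HdfsFileStatus behavior discussed in this issue:
// the path is null until the client calls makeQualified().
class StatusDemo {
    static class Status {
        private final String name;   // path fragment returned by the "server"
        private URI fullPath;        // null until qualified on the client side

        Status(String name) { this.name = name; }

        URI getPath() { return fullPath; }   // null before makeQualified()

        Status makeQualified(URI defaultUri, String parent) {
            // resolve an absolute path against the filesystem's default URI
            fullPath = defaultUri.resolve(parent + "/" + name);
            return this;
        }
    }

    public static void main(String[] args) {
        Status s = new Status("data");
        // mirrors the report: getPath() is null straight off the wire
        System.out.println(s.getPath());  // prints null
        s.makeQualified(URI.create("hdfs://nn:8020"), "/user/alice");
        System.out.println(s.getPath());  // prints hdfs://nn:8020/user/alice/data
    }
}
```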







[jira] [Commented] (HDFS-12970) HdfsFileStatus#getPath returning null.

2017-12-30 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307033#comment-16307033
 ] 

Chris Douglas commented on HDFS-12970:
--

This is by design. {{HdfsFileStatus}} needs to be qualified before it can be 
returned to the user, since the NameNode doesn't return fully qualified URIs. 
Prior to HDFS-12681, it was qualified as it was copied into a {{FileStatus}} 
object. Since {{HdfsFileStatus}} is now a subtype of {{FileStatus}}, the path 
is initialized as null in the constructor and set later by the client.

Are you observing that the path is {{null}} when it's returned to the user?







[jira] [Commented] (HDFS-12915) Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy

2017-12-29 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306519#comment-16306519
 ] 

Chris Douglas commented on HDFS-12915:
--

Thanks, [~eddyxu]

> Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy
> ---
>
> Key: HDFS-12915
> URL: https://issues.apache.org/jira/browse/HDFS-12915
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Chris Douglas
> Fix For: 3.1.0, 3.0.1
>
> Attachments: HDFS-12915.00.patch, HDFS-12915.01.patch, 
> HDFS-12915.02.patch
>
>
> It seems HDFS-12840 creates a new findbugs warning.
> Possible null pointer dereference of replication in 
> org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat.getBlockLayoutRedundancy(BlockType,
>  Short, Byte)
> Bug type NP_NULL_ON_SOME_PATH (click for details) 
> In class org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat
> In method 
> org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat.getBlockLayoutRedundancy(BlockType,
>  Short, Byte)
> Value loaded from replication
> Dereferenced at INodeFile.java:[line 210]
> Known null at INodeFile.java:[line 207]
> From a quick look at the patch, the warning seems bogus, though. 
> [~eddyxu], [~Sammi], would you please double check?
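For illustration, the shape of the flagged warning can be reproduced in miniature. In this hedged sketch (class and method names are hypothetical, not the actual INodeFile code), a boxed {{Short}} is deliberately null on one path but unboxed on another, which is the NP_NULL_ON_SOME_PATH pattern findbugs reports when it cannot prove the unboxing path excludes the null:

```java
// Hypothetical reduction of the flagged pattern: 'replication' is a boxed
// Short that is expected to be null for the striped layout, but is unboxed
// on the contiguous path. Findbugs sees "known null at one site, dereferenced
// at another" and flags a possible null dereference.
class RedundancyDemo {
    static final long STRIPED_BIT = 1L << 11;  // illustrative layout flag

    static long layout(boolean striped, Short replication) {
        if (striped) {
            // replication is required to be null here; it is never read
            return STRIPED_BIT;
        }
        // if 'replication' were null on this path, auto-unboxing would
        // throw NullPointerException -- the dereference findbugs warns about
        return replication;
    }
}
```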






[jira] [Commented] (HDFS-12915) Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy

2017-12-29 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306491#comment-16306491
 ] 

Chris Douglas commented on HDFS-12915:
--

[~eddyxu], [~Sammi], could you take a look at the patch? The findbugs warning 
has been flagged in every patch build for a few weeks. Please feel free to take 
this over, but we should commit a solution soon.







[jira] [Commented] (HDFS-12915) Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy

2017-12-28 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305833#comment-16305833
 ] 

Chris Douglas commented on HDFS-12915:
--

Please don't hesitate to effect a different refactoring (e.g., removing 
{{blockType}}) if you prefer it to this patch. There's no reason to rewrite the 
same utility function multiple times.







[jira] [Updated] (HDFS-12915) Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy

2017-12-28 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-12915:
-
Attachment: HDFS-12915.02.patch

Added a check for a replication policy being set on {{STRIPED}} files.







  1   2   3   4   5   6   7   >