[jira] [Commented] (HDFS-13443) RBF: Update mount table cache immediately after changing (add/update/remove) mount table entries.

2018-05-02 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461959#comment-16461959
 ] 

genericqa commented on HDFS-13443:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 31s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 17s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 16m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 57s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 49s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 37s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}190m  0s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
|   | hadoop.hdfs.client.impl.TestBlockReaderLocal |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HDFS-13443 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12921669/HDFS-13443.007.patch |
| Optional Tests |  

[jira] [Commented] (HDFS-13286) Add haadmin commands to transition between standby and observer

2018-05-02 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461956#comment-16461956
 ] 

Chao Sun commented on HDFS-13286:
-

Attached patch v5. [~xkrogen]: I didn't find anything else besides those in 
your comments. There are several places where STANDBY is checked in the YARN 
RM, but I don't think we need to change them, since it is not possible to get 
into the OBSERVER state in the RM.

> Add haadmin commands to transition between standby and observer
> ---
>
> Key: HDFS-13286
> URL: https://issues.apache.org/jira/browse/HDFS-13286
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13286-HDFS-12943.000.patch, 
> HDFS-13286-HDFS-12943.001.patch, HDFS-13286-HDFS-12943.002.patch, 
> HDFS-13286-HDFS-12943.003.patch, HDFS-13286-HDFS-12943.004.patch, 
> HDFS-13286-HDFS-12943.005.patch
>
>
> As discussed in HDFS-12975, we should allow explicit transition between 
> standby and observer through haadmin command, such as:
> {code}
> haadmin -transitionToObserver
> {code}
> Initially we should support transition from observer to standby, and standby 
> to observer.
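
The transition rule in the description can be sketched as a small state check. This is only an illustration under stated assumptions: `HaTransitionSketch`, `HaState`, `allowedTargets` and `canTransition` are hypothetical names that do not come from the patch, and the choice to disallow a direct OBSERVER to ACTIVE transition is an assumption made here, not something the issue states.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical model of the transitions discussed in this issue; not the patch's code. */
class HaTransitionSketch {
    enum HaState { ACTIVE, STANDBY, OBSERVER }

    /** Returns the states reachable from {@code from} in this sketch. */
    static Set<HaState> allowedTargets(HaState from) {
        switch (from) {
            case STANDBY:
                // Standby can become active (existing HA behavior) or observer (this issue).
                return EnumSet.of(HaState.ACTIVE, HaState.OBSERVER);
            case OBSERVER:
                // Assumption: an observer must go back to standby first.
                return EnumSet.of(HaState.STANDBY);
            default:
                // ACTIVE can only step down to standby.
                return EnumSet.of(HaState.STANDBY);
        }
    }

    static boolean canTransition(HaState from, HaState to) {
        return allowedTargets(from).contains(to);
    }

    public static void main(String[] args) {
        System.out.println(canTransition(HaState.STANDBY, HaState.OBSERVER)); // true
        System.out.println(canTransition(HaState.OBSERVER, HaState.ACTIVE));  // false
    }
}
```

Keeping the allowed targets in one place makes it easy to audit callers that only check for STANDBY today.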



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13286) Add haadmin commands to transition between standby and observer

2018-05-02 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-13286:

Attachment: HDFS-13286-HDFS-12943.005.patch

> Add haadmin commands to transition between standby and observer
> ---
>
> Key: HDFS-13286
> URL: https://issues.apache.org/jira/browse/HDFS-13286
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13286-HDFS-12943.000.patch, 
> HDFS-13286-HDFS-12943.001.patch, HDFS-13286-HDFS-12943.002.patch, 
> HDFS-13286-HDFS-12943.003.patch, HDFS-13286-HDFS-12943.004.patch, 
> HDFS-13286-HDFS-12943.005.patch
>
>
> As discussed in HDFS-12975, we should allow explicit transition between 
> standby and observer through haadmin command, such as:
> {code}
> haadmin -transitionToObserver
> {code}
> Initially we should support transition from observer to standby, and standby 
> to observer.






[jira] [Updated] (HDFS-13286) Add haadmin commands to transition between standby and observer

2018-05-02 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-13286:

Attachment: (was: HDFS-13286-HDFS-12943.005.patch)

> Add haadmin commands to transition between standby and observer
> ---
>
> Key: HDFS-13286
> URL: https://issues.apache.org/jira/browse/HDFS-13286
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13286-HDFS-12943.000.patch, 
> HDFS-13286-HDFS-12943.001.patch, HDFS-13286-HDFS-12943.002.patch, 
> HDFS-13286-HDFS-12943.003.patch, HDFS-13286-HDFS-12943.004.patch
>
>
> As discussed in HDFS-12975, we should allow explicit transition between 
> standby and observer through haadmin command, such as:
> {code}
> haadmin -transitionToObserver
> {code}
> Initially we should support transition from observer to standby, and standby 
> to observer.






[jira] [Updated] (HDFS-13286) Add haadmin commands to transition between standby and observer

2018-05-02 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-13286:

Attachment: HDFS-13286-HDFS-12943.005.patch

> Add haadmin commands to transition between standby and observer
> ---
>
> Key: HDFS-13286
> URL: https://issues.apache.org/jira/browse/HDFS-13286
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13286-HDFS-12943.000.patch, 
> HDFS-13286-HDFS-12943.001.patch, HDFS-13286-HDFS-12943.002.patch, 
> HDFS-13286-HDFS-12943.003.patch, HDFS-13286-HDFS-12943.004.patch, 
> HDFS-13286-HDFS-12943.005.patch
>
>
> As discussed in HDFS-12975, we should allow explicit transition between 
> standby and observer through haadmin command, such as:
> {code}
> haadmin -transitionToObserver
> {code}
> Initially we should support transition from observer to standby, and standby 
> to observer.






[jira] [Comment Edited] (HDDS-16) Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.

2018-05-02 Thread Bharat Viswanadham (JIRA)

[ 
https://issues.apache.org/jira/browse/HDDS-16?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461801#comment-16461801
 ] 

Bharat Viswanadham edited comment on HDDS-16 at 5/3/18 4:22 AM:


[~msingh] I have one question: in the current Ozone code, the CHAINED 
replication type is not supported. If CHAINED is supported in the future, I 
think pipeline information might need to be sent to the datanode. The client 
connects to the leader in the pipeline, and the datanode might then need 
pipeline information internally to connect to the other datanodes and replicate. 

This is my understanding; I am not completely sure. Please correct me if I am 
missing something here.

 


was (Author: bharatviswa):
[~msingh] I have one question, in current ozone code, CHAINED replication type 
is not supported. In future, if CHAINED will be supported, I think we might 
need pipeline information need to be sent to datanode. Client tries to 
connector leader in the pipeline, then internally in datanode we might need 
pipeline information to connect to other data nodes and replicate. 

This is my understanding, not completely sure correct. Please correct me if i 
am missing something here.

 

> Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.
> 
>
> Key: HDDS-16
> URL: https://issues.apache.org/jira/browse/HDDS-16
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Native, Ozone Datanode
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-16.001.patch
>
>
> The current Ozone code passes pipeline information to datanodes as well. 
> However datanodes do not use this information.
> Hence Pipeline should be removed from ozone datanode commands.






[jira] [Commented] (HDFS-13186) [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations

2018-05-02 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461912#comment-16461912
 ] 

Chris Douglas commented on HDFS-13186:
--

Thanks for the updated patch. Minor comments still apply, but let's work out 
the high-level first.

bq. MPU is distinct from serial copy since we can upload the data out of order. 
I don't see the use case for a version that breaks this. In this case, I think 
the boiler plate is correct: try to make an MPU; if we can't then fall back to 
FileSystem::write

Fair enough. If the caller controls the partitioning, then it can also decide 
whether the copy should be serial or parallel (e.g., for small files). However, 
this also requires the caller to implement recovery paths for both cases. So a 
user of this library would need to persist their {{UploadHandle}} to abort the 
call for an MPU, and bespoke metadata to recover failed, serial copies. That's 
not unreasonable, but it means that the client will probably fall back to 
either copy-in-place or renaming to promote completed output, which (as we've 
seen with S3/WASB and HDFS) doesn't have a good, default answer. If the MPU were 
just a U, then this library could also handle promotion in an FS-efficient way.

All that said, if this is not intended as a small framework, but rather a 
common API to parallel upload machinery across S3/WASB and HDFS, then this is 
perfectly fine. To be a general framework, this would also need a partitioning 
strategy (with overrides) to generate splits, which resolve to {{PartHandle}} 
objects. Probably not worth it.

bq. So the numbering of the parts would be internal to the part handle when 
downcasting in the different types. This means we could no longer rely on the 
fact that it's just a String and would then have to introduce all the protobuf 
machinery. That's possible; just wanted to highlight it before I go away and 
implement that.
bq. the filePath needs to follow the upload around because various nodes that 
are calling put part will need to know what type of MultipartUploader to make
bq. If we need to serialize anything, I think we should have it be explicitly 
done through protobuf

As with {{PathHandle}} implementations, PB might be useful, but not required. 
But your point is well taken. Since this would be passed across multiple nodes 
that may include different versions of the impl's MPU on the classpath, that 
could create some unnecessarily complicated logic for the MPU implementer. 
Let's stick with the approach in v004.

> [PROVIDED Phase 2] Multipart Multinode uploader API + Implementations
> -
>
> Key: HDFS-13186
> URL: https://issues.apache.org/jira/browse/HDFS-13186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, 
> HDFS-13186.003.patch, HDFS-13186.004.patch
>
>
> To write files in parallel to an external storage system as in HDFS-12090, 
> there are two approaches:
>  # Naive approach: use a single datanode per file that copies blocks locally 
> as it streams data to the external service. This requires a copy for each 
> block inside the HDFS system and then a copy for the block to be sent to the 
> external system.
>  # Better approach: Single point (e.g. Namenode or SPS style external client) 
> and Datanodes coordinate in a multipart - multinode upload.
> This system needs to work with multiple back ends and needs to coordinate 
> across the network. So we propose an API that resembles the following:
> {code:java}
> public UploadHandle multipartInit(Path filePath) throws IOException;
> public PartHandle multipartPutPart(InputStream inputStream,
>     int partNumber, UploadHandle uploadId) throws IOException;
> public void multipartComplete(Path filePath,
>     List<Pair<Integer, PartHandle>> handles,
>     UploadHandle multipartUploadId) throws IOException;{code}
> Here, UploadHandle and PartHandle are opaque handles in the vein of 
> PathHandle so they can be serialized and deserialized in the hadoop-hdfs 
> project without knowledge of how to deserialize e.g. S3A's version of an 
> UploadHandle and PartHandle.
> In an object store such as S3A, the implementation is straightforward. In 
> the case of writing multipart/multinode to HDFS, we can write each block as a 
> file part. The complete call will perform a concat on the blocks.
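
To make the init / put part / complete flow concrete, here is a minimal in-memory sketch. It is not the proposed implementation: `InMemoryMultipartSketch` is a hypothetical class, plain `String` values stand in for the opaque `UploadHandle` and `PartHandle` types, byte arrays stand in for `InputStream`, and a `Map.Entry<Integer, String>` stands in for each (part number, handle) pair. It only illustrates that parts may arrive out of order and that the complete call assembles them by part number.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical in-memory stand-in for a multipart uploader; Strings replace the opaque handles. */
class InMemoryMultipartSketch {
    // uploadId -> (partHandle -> part bytes) for uploads still in progress
    private final Map<String, Map<String, byte[]>> uploads = new HashMap<>();
    // completed files: path -> contents
    private final Map<String, byte[]> files = new HashMap<>();

    String multipartInit(String filePath) {
        String uploadId = filePath + "#" + UUID.randomUUID();
        uploads.put(uploadId, new HashMap<>());
        return uploadId;
    }

    String multipartPutPart(byte[] data, int partNumber, String uploadId) {
        Map<String, byte[]> parts = uploads.get(uploadId);
        if (parts == null) {
            throw new IllegalStateException("Unknown upload: " + uploadId);
        }
        String partHandle = uploadId + "/part-" + partNumber;
        parts.put(partHandle, data.clone());
        return partHandle;
    }

    void multipartComplete(String filePath,
            List<Map.Entry<Integer, String>> handles, String uploadId) {
        Map<String, byte[]> parts = uploads.remove(uploadId);
        if (parts == null) {
            throw new IllegalStateException("Unknown upload: " + uploadId);
        }
        // Parts may have been uploaded in any order; assemble them by part number.
        List<Map.Entry<Integer, String>> ordered = new ArrayList<>(handles);
        ordered.sort(Map.Entry.comparingByKey());
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Map.Entry<Integer, String> e : ordered) {
            byte[] data = parts.get(e.getValue());
            if (data == null) {
                throw new IllegalStateException("Missing part: " + e.getValue());
            }
            out.write(data, 0, data.length);
        }
        files.put(filePath, out.toByteArray());
    }

    byte[] read(String filePath) {
        return files.get(filePath);
    }
}
```

A caller would upload part 2 before part 1, pass both (number, handle) pairs to `multipartComplete`, and read back the parts concatenated in numeric order.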






[jira] [Commented] (HDFS-13507) RBF: Remove update functionality from routeradmin's add cmd

2018-05-02 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461902#comment-16461902
 ] 

Yiqun Lin commented on HDFS-13507:
--

Agreed, feel free to go ahead, :).

> RBF: Remove update functionality from routeradmin's add cmd
> ---
>
> Key: HDFS-13507
> URL: https://issues.apache.org/jira/browse/HDFS-13507
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Wei Yan
>Assignee: Gang Li
>Priority: Minor
>  Labels: incompatible
> Attachments: HDFS-13507.000.patch, HDFS-13507.001.patch, 
> HDFS-13507.002.patch
>
>
> Following up on the discussion in HDFS-13326, we should remove the "update" 
> functionality from routeradmin's add cmd, to make it consistent with RPC 
> calls.
> Note that this is an incompatible change.






[jira] [Commented] (HDFS-13245) RBF: State store DBMS implementation

2018-05-02 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461863#comment-16461863
 ] 

Yiqun Lin commented on HDFS-13245:
--

{quote}
Let's add TestStateStoreDisabledNameserviceStore in a separate JIRA.
{quote}
This is tracked in HDFS-13525.

> RBF: State store DBMS implementation
> 
>
> Key: HDFS-13245
> URL: https://issues.apache.org/jira/browse/HDFS-13245
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: maobaolong
>Assignee: Yiran Wu
>Priority: Major
> Attachments: HDFS-13245.001.patch, HDFS-13245.002.patch, 
> HDFS-13245.003.patch, HDFS-13245.004.patch, HDFS-13245.005.patch, 
> HDFS-13245.006.patch
>
>
> Add a DBMS implementation for the State Store.






[jira] [Created] (HDFS-13525) RBF: Add unit test TestStateStoreDisabledNameserviceStore

2018-05-02 Thread Yiqun Lin (JIRA)
Yiqun Lin created HDFS-13525:


 Summary: RBF: Add unit test TestStateStoreDisabledNameserviceStore
 Key: HDFS-13525
 URL: https://issues.apache.org/jira/browse/HDFS-13525
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 3.0.1
Reporter: Yiqun Lin
Assignee: Yiqun Lin









[jira] [Updated] (HDFS-13443) RBF: Update mount table cache immediately after changing (add/update/remove) mount table entries.

2018-05-02 Thread Mohammad Arshad (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Arshad updated HDFS-13443:
---
Attachment: HDFS-13443.007.patch

> RBF: Update mount table cache immediately after changing (add/update/remove) 
> mount table entries.
> -
>
> Key: HDFS-13443
> URL: https://issues.apache.org/jira/browse/HDFS-13443
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Mohammad Arshad
>Assignee: Mohammad Arshad
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-13443-branch-2.001.patch, 
> HDFS-13443-branch-2.002.patch, HDFS-13443.001.patch, HDFS-13443.002.patch, 
> HDFS-13443.003.patch, HDFS-13443.004.patch, HDFS-13443.005.patch, 
> HDFS-13443.006.patch, HDFS-13443.007.patch
>
>
> Currently the mount table cache is updated periodically; by default the cache 
> is updated every minute. After a change in the mount table, user operations 
> may still use the old mount table. This is a bit wrong.
> To update the mount table cache, maybe we can do the following:
>  * *Add a refresh API in MountTableManager which will update the mount table cache.*
>  * *When there is a change in mount table entries, the router admin server can 
> update its cache and ask other routers to update their caches*. For example, if 
> there are three routers R1, R2, R3 in a cluster, then the add mount table entry 
> API, at the admin server side, will perform the following sequence of actions:
>  ## The user submits an add mount table entry request on R1
>  ## R1 adds the mount table entry to the state store
>  ## R1 calls the refresh API on R2
>  ## R1 calls the refresh API on R3
>  ## R1 directly refreshes its own cache
>  ## The add mount table entry response is sent back to the user.
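
The six-step sequence above can be sketched with plain in-memory objects. This is a simplified illustration, not the patch: `MountTableRefreshSketch`, `StateStore`, `Router`, `addMountEntry`, `refresh` and `resolve` are hypothetical names, the peer refresh is a direct method call rather than an RPC, and failure handling for unreachable routers is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the proposed refresh sequence; names are not from the patch. */
class MountTableRefreshSketch {

    /** Shared state store: mount point -> destination. */
    static class StateStore {
        private final Map<String, String> entries = new HashMap<>();

        synchronized void put(String mount, String dest) {
            entries.put(mount, dest);
        }

        synchronized Map<String, String> snapshot() {
            return new HashMap<>(entries);
        }
    }

    /** A router that serves lookups from a local cache of the state store. */
    static class Router {
        private final StateStore store;
        private final List<Router> peers = new ArrayList<>();
        private Map<String, String> cache = new HashMap<>();

        Router(StateStore store) {
            this.store = store;
        }

        void addPeer(Router peer) {
            peers.add(peer);
        }

        /** Reloads the local cache from the state store (the proposed refresh API). */
        void refresh() {
            cache = store.snapshot();
        }

        /** Step 1 is the caller invoking this method on the admin router. */
        void addMountEntry(String mount, String dest) {
            store.put(mount, dest);      // step 2: write to the state store
            for (Router peer : peers) {  // steps 3-4: ask the other routers to refresh
                peer.refresh();
            }
            refresh();                   // step 5: refresh the local cache
        }                                // step 6: returning is the response to the user

        String resolve(String mount) {
            return cache.get(mount);
        }
    }
}
```

With three routers wired as peers, an `addMountEntry` call on R1 makes the new entry resolvable on R2 and R3 as soon as the call returns, instead of after the next periodic refresh.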






[jira] [Commented] (HDFS-13443) RBF: Update mount table cache immediately after changing (add/update/remove) mount table entries.

2018-05-02 Thread Mohammad Arshad (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461843#comment-16461843
 ] 

Mohammad Arshad commented on HDFS-13443:


Thanks [~elgoiri] for the reviews. I fixed all of the above comments and am 
submitting patch HDFS-13443.007.patch. To fix the checkstyle issue in 
RouterAdmin, two local variables were changed to inline arguments.


> RBF: Update mount table cache immediately after changing (add/update/remove) 
> mount table entries.
> -
>
> Key: HDFS-13443
> URL: https://issues.apache.org/jira/browse/HDFS-13443
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Mohammad Arshad
>Assignee: Mohammad Arshad
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-13443-branch-2.001.patch, 
> HDFS-13443-branch-2.002.patch, HDFS-13443.001.patch, HDFS-13443.002.patch, 
> HDFS-13443.003.patch, HDFS-13443.004.patch, HDFS-13443.005.patch, 
> HDFS-13443.006.patch
>
>
> Currently the mount table cache is updated periodically; by default the cache 
> is updated every minute. After a change in the mount table, user operations 
> may still use the old mount table. This is a bit wrong.
> To update the mount table cache, maybe we can do the following:
>  * *Add a refresh API in MountTableManager which will update the mount table cache.*
>  * *When there is a change in mount table entries, the router admin server can 
> update its cache and ask other routers to update their caches*. For example, if 
> there are three routers R1, R2, R3 in a cluster, then the add mount table entry 
> API, at the admin server side, will perform the following sequence of actions:
>  ## The user submits an add mount table entry request on R1
>  ## R1 adds the mount table entry to the state store
>  ## R1 calls the refresh API on R2
>  ## R1 calls the refresh API on R3
>  ## R1 directly refreshes its own cache
>  ## The add mount table entry response is sent back to the user.






[jira] [Updated] (HDFS-13524) Occasional "All datanodes are bad" error in TestLargeBlock#testLargeBlockSize

2018-05-02 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-13524:
---
Description: 
TestLargeBlock#testLargeBlockSize may fail with error:
{quote}
All datanodes 
[DatanodeInfoWithStorage[127.0.0.1:44968,DS-acddd79e-cdf1-4ac5-aac5-e804a2e61600,DISK]]
 are bad. Aborting...
{quote}

Tracing back, the error is due to the stress applied to the host when sending 
a 2GB block, which causes a write pipeline ack read timeout:
{quote}
2017-09-10 22:16:07,285 [DataXceiver for client 
DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] INFO  
datanode.DataNode (DataXceiver.java:writeBlock(742)) - Receiving 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001 src: 
/127.0.0.1:57794 dest: /127.0.0.1:44968
2017-09-10 22:16:50,402 [DataXceiver for client 
DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] WARN  
datanode.DataNode (BlockReceiver.java:flushOrSync(434)) - Slow flushOrSync took 
5383ms (threshold=300ms), isSync:false, flushTotalNanos=5383638982ns, 
volume=file:/tmp/tmp.1oS3ZfDCwq/src/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/
2017-09-10 22:17:54,427 [ResponseProcessor for block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001] WARN  
hdfs.DataStreamer (DataStreamer.java:run(1214)) - Exception for 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001
java.net.SocketTimeoutException: 65000 millis timeout while waiting for channel 
to be ready for read. ch : java.nio.channels.SocketChannel[connected 
local=/127.0.0.1:57794 remote=/127.0.0.1:44968]
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
    at java.io.FilterInputStream.read(FilterInputStream.java:83)
    at java.io.FilterInputStream.read(FilterInputStream.java:83)
    at org.apache.hadoop.hdfs.protocolPB.PBHelperClient.vintPrefixed(PBHelperClient.java:434)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:213)
    at org.apache.hadoop.hdfs.DataStreamer$ResponseProcessor.run(DataStreamer.java:1104)
2017-09-10 22:17:54,432 [DataXceiver for client 
DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] INFO  
datanode.DataNode (BlockReceiver.java:receiveBlock(1000)) - Exception for 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001
java.io.IOException: Connection reset by peer
{quote}

Instead of raising the read timeout, I suggest increasing the cluster size from 
the default of 1 to 3 datanodes, so that the client has the opportunity to 
choose a different DN and retry.

I suspect this started failing after HDFS-13103, in Hadoop 2.8/3.0.0-alpha1, 
when we introduced the client acknowledgement read timeout.

  was:
TestLargeBlock#testLargeBlockSize may fail with error:
{quote}
All datanodes 
[DatanodeInfoWithStorage[127.0.0.1:44968,DS-acddd79e-cdf1-4ac5-aac5-e804a2e61600,DISK]]
 are bad. Aborting...
{quote}

Tracing back, the error is due to the stress applied to the host sending a 2GB 
block, causing write pipeline ack read timeout:
{quote}
2017-09-10 22:16:07,285 [DataXceiver for client 
DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] INFO  
datanode.DataNode (DataXceiver.java:writeBlock(742)) - Receiving 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001 src: 
/127.0.0.1:57794 dest: /127.0.0.1:44968
2017-09-10 22:16:50,402 [DataXceiver for client 
DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] WARN  
datanode.DataNode (BlockReceiver.java:flushOrSync(434)) - Slow flushOrSync took 
5383ms (threshold=300ms), isSync:false, flushTotalNanos=5383638982ns, 
volume=file:/tmp/tmp.1oS3ZfDCwq/src/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/
2017-09-10 22:17:54,427 [ResponseProcessor for block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001] WARN  
hdfs.DataStreamer (DataStreamer.java:run(1214)) - Exception for 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001
java.net.SocketTimeoutException: 65000 millis timeout while waiting for channel 
to be ready for read. ch : java.nio.channels.SocketChannel[connected 
local=/127.0.0.1:57794 remote=/127.0.0.1:44968]
at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at 

[jira] [Updated] (HDFS-13524) Occasional "All datanodes are bad" error in TestLargeBlock#testLargeBlockSize

2018-05-02 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-13524:
---
Description: 
TestLargeBlock#testLargeBlockSize may fail with error:
{quote}
All datanodes 
[DatanodeInfoWithStorage[127.0.0.1:44968,DS-acddd79e-cdf1-4ac5-aac5-e804a2e61600,DISK]]
 are bad. Aborting...
{quote}

Tracing back, the error is due to the stress applied to the host when sending 
a 2GB block, which causes a write pipeline ack read timeout:
{quote}
2017-09-10 22:16:07,285 [DataXceiver for client 
DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] INFO  
datanode.DataNode (DataXceiver.java:writeBlock(742)) - Receiving 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001 src: 
/127.0.0.1:57794 dest: /127.0.0.1:44968
2017-09-10 22:16:50,402 [DataXceiver for client 
DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] WARN  
datanode.DataNode (BlockReceiver.java:flushOrSync(434)) - Slow flushOrSync took 
5383ms (threshold=300ms), isSync:false, flushTotalNanos=5383638982ns, 
volume=file:/tmp/tmp.1oS3ZfDCwq/src/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/
2017-09-10 22:17:54,427 [ResponseProcessor for block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001] WARN  
hdfs.DataStreamer (DataStreamer.java:run(1214)) - Exception for 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001
java.net.SocketTimeoutException: 65000 millis timeout while waiting for channel 
to be ready for read. ch : java.nio.channels.SocketChannel[connected 
local=/127.0.0.1:57794 remote=/127.0.0.1:44968]
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
    at java.io.FilterInputStream.read(FilterInputStream.java:83)
    at java.io.FilterInputStream.read(FilterInputStream.java:83)
    at org.apache.hadoop.hdfs.protocolPB.PBHelperClient.vintPrefixed(PBHelperClient.java:434)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:213)
    at org.apache.hadoop.hdfs.DataStreamer$ResponseProcessor.run(DataStreamer.java:1104)
2017-09-10 22:17:54,432 [DataXceiver for client 
DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] INFO  
datanode.DataNode (BlockReceiver.java:receiveBlock(1000)) - Exception for 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001
java.io.IOException: Connection reset by peer
{quote}

Instead of raising read timeout, I suggest increasing cluster size from 
default=1 to 3, so that it has the opportunity to choose a different DN and 
resend.
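
As a rough illustration of the suggestion above, here is a toy Java model (not HDFS code; `PipelineModel` and `recover` are hypothetical names). It assumes the writer simply excludes the datanode whose ack timed out and looks for another one. With one datanode nothing is left to choose, which corresponds to the "All datanodes are bad" abort. With three, the write can be resent to a different DN.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: real HDFS pipeline recovery is considerably more involved.
final class PipelineModel {
    /**
     * Returns the datanode the writer would retry on after `failed` timed out,
     * or throws if no other datanode remains in the cluster.
     */
    static String recover(List<String> cluster, String failed) {
        List<String> healthy = new ArrayList<>(cluster);
        healthy.remove(failed);          // exclude the node that timed out
        if (healthy.isEmpty()) {
            throw new IllegalStateException(
                "All datanodes [" + failed + "] are bad. Aborting...");
        }
        return healthy.get(0);           // any remaining node will do here
    }
}
```

In an actual test the cluster size would be set when building the mini cluster; the model above only shows why a single-node cluster leaves no room for a retry.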

Suspect this fails after HDFS-13103, in Hadoop 2.8/3.0.0-alpha1 when we 
introduced client acknowledgement read timeout.

> Occasional "All datanodes are bad" error in TestLargeBlock#testLargeBlockSize
> -
>
> Key: HDFS-13524
> URL: https://issues.apache.org/jira/browse/HDFS-13524
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Wei-Chiu Chuang
>Priority: Major
>
> TestLargeBlock#testLargeBlockSize may fail with error:
> {quote}
> All datanodes 
> [DatanodeInfoWithStorage[127.0.0.1:44968,DS-acddd79e-cdf1-4ac5-aac5-e804a2e61600,DISK]]
>  are bad. Aborting...
> {quote}
> Tracing back, the error is due to the stress applied to the host sending a 
> 2GB block, causing write pipeline ack read timeout:
> {quote}
> 2017-09-10 22:16:07,285 [DataXceiver for client 
> DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
> BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] INFO  
> datanode.DataNode (DataXceiver.java:writeBlock(742)) - Receiving 
> BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001 src: 
> /127.0.0.1:57794 dest: /127.0.0.1:44968
> 2017-09-10 22:16:50,402 [DataXceiver for client 
> DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
> BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] WARN  
> datanode.DataNode (BlockReceiver.java:flushOrSync(434)) - Slow flushOrSync 
> took 5383ms (threshold=300ms), isSync:false, flushTotalNanos=5383638982ns, 
> volume=file:/tmp/tmp.1oS3ZfDCwq/src/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/
> 2017-09-10 22:17:54,427 [ResponseProcessor for block 
> 

[jira] [Updated] (HDFS-13524) Occasional "All datanodes are bad" error in TestLargeBlock#testLargeBlockSize

2018-05-02 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-13524:
---
Environment: (was: TestLargeBlock#testLargeBlockSize may fail with 
error:
{quote}
All datanodes 
[DatanodeInfoWithStorage[127.0.0.1:44968,DS-acddd79e-cdf1-4ac5-aac5-e804a2e61600,DISK]]
 are bad. Aborting...
{quote}

Tracing back, the error is due to the stress applied to the host sending a 2GB 
block, causing write pipeline ack read timeout:
{quote}
2017-09-10 22:16:07,285 [DataXceiver for client 
DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] INFO  
datanode.DataNode (DataXceiver.java:writeBlock(742)) - Receiving 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001 src: 
/127.0.0.1:57794 dest: /127.0.0.1:44968
2017-09-10 22:16:50,402 [DataXceiver for client 
DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] WARN  
datanode.DataNode (BlockReceiver.java:flushOrSync(434)) - Slow flushOrSync took 
5383ms (threshold=300ms), isSync:false, flushTotalNanos=5383638982ns, 
volume=file:/tmp/tmp.1oS3ZfDCwq/src/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/
2017-09-10 22:17:54,427 [ResponseProcessor for block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001] WARN  
hdfs.DataStreamer (DataStreamer.java:run(1214)) - Exception for 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001
java.net.SocketTimeoutException: 65000 millis timeout while waiting for channel 
to be ready for read. ch : java.nio.channels.SocketChannel[connected 
local=/127.0.0.1:57794 remote=/127.0.0.1:44968]
at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at 
org.apache.hadoop.hdfs.protocolPB.PBHelperClient.vintPrefixed(PBHelperClient.java:434)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:213)
at 
org.apache.hadoop.hdfs.DataStreamer$ResponseProcessor.run(DataStreamer.java:1104)
2017-09-10 22:17:54,432 [DataXceiver for client 
DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] INFO  
datanode.DataNode (BlockReceiver.java:receiveBlock(1000)) - Exception for 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001
java.io.IOException: Connection reset by peer
{quote}

Instead of raising read timeout, I suggest increasing cluster size from 
default=1 to 3, so that it has the opportunity to choose a different DN and 
resend.

Suspect this fails after HDFS-13103, in Hadoop 2.8/3.0.0-alpha1 when we 
introduced client acknowledgement read timeout.)

> Occasional "All datanodes are bad" error in TestLargeBlock#testLargeBlockSize
> -
>
> Key: HDFS-13524
> URL: https://issues.apache.org/jira/browse/HDFS-13524
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Wei-Chiu Chuang
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13524) Occasional "All datanodes are bad" error in TestLargeBlock#testLargeBlockSize

2018-05-02 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-13524:
---
Affects Version/s: 2.8.0
   3.0.0-alpha1

> Occasional "All datanodes are bad" error in TestLargeBlock#testLargeBlockSize
> -
>
> Key: HDFS-13524
> URL: https://issues.apache.org/jira/browse/HDFS-13524
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0, 3.0.0-alpha1
> Environment: TestLargeBlock#testLargeBlockSize may fail with error:
> {quote}
> All datanodes 
> [DatanodeInfoWithStorage[127.0.0.1:44968,DS-acddd79e-cdf1-4ac5-aac5-e804a2e61600,DISK]]
>  are bad. Aborting...
> {quote}
> Tracing back, the error is due to the stress applied to the host sending a 
> 2GB block, causing write pipeline ack read timeout:
> {quote}
> 2017-09-10 22:16:07,285 [DataXceiver for client 
> DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
> BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] INFO  
> datanode.DataNode (DataXceiver.java:writeBlock(742)) - Receiving 
> BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001 src: 
> /127.0.0.1:57794 dest: /127.0.0.1:44968
> 2017-09-10 22:16:50,402 [DataXceiver for client 
> DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
> BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] WARN  
> datanode.DataNode (BlockReceiver.java:flushOrSync(434)) - Slow flushOrSync 
> took 5383ms (threshold=300ms), isSync:false, flushTotalNanos=5383638982ns, 
> volume=file:/tmp/tmp.1oS3ZfDCwq/src/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/
> 2017-09-10 22:17:54,427 [ResponseProcessor for block 
> BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001] WARN  
> hdfs.DataStreamer (DataStreamer.java:run(1214)) - Exception for 
> BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001
> java.net.SocketTimeoutException: 65000 millis timeout while waiting for 
> channel to be ready for read. ch : java.nio.channels.SocketChannel[connected 
> local=/127.0.0.1:57794 remote=/127.0.0.1:44968]
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>   at 
> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
>   at 
> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
>   at 
> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
>   at java.io.FilterInputStream.read(FilterInputStream.java:83)
>   at java.io.FilterInputStream.read(FilterInputStream.java:83)
>   at 
> org.apache.hadoop.hdfs.protocolPB.PBHelperClient.vintPrefixed(PBHelperClient.java:434)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:213)
>   at 
> org.apache.hadoop.hdfs.DataStreamer$ResponseProcessor.run(DataStreamer.java:1104)
> 2017-09-10 22:17:54,432 [DataXceiver for client 
> DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
> BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] INFO  
> datanode.DataNode (BlockReceiver.java:receiveBlock(1000)) - Exception for 
> BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001
> java.io.IOException: Connection reset by peer
> {quote}
> Instead of raising read timeout, I suggest increasing cluster size from 
> default=1 to 3, so that it has the opportunity to choose a different DN and 
> resend.
> Suspect this fails after HDFS-13103, in Hadoop 2.8/3.0.0-alpha1 when we 
> introduced client acknowledgement read timeout.
>Reporter: Wei-Chiu Chuang
>Priority: Major
>







[jira] [Created] (HDFS-13524) Occasional "All datanodes are bad" error in TestLargeBlock#testLargeBlockSize

2018-05-02 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HDFS-13524:
--

 Summary: Occasional "All datanodes are bad" error in 
TestLargeBlock#testLargeBlockSize
 Key: HDFS-13524
 URL: https://issues.apache.org/jira/browse/HDFS-13524
 Project: Hadoop HDFS
  Issue Type: Bug
 Environment: TestLargeBlock#testLargeBlockSize may fail with error:
{quote}
All datanodes 
[DatanodeInfoWithStorage[127.0.0.1:44968,DS-acddd79e-cdf1-4ac5-aac5-e804a2e61600,DISK]]
 are bad. Aborting...
{quote}

Tracing back, the error is due to the stress applied to the host sending a 2GB 
block, causing write pipeline ack read timeout:
{quote}
2017-09-10 22:16:07,285 [DataXceiver for client 
DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] INFO  
datanode.DataNode (DataXceiver.java:writeBlock(742)) - Receiving 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001 src: 
/127.0.0.1:57794 dest: /127.0.0.1:44968
2017-09-10 22:16:50,402 [DataXceiver for client 
DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] WARN  
datanode.DataNode (BlockReceiver.java:flushOrSync(434)) - Slow flushOrSync took 
5383ms (threshold=300ms), isSync:false, flushTotalNanos=5383638982ns, 
volume=file:/tmp/tmp.1oS3ZfDCwq/src/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/
2017-09-10 22:17:54,427 [ResponseProcessor for block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001] WARN  
hdfs.DataStreamer (DataStreamer.java:run(1214)) - Exception for 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001
java.net.SocketTimeoutException: 65000 millis timeout while waiting for channel 
to be ready for read. ch : java.nio.channels.SocketChannel[connected 
local=/127.0.0.1:57794 remote=/127.0.0.1:44968]
at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at 
org.apache.hadoop.hdfs.protocolPB.PBHelperClient.vintPrefixed(PBHelperClient.java:434)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:213)
at 
org.apache.hadoop.hdfs.DataStreamer$ResponseProcessor.run(DataStreamer.java:1104)
2017-09-10 22:17:54,432 [DataXceiver for client 
DFSClient_NONMAPREDUCE_998779779_9 at /127.0.0.1:57794 [Receiving block 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001]] INFO  
datanode.DataNode (BlockReceiver.java:receiveBlock(1000)) - Exception for 
BP-682118952-172.26.15.143-1505106964162:blk_1073741825_1001
java.io.IOException: Connection reset by peer
{quote}

Instead of raising read timeout, I suggest increasing cluster size from 
default=1 to 3, so that it has the opportunity to choose a different DN and 
resend.

Suspect this fails after HDFS-13103, in Hadoop 2.8/3.0.0-alpha1 when we 
introduced client acknowledgement read timeout.
Reporter: Wei-Chiu Chuang









[jira] [Commented] (HDFS-13081) Datanode#checkSecureConfig should allow SASL and privileged HTTP

2018-05-02 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461812#comment-16461812
 ] 

Anu Engineer commented on HDFS-13081:
-

There are a lot of existing HDFS clusters where wildcard certs are used. :(
For example, some vendors document the use of wildcard certs. I am concerned 
that this patch does not consider that scenario, which is quite popular in the 
wild, and that it opens up many existing clusters to new security threats.


> Datanode#checkSecureConfig should allow SASL and privileged HTTP
> 
>
> Key: HDFS-13081
> URL: https://issues.apache.org/jira/browse/HDFS-13081
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, security
>Affects Versions: 3.0.0
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 3.1.0, 3.0.3
>
> Attachments: HDFS-13081.000.patch, HDFS-13081.001.patch, 
> HDFS-13081.002.patch, HDFS-13081.003.patch, HDFS-13081.004.patch, 
> HDFS-13081.005.patch, HDFS-13081.006.patch
>
>
> Datanode#checkSecureConfig currently checks the following to determine if 
> secure datanode is enabled. 
>  # The server has bound to privileged ports for RPC and HTTP via 
> SecureDataNodeStarter.
>  # The configuration enables SASL on DataTransferProtocol and HTTPS (no plain 
> HTTP) for the HTTP server. 
> Authentication of Datanode RPC server can be done either via SASL handshake 
> or JSVC/privilege RPC port. 
> This guarantees authentication of the datanode RPC server before a client 
> transmits a secret, such as a block access token. 
> Authentication of the HTTP server can also be done either via HTTPS/SSL or 
> JSVC/privileged HTTP port. This guarantees authentication of the datanode HTTP 
> server before a client transmits a secret, such as a delegation token.
> This ticket is open to allow privileged HTTP as an alternative to HTTPS to 
> work with SASL based RPC protection.
>  
> cc: [~cnauroth] , [~daryn], [~jnpandey] for additional feedback.
>  






[jira] [Commented] (HDDS-16) Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.

2018-05-02 Thread Bharat Viswanadham (JIRA)

[ 
https://issues.apache.org/jira/browse/HDDS-16?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461801#comment-16461801
 ] 

Bharat Viswanadham commented on HDDS-16:


[~msingh] I have one question: in the current Ozone code, the CHAINED replication 
type is not supported. If CHAINED is supported in the future, I think pipeline 
information might need to be sent to the datanode. The client connects to the 
leader in the pipeline, and the datanode might then need pipeline information 
internally to connect to the other datanodes and replicate. 

This is my understanding, and I am not completely sure it is correct. Please 
correct me if I am missing something here.

 

> Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.
> 
>
> Key: HDDS-16
> URL: https://issues.apache.org/jira/browse/HDDS-16
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Native, Ozone Datanode
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-16.001.patch
>
>
> The current Ozone code passes pipeline information to datanodes as well. 
> However datanodes do not use this information.
> Hence Pipeline should be removed from ozone datanode commands.






[jira] [Commented] (HDFS-13481) TestRollingFileSystemSinkWithHdfs#testFlushThread: test failed intermittently

2018-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461788#comment-16461788
 ] 

Hudson commented on HDFS-13481:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14115 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14115/])
HDFS-13481. TestRollingFileSystemSinkWithHdfs#testFlushThread: test (templedf: 
rev 87c23ef643393c39e8353ca9f495b0c8f97cdbd9)
* (edit) 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/sink/RollingFileSystemSinkTestBase.java


> TestRollingFileSystemSinkWithHdfs#testFlushThread: test failed intermittently
> -
>
> Key: HDFS-13481
> URL: https://issues.apache.org/jira/browse/HDFS-13481
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.1
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
> Fix For: 3.2.0, 3.1.1, 3.0.3
>
> Attachments: HDFS-13481.001.patch
>
>
> The test fails very rarely on a laptop, but very commonly during Jenkins runs.
> {noformat}
> Error Message
>   Flush thread did not run within 10 seconds
> Stacktrace
> java.lang.AssertionError: Flush thread did not run within 10 seconds
>   at 
> org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs.testFlushThread(TestRollingFileSystemSinkWithHdfs.java:291)
> {noformat}
> According to my tests, this fails about 0.3% of the time locally.






[jira] [Updated] (HDFS-13481) TestRollingFileSystemSinkWithHdfs#testFlushThread: test failed intermittently

2018-05-02 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated HDFS-13481:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.3
   3.1.1
   3.2.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch, [~gabor.bota].  Committed to trunk, branch-3.1 and 
branch-3.0.  I've lost track of the branch structure lately.  Anywhere else I 
need to commit it?

> TestRollingFileSystemSinkWithHdfs#testFlushThread: test failed intermittently
> -
>
> Key: HDFS-13481
> URL: https://issues.apache.org/jira/browse/HDFS-13481
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.1
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
> Fix For: 3.2.0, 3.1.1, 3.0.3
>
> Attachments: HDFS-13481.001.patch
>
>
> The test fails very rarely on a laptop, but very commonly during Jenkins runs.
> {noformat}
> Error Message
>   Flush thread did not run within 10 seconds
> Stacktrace
> java.lang.AssertionError: Flush thread did not run within 10 seconds
>   at 
> org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs.testFlushThread(TestRollingFileSystemSinkWithHdfs.java:291)
> {noformat}
> According to my tests, this fails about 0.3% of the time locally.






[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation

2018-05-02 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461766#comment-16461766
 ] 

Erik Krogen commented on HDFS-13522:


Thanks for your helpful comments [~elgoiri]!
{quote}
It would be handy to have a setup of the MiniDFSCluster that sets OBSERVER NNs 
automatically and to have some predefined contract that makes sure that requests 
are going to the OBSERVER. Is there something along those lines already 
available in HDFS-12943?
{quote}
Not yet but we should definitely add that. Created HDFS-13523

Just FYI, no one is planning on working on this ticket soon; I just created it to 
make sure it is not forgotten.

> Support observer node from Router-Based Federation
> --
>
> Key: HDFS-13522
> URL: https://issues.apache.org/jira/browse/HDFS-13522
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation, namenode
>Reporter: Erik Krogen
>Priority: Major
>
> Changes will need to occur to the router to support the new observer node.
> One such change will be to make the router understand the observer state, 
> e.g. {{FederationNamenodeServiceState}}.






[jira] [Created] (HDFS-13523) Support observer nodes in MiniDFSCluster

2018-05-02 Thread Erik Krogen (JIRA)
Erik Krogen created HDFS-13523:
--

 Summary: Support observer nodes in MiniDFSCluster
 Key: HDFS-13523
 URL: https://issues.apache.org/jira/browse/HDFS-13523
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode, test
Reporter: Erik Krogen


MiniDFSCluster should support Observer nodes so that we can write decent 
integration tests.






[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation

2018-05-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461763#comment-16461763
 ] 

Íñigo Goiri commented on HDFS-13522:


There are two main changes to do:
* Collect the OBSERVER state in {{NamenodeHeartbeatService}} and store it in 
the {{MembershipStore}} through the {{FederationNamenodeServiceState}}
* Allow the {{RouterRpcClient}} to pick OBSERVER NNs to perform the operations.

Both changes should be fairly easy to add.
It would be handy to have a setup of the MiniDFSCluster that sets OBSERVER NNs 
automatically and to have some predefined contract that makes sure that requests 
are going to the OBSERVER. Is there something along those lines already 
available in HDFS-12943?
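
A minimal sketch of the second change, under stated assumptions: the enum below is a local stand-in for {{FederationNamenodeServiceState}} with an OBSERVER value added, and `ObserverRouting`, `Namenode`, `rank` and `order` are hypothetical names. The ordering policy (reads try OBSERVER first, writes try ACTIVE first) is an assumption for illustration, not what {{RouterRpcClient}} does today.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ObserverRouting {
    // Stand-in for FederationNamenodeServiceState, with OBSERVER added.
    enum State { ACTIVE, OBSERVER, STANDBY, UNAVAILABLE }

    static final class Namenode {
        final String id;
        final State state;
        Namenode(String id, State state) { this.id = id; this.state = state; }
    }

    // Lower rank is tried first; the policy itself is an assumption.
    static int rank(State s, boolean isRead) {
        switch (s) {
            case OBSERVER: return isRead ? 0 : 2;
            case ACTIVE:   return isRead ? 1 : 0;
            case STANDBY:  return isRead ? 2 : 1;
            default:       return 3;
        }
    }

    /** Returns namenode ids in the order a router could try them. */
    static List<String> order(List<Namenode> nns, boolean isRead) {
        List<Namenode> sorted = new ArrayList<>(nns);
        sorted.sort(Comparator.comparingInt((Namenode nn) -> rank(nn.state, isRead)));
        List<String> ids = new ArrayList<>();
        for (Namenode nn : sorted) {
            ids.add(nn.id);
        }
        return ids;
    }
}
```

The state recorded per namenode here would come from the heartbeat change in the first bullet; without it the router has nothing to rank on.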

> Support observer node from Router-Based Federation
> --
>
> Key: HDFS-13522
> URL: https://issues.apache.org/jira/browse/HDFS-13522
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation, namenode
>Reporter: Erik Krogen
>Priority: Major
>
> Changes will need to occur to the router to support the new observer node.
> One such change will be to make the router understand the observer state, 
> e.g. {{FederationNamenodeServiceState}}.






[jira] [Commented] (HDFS-13286) Add haadmin commands to transition between standby and observer

2018-05-02 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461761#comment-16461761
 ] 

Chao Sun commented on HDFS-13286:
-

Yes that makes sense. Will update the error message.

> Add haadmin commands to transition between standby and observer
> ---
>
> Key: HDFS-13286
> URL: https://issues.apache.org/jira/browse/HDFS-13286
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13286-HDFS-12943.000.patch, 
> HDFS-13286-HDFS-12943.001.patch, HDFS-13286-HDFS-12943.002.patch, 
> HDFS-13286-HDFS-12943.003.patch, HDFS-13286-HDFS-12943.004.patch
>
>
> As discussed in HDFS-12975, we should allow explicit transition between 
> standby and observer through haadmin command, such as:
> {code}
> haadmin -transitionToObserver
> {code}
> Initially we should support transition from observer to standby, and standby 
> to observer.






[jira] [Commented] (HDFS-13286) Add haadmin commands to transition between standby and observer

2018-05-02 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461756#comment-16461756
 ] 

Erik Krogen commented on HDFS-13286:


For {{FailoverController#preFailoverChecks()}}, I agree with your logic, but we 
need to update the error message here:
{code}
if (!toSvcStatus.getState().equals(HAServiceState.STANDBY)) {
  throw new FailoverFailedException(
  "Can't failover to an active service");
}
{code}
Currently if you tried to failover to OBSERVER, you would get the message that 
you can't failover to an active service. We should update to "Can't failover to 
a service not in standby state" or something like that.
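
For illustration, a self-contained sketch of the check with the reworded message. The enum is a local stand-in for {{HAServiceState}} with OBSERVER added, `FailoverCheck` and `checkTarget` are hypothetical names, and `IllegalStateException` is used only so the snippet compiles without Hadoop on the classpath (the real code throws {{FailoverFailedException}}).

```java
final class FailoverCheck {
    // Local stand-in for HAServiceState, with the OBSERVER state added.
    enum HAServiceState { ACTIVE, STANDBY, OBSERVER }

    /** Rejects any failover target that is not in STANDBY, whatever state it is in. */
    static void checkTarget(HAServiceState target) {
        if (target != HAServiceState.STANDBY) {
            throw new IllegalStateException(
                "Can't failover to a service not in standby state (was " + target + ")");
        }
    }
}
```

Including the actual state in the message is an extra, not something proposed above; it just makes the ACTIVE and OBSERVER cases distinguishable in logs.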

Agreed with leaving {{FederationNamenodeServiceState}} for a follow-on JIRA 
targeted at RBF support. Created HDFS-13522.

> Add haadmin commands to transition between standby and observer
> ---
>
> Key: HDFS-13286
> URL: https://issues.apache.org/jira/browse/HDFS-13286
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13286-HDFS-12943.000.patch, 
> HDFS-13286-HDFS-12943.001.patch, HDFS-13286-HDFS-12943.002.patch, 
> HDFS-13286-HDFS-12943.003.patch, HDFS-13286-HDFS-12943.004.patch
>
>
> As discussed in HDFS-12975, we should allow explicit transition between 
> standby and observer through haadmin command, such as:
> {code}
> haadmin -transitionToObserver
> {code}
> Initially we should support transition from observer to standby, and standby 
> to observer.






[jira] [Created] (HDFS-13522) Support observer node from Router-Based Federation

2018-05-02 Thread Erik Krogen (JIRA)
Erik Krogen created HDFS-13522:
--

 Summary: Support observer node from Router-Based Federation
 Key: HDFS-13522
 URL: https://issues.apache.org/jira/browse/HDFS-13522
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: federation, namenode
Reporter: Erik Krogen


Changes will need to occur to the router to support the new observer node.

One such change will be to make the router understand the observer state, e.g. 
{{FederationNamenodeServiceState}}.






[jira] [Commented] (HDFS-13286) Add haadmin commands to transition between standby and observer

2018-05-02 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461691#comment-16461691
 ] 

Chao Sun commented on HDFS-13286:
-

[~xkrogen] Thanks for the great comments!
{quote}Take a look at BPServiceActor L905-909. I haven't looked at HDFS-9917 
yet, but judging from the comment it seems we probably want to change this 
if-condition to state == HAServiceState.STANDBY || state == 
HAServiceState.OBSERVER. Can you see if you agree?
{quote}
Yes, I think this also applies to OBSERVER. Will fix.
{quote}The constructor of StandbyState needs to be modified to super(isObserver 
? HAServiceState.OBSERVER : HAServiceState.STANDBY); instead of just always 
passing STANDBY, else things which use HAState#getServiceState() will be wrong 
(see for example NameNode#getServiceStatus()).
{quote}
Good catch! Fixed.
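
A stripped-down sketch of that constructor change, assuming simplified stand-ins for the real classes ({{HAState}} and {{StandbyState}} here carry only the service state, and the outer class `HaStateSketch` is a hypothetical wrapper):

```java
final class HaStateSketch {
    // Local stand-in for HAServiceState, with OBSERVER added.
    enum HAServiceState { ACTIVE, STANDBY, OBSERVER }

    // Simplified HAState: remembers the service state it was constructed with.
    static class HAState {
        private final HAServiceState state;
        HAState(HAServiceState state) { this.state = state; }
        HAServiceState getServiceState() { return state; }
    }

    // StandbyState reports OBSERVER or STANDBY depending on the flag,
    // instead of always passing STANDBY to the superclass.
    static class StandbyState extends HAState {
        StandbyState(boolean isObserver) {
            super(isObserver ? HAServiceState.OBSERVER : HAServiceState.STANDBY);
        }
    }
}
```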
{quote}It looks like we also need to update the enums in 
FederationNamenodeServiceState and NNHAStatusHeartbeatProto, and their 
associated usages
{quote}
Yes. I'm going to just use {{FederationNamenodeServiceState.STANDBY}} for 
{{HAServiceState.OBSERVER}} for now. Supporting Observer in RBF will be done in 
a separate JIRA. For {{NNHAStatusHeartbeatProto}}, I found that I need to add an 
{{OBSERVER}} state.
{quote}We should be able to remove the TODO on NameNode L1866?
{quote}
Done.
{quote}I think this section of FailoverController#preFailoverChecks() may need 
some work: ... It seems the first if-condition is assuming there are only two 
possible states, so if the state is not STANDBY, it must be ACTIVE. I think we 
should update this to explicitly check for ACTIVE. Next, if the service is in 
OBSERVER state, isReadyToBecomeActive() will be false. In this case, 
FailoverController#preFailoverChecks() will still allow this operation if 
forceActive is true. I don't think we want to allow forceActive to attempt to 
failover an observer, right?
{quote}
Hmm... For the first if-condition, an exception will be thrown if the target 
service is **not** STANDBY. This should be good for the OBSERVER case, right (we 
don't want the failover target to be OBSERVER)? If this is the case, then the 
following checks only apply to STANDBY, so nothing needs to change.
{quote}For all three usages of FSNamesystem#isInStandbyState(), it actually 
seems to me that they should apply if it is in observer or standby state, can 
you double check and if so update accordingly?
{quote}
Yes you are right. Will change that to include OBSERVER too.
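
One way to keep these call sites consistent is a single predicate that treats both states alike. A minimal sketch, where `StandbyLike` and `isStandbyOrObserver` are hypothetical names and the enum is a local stand-in for {{HAServiceState}}:

```java
final class StandbyLike {
    // Local stand-in for HAServiceState, with OBSERVER added.
    enum HAServiceState { ACTIVE, STANDBY, OBSERVER }

    /** True for both non-active serving states, false for ACTIVE. */
    static boolean isStandbyOrObserver(HAServiceState state) {
        return state == HAServiceState.STANDBY || state == HAServiceState.OBSERVER;
    }
}
```

The same condition is what the BPServiceActor if-statement mentioned earlier would check.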

Thanks again! It seems I missed a lot of places where 
{{HAServiceState.STANDBY}} is used... :-/ 
Will double check that and submit another patch.

> Add haadmin commands to transition between standby and observer
> ---
>
> Key: HDFS-13286
> URL: https://issues.apache.org/jira/browse/HDFS-13286
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13286-HDFS-12943.000.patch, 
> HDFS-13286-HDFS-12943.001.patch, HDFS-13286-HDFS-12943.002.patch, 
> HDFS-13286-HDFS-12943.003.patch, HDFS-13286-HDFS-12943.004.patch
>
>
> As discussed in HDFS-12975, we should allow explicit transition between 
> standby and observer through haadmin command, such as:
> {code}
> haadmin -transitionToObserver
> {code}
> Initially we should support transition from observer to standby, and standby 
> to observer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13429) libhdfs++ Expose a C++ logging API

2018-05-02 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461684#comment-16461684
 ] 

genericqa commented on HDFS-13429:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m  0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  7m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 59m 24s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 15m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  8m  1s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 30s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 26m 56s{color} | {color:green} hadoop-hdfs-native-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}130m 26s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HDFS-13429 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12921613/HDFS-13429.000.patch |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 1282120f8be0 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6b63a0a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/24127/artifact/out/whitespace-eol.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/24127/testReport/ |
| Max. process+thread count | 340 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: hadoop-hdfs-project/hadoop-hdfs-native-client |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/24127/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> libhdfs++ Expose a C++ logging API
> --
>
> Key: HDFS-13429
> URL: https://issues.apache.org/jira/browse/HDFS-13429
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
>Priority: Critical
> Attachments: HDFS-13429.000.patch
>
>
> The libhdfs++ C API supports taking function pointers for log plugins and 
> defines levels for components and severity.  

[jira] [Resolved] (HDFS-6589) TestDistributedFileSystem.testAllWithNoXmlDefaults failed intermittently

2018-05-02 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-6589.
---
Resolution: Cannot Reproduce

Resolving as Cannot Reproduce. The last time I saw this bug was 2 years ago. 
Most likely it was a real bug that was fixed later.

> TestDistributedFileSystem.testAllWithNoXmlDefaults failed intermittently
> 
>
> Key: HDFS-6589
> URL: https://issues.apache.org/jira/browse/HDFS-6589
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.5.0
>Reporter: Yongjun Zhang
>Assignee: Wei-Chiu Chuang
>Priority: Major
>  Labels: flaky-test
>
> https://builds.apache.org/job/PreCommit-HDFS-Build/7207 is clean
> https://builds.apache.org/job/PreCommit-HDFS-Build/7208 has the following 
> failure. The code is essentially the same.
> Running the same test locally doesn't reproduce. A flaky test there.
> {code}
> Stacktrace
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertFalse(Assert.java:64)
>   at org.junit.Assert.assertFalse(Assert.java:74)
>   at org.apache.hadoop.hdfs.TestDistributedFileSystem.testDFSClient(TestDistributedFileSystem.java:263)
>   at org.apache.hadoop.hdfs.TestDistributedFileSystem.testAllWithNoXmlDefaults(TestDistributedFileSystem.java:651)
> {code}






[jira] [Created] (HDFS-13521) NFS Gateway should support impersonation

2018-05-02 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HDFS-13521:
--

 Summary: NFS Gateway should support impersonation
 Key: HDFS-13521
 URL: https://issues.apache.org/jira/browse/HDFS-13521
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Wei-Chiu Chuang


Similar to HDFS-10481, NFS gateway and httpfs are independent processes that 
accept client connections.
NFS Gateway currently solves the file permission/ownership problem by running 
as the HDFS super user and then calling setOwner() to change the file owner.

This is not desirable:
# It adds additional RPC load to the NameNode.
# It does not support at-rest encryption because, by design, the HDFS super 
user cannot access KMS.

This is yet another problem around KMS ACLs. [~xiaochen] [~rushabh.shah] 
thoughts?






[jira] [Commented] (HDDS-6) Enable SCM kerberos auth

2018-05-02 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDDS-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461581#comment-16461581
 ] 

genericqa commented on HDDS-6:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 36s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m 12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 52s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m  6s{color} | {color:red} hadoop-hdds/common in trunk has 1 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 48s{color} | {color:red} hadoop-hdds/server-scm in trunk has 1 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 58s{color} | {color:red} hadoop-ozone/common in trunk has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 27m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 29s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} shellcheck {color} | {color:red}  0m  0s{color} | {color:red} The patch generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green}  0m 36s{color} | {color:green} There were no new shelldocs issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  9m 18s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 14s{color} | {color:green} common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 38s{color} | {color:green} 

[jira] [Commented] (HDDS-15) Add memory profiler support to Genesis

2018-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDDS-15?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461579#comment-16461579
 ] 

Hudson commented on HDDS-15:


SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14114 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14114/])
HDDS-15. Add memory profiler support to Genesis. Contributed by Anu (aengineer: 
rev 6b63a0af9b29c231166d9af50d499a246cbbb755)
* (edit) 
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/genesis/Genesis.java
* (add) 
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/genesis/GenesisMemoryProfiler.java


> Add memory profiler support to Genesis
> --
>
> Key: HDDS-15
> URL: https://issues.apache.org/jira/browse/HDDS-15
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Tools
>Affects Versions: 0.2.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-15.001.patch, HDDS-15.002.patch, HDDS-15.003.patch
>
>
> Add the ability to sample max memory usage when running tests under Genesis.






[jira] [Updated] (HDFS-13429) libhdfs++ Expose a C++ logging API

2018-05-02 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-13429:
---
Status: Patch Available  (was: Open)

> libhdfs++ Expose a C++ logging API
> --
>
> Key: HDFS-13429
> URL: https://issues.apache.org/jira/browse/HDFS-13429
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
>Priority: Critical
> Attachments: HDFS-13429.000.patch
>
>
> The libhdfs++ C API supports taking function pointers for log plugins and 
> defines levels for components and severity.  It'd be nice to have an 
> idiomatic C++ logging interface.






[jira] [Commented] (HDFS-13399) Make Client field AlignmentContext non-static.

2018-05-02 Thread Plamen Jeliazkov (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461569#comment-16461569
 ] 

Plamen Jeliazkov commented on HDFS-13399:
-

Hey Konstantin,

Alright, I think the overall plan is mostly cleared up now. Just a couple of 
questions / nits:

Regarding (1), I also see you left {{DFSClient.getAlignmentContext}} in. Should 
we just remove it? I don't believe there is a path from the {{ProxyProvider}} 
back to the {{DFSClient}}.

Also please take a look at the comment I made here (just above): 
https://issues.apache.org/jira/browse/HDFS-13399?focusedCommentId=16454623&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16454623

We can discuss this in detail next meeting but there is a serious concern 
regarding the {{FSEditLogAsync}} and sending the client state over the RPC. My 
suspicion is we need to make additional changes to {{FSEditLogAsync...}}

> Make Client field AlignmentContext non-static.
> --
>
> Key: HDFS-13399
> URL: https://issues.apache.org/jira/browse/HDFS-13399
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-12943
>Reporter: Plamen Jeliazkov
>Assignee: Plamen Jeliazkov
>Priority: Major
> Attachments: HDFS-13399-HDFS-12943.000.patch, 
> HDFS-13399-HDFS-12943.001.patch, HDFS-13399-HDFS-12943.002.patch, 
> HDFS-13399-HDFS-12943.003.patch, HDFS-13399-HDFS-12943.004.patch, 
> HDFS-13399-HDFS-12943.005.patch, HDFS-13399-HDFS-12943.006.patch
>
>
> In HDFS-12977, DFSClient's constructor was altered to make use of a new 
> static method in Client that allowed one to set an AlignmentContext. This 
> work is to remove that static field and make each DFSClient pass it's 
> AlignmentContext down to the proxy Call level.






[jira] [Updated] (HDDS-15) Add memory profiler support to Genesis

2018-05-02 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDDS-15?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-15:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

[~xyao] Thanks for the reviews. I have committed this to trunk.

> Add memory profiler support to Genesis
> --
>
> Key: HDDS-15
> URL: https://issues.apache.org/jira/browse/HDDS-15
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Tools
>Affects Versions: 0.2.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-15.001.patch, HDDS-15.002.patch, HDDS-15.003.patch
>
>
> Add the ability to sample max memory usage when running tests under Genesis.






[jira] [Updated] (HDFS-13520) fuse_dfs to support keytab based login

2018-05-02 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-13520:
---
Description: 
It looks like the current fuse_dfs implementation supports login using the 
current Kerberos credential. If the TGT expires, it fails with the following error:
{noformat}
hdfsBuilderConnect(forceNewInstance=1, nn=hdfs://ns1, port=0, kerbTicketCachePath=/tmp/krb5cc_2000, userName=systest) error:
LoginException: Unable to obtain Principal Name for authentication 
org.apache.hadoop.security.KerberosAuthException: failure to login: for user: systest using ticket cache file: /tmp/krb5cc_2000 javax.security.auth.login.LoginException: Unable to obtain Principal Name for authentication
at org.apache.hadoop.security.UserGroupInformation.getUGIFromTicketCache(UserGroupInformation.java:807)
at org.apache.hadoop.security.UserGroupInformation.getBestUGI(UserGroupInformation.java:742)
at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:404)
Caused by: javax.security.auth.login.LoginException: Unable to obtain Principal Name for authentication
at com.sun.security.auth.module.Krb5LoginModule.promptForName(Krb5LoginModule.java:841)
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:704)
at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
at org.apache.hadoop.security.UserGroupInformation.getUGIFromTicketCache(UserGroupInformation.java:788)
... 2 more

{noformat}
This is easily reproducible in a test cluster with an extremely short ticket 
lifetime (e.g. 1 minute).

Note: HDFS-3608 addresses a similar issue, but in this case, since the ticket 
cache file itself does not change, fuse couldn't detect & update.

It looks like it should call UserGroupInformation#loginFromKeytab() in the 
beginning, similar to how balancer supports keytab based login (HDFS-9804). 
Thanks [~xiaochen] for the idea.

A quick workaround would be a crontab job that periodically renews the 
Kerberos ticket with a keytab, say every 8 hours.

  was:
It looks like the current fuse_dfs implementation supports login using current 
kerberos credential. If the tgt expires, it fails with the following error:
{noformat}
hdfsBuilderConnect(forceNewInstance=1, nn=hdfs://ns1, port=0, 
kerbTicketCachePath=/tmp/krb5cc_2000, userName=systest) error:
LoginException: Unable to obtain Principal Name for authentication 
org.apache.hadoop.security.KerberosAuthException: failure to login: for user: 
systest using ticket cache file: /tmp/krb5cc_2000 
javax.security.auth.login.LoginException: Unable to obtain Principal Name for 
authentication
at 
org.apache.hadoop.security.UserGroupInformation.getUGIFromTicketCache(UserGroupInformation.java:807)
at 
org.apache.hadoop.security.UserGroupInformation.getBestUGI(UserGroupInformation.java:742)
at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:404)
Caused by: javax.security.auth.login.LoginException: Unable to obtain Principal 
Name for authentication
at 
com.sun.security.auth.module.Krb5LoginModule.promptForName(Krb5LoginModule.java:841)
at 
com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:704)
at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
at 

[jira] [Created] (HDFS-13520) fuse_dfs to support keytab based login

2018-05-02 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HDFS-13520:
--

 Summary: fuse_dfs to support keytab based login
 Key: HDFS-13520
 URL: https://issues.apache.org/jira/browse/HDFS-13520
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.6.0
 Environment: Hadoop 2.6/3.0, Kerberized, fuse_dfs
Reporter: Wei-Chiu Chuang


It looks like the current fuse_dfs implementation supports login using the 
current Kerberos credential. If the TGT expires, it fails with the following error:
{noformat}
hdfsBuilderConnect(forceNewInstance=1, nn=hdfs://ns1, port=0, kerbTicketCachePath=/tmp/krb5cc_2000, userName=systest) error:
LoginException: Unable to obtain Principal Name for authentication 
org.apache.hadoop.security.KerberosAuthException: failure to login: for user: systest using ticket cache file: /tmp/krb5cc_2000 javax.security.auth.login.LoginException: Unable to obtain Principal Name for authentication
at org.apache.hadoop.security.UserGroupInformation.getUGIFromTicketCache(UserGroupInformation.java:807)
at org.apache.hadoop.security.UserGroupInformation.getBestUGI(UserGroupInformation.java:742)
at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:404)
Caused by: javax.security.auth.login.LoginException: Unable to obtain Principal Name for authentication
at com.sun.security.auth.module.Krb5LoginModule.promptForName(Krb5LoginModule.java:841)
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:704)
at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
at org.apache.hadoop.security.UserGroupInformation.getUGIFromTicketCache(UserGroupInformation.java:788)
... 2 more

{noformat}
This is easily reproducible in a test cluster with an extremely short ticket 
lifetime (e.g. 1 minute).

Note: HDFS-3608 addresses a similar issue, but in this case, since the ticket 
cache file itself does not change, fuse couldn't detect & update.

It looks like it should call UserGroupInformation#loginFromKeytab() in the 
beginning, similar to how balancer supports keytab based login (HDFS-9804). 
Thanks [~xiaochen] for the idea.

Alternatively, have a background process that continuously re-logs in from the 
keytab.
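
As a rough illustration of the background-relogin idea, the sketch below schedules a caller-supplied action at a fixed rate on a daemon thread. The class name, the {{start}} method, and the {{relogin}} callback are all hypothetical; the actual Hadoop relogin call is deliberately left out, and fuse_dfs itself is native code, so this only shows the scheduling pattern.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Illustrative only: a periodic task runner, not part of fuse_dfs or Hadoop. */
final class PeriodicReloginSketch {
  private PeriodicReloginSketch() {}

  /**
   * Runs the given relogin action immediately and then once per period.
   * The caller decides what "relogin" means (assumed to wrap a keytab login).
   */
  static ScheduledExecutorService start(Runnable relogin, long period,
      TimeUnit unit) {
    ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
          Thread t = new Thread(r, "keytab-relogin");
          t.setDaemon(true); // do not keep the process alive for this thread
          return t;
        });
    scheduler.scheduleAtFixedRate(() -> {
      try {
        relogin.run();
      } catch (RuntimeException e) {
        // An uncaught exception would cancel all future runs, so log and
        // let the next scheduled attempt retry.
        System.err.println("relogin failed, will retry: " + e);
      }
    }, 0, period, unit);
    return scheduler;
  }
}
```

The try/catch matters here: scheduleAtFixedRate suppresses subsequent executions once a run throws, which would silently stop the renewals.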






[jira] [Commented] (HDFS-13481) TestRollingFileSystemSinkWithHdfs#testFlushThread: test failed intermittently

2018-05-02 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461418#comment-16461418
 ] 

Daniel Templeton commented on HDFS-13481:
-

LGTM +1.  If there are no other comments by end of day, I'll commit it.

> TestRollingFileSystemSinkWithHdfs#testFlushThread: test failed intermittently
> -
>
> Key: HDFS-13481
> URL: https://issues.apache.org/jira/browse/HDFS-13481
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.1
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
> Attachments: HDFS-13481.001.patch
>
>
> The test fails very rarely on a laptop, but very commonly during Jenkins runs.
> {noformat}
> Error Message
>   Flush thread did not run within 10 seconds
> Stacktrace
> java.lang.AssertionError: Flush thread did not run within 10 seconds
>   at 
> org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs.testFlushThread(TestRollingFileSystemSinkWithHdfs.java:291)
> {noformat}
> According to my tests, this fails in about 0.3% of runs locally.






[jira] [Commented] (HDFS-13429) libhdfs++ Expose a C++ logging API

2018-05-02 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461399#comment-16461399
 ] 

James Clampffer commented on HDFS-13429:


First patch ready for review.

Comments:
-The logger was already done in C++ but the interface didn't make it into the 
include/hdfspp directory.  Most of this patch is just making the existing code 
consistent so it could be part of the public API.
-C logging API that was in hdfspp/log.h was moved into hdfspp/hdfs_ext.h with 
the rest of the C API.  This was done to avoid confusion over which headers 
were pure C and which used C++.  The C++ logging API now lives in hdfspp/log.h.
-Added a new test to make sure constants exposed in the public API aren't 
accidentally changed.

> libhdfs++ Expose a C++ logging API
> --
>
> Key: HDFS-13429
> URL: https://issues.apache.org/jira/browse/HDFS-13429
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
>Priority: Critical
> Attachments: HDFS-13429.000.patch
>
>
> The libhdfs++ C API supports taking function pointers for log plugins and 
> defines levels for components and severity.  It'd be nice to have an 
> idiomatic C++ logging interface.






[jira] [Updated] (HDFS-13429) libhdfs++ Expose a C++ logging API

2018-05-02 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-13429:
---
Attachment: HDFS-13429.000.patch

> libhdfs++ Expose a C++ logging API
> --
>
> Key: HDFS-13429
> URL: https://issues.apache.org/jira/browse/HDFS-13429
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
>Priority: Critical
> Attachments: HDFS-13429.000.patch
>
>
> The libhdfs++ C API supports taking function pointers for log plugins and 
> defines levels for components and severity.  It'd be nice to have an 
> idiomatic C++ logging interface.






[jira] [Commented] (HDFS-13443) RBF: Update mount table cache immediately after changing (add/update/remove) mount table entries.

2018-05-02 Thread Íñigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461357#comment-16461357
 ] 

Íñigo Goiri commented on HDFS-13443:


Other than the checkstyle issues, this looks good.
The unit test failures seem unrelated.
A couple of minor nits:
* You can do {{TimeUnit.MINUTES.toMillis(5)}}, and the same for the 1 minute one; 
then you can just set the hdfs-default values to 1m and 5m.
* {{MountTableRefreshThread}} has a logging statement that concatenates with + 
and could use the {} placeholder instead.
* To increase succesCount you can just use ++.
Once those are fixed, +1.

I'd like somebody else to review too to make sure I'm not missing anything.
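
For reference, a small sketch of the first and third nits. The constant and method names are made up for illustration and are not the ones in the patch; the {} logging placeholder is only shown in a comment because it assumes an SLF4J-style logger, which this standalone snippet does not pull in.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative only: hypothetical names, not taken from the patch. */
final class ReviewNitsSketch {
  // Spelling the defaults with TimeUnit keeps the unit visible at a glance.
  static final long REFRESH_TIMEOUT_DEFAULT_MS = TimeUnit.MINUTES.toMillis(1);
  static final long CLIENT_MAX_TIME_DEFAULT_MS = TimeUnit.MINUTES.toMillis(5);

  private ReviewNitsSketch() {}

  /** Counts successful refresh results using ++ rather than x = x + 1. */
  static int countSuccesses(boolean[] results) {
    int successCount = 0;
    for (boolean ok : results) {
      if (ok) {
        successCount++;
      }
    }
    // With an SLF4J-style logger (assumed), the message would use a
    // placeholder instead of string concatenation, e.g.:
    //   LOG.info("Refreshed {} of {} routers", successCount, results.length);
    return successCount;
  }

  public static void main(String[] args) {
    System.out.println(REFRESH_TIMEOUT_DEFAULT_MS + " ms, "
        + CLIENT_MAX_TIME_DEFAULT_MS + " ms");
  }
}
```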

> RBF: Update mount table cache immediately after changing (add/update/remove) 
> mount table entries.
> -
>
> Key: HDFS-13443
> URL: https://issues.apache.org/jira/browse/HDFS-13443
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Mohammad Arshad
>Assignee: Mohammad Arshad
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-13443-branch-2.001.patch, 
> HDFS-13443-branch-2.002.patch, HDFS-13443.001.patch, HDFS-13443.002.patch, 
> HDFS-13443.003.patch, HDFS-13443.004.patch, HDFS-13443.005.patch, 
> HDFS-13443.006.patch
>
>
> Currently the mount table cache is updated periodically; by default the cache 
> is updated every minute. After a change to the mount table, user operations 
> may still use the old mount table, which is wrong.
> To update the mount table cache, maybe we can do the following:
>  * *Add a refresh API in MountTableManager which will update the mount table cache.*
>  * *When there is a change in mount table entries, the router admin server can 
> update its cache and ask other routers to update their caches*. For example, if 
> there are three routers R1, R2, R3 in a cluster, then the add mount table entry 
> API, at the admin server side, will perform the following sequence of actions:
>  ## user submits an add mount table entry request on R1
>  ## R1 adds the mount table entry in the state store
>  ## R1 calls the refresh API on R2
>  ## R1 calls the refresh API on R3
>  ## R1 directly refreshes its own cache
>  ## The add mount table entry response is sent back to the user.
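
The sequence proposed in the description could be sketched as follows. This is an illustration under stated assumptions: {{StateStore}}, {{MountTableCache}}, {{RouterAdminClient}}, and {{MountTableAdminSketch}} are hypothetical names invented here, not the actual RBF classes, and tolerating a failed remote refresh is an assumed policy rather than something the issue specifies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interfaces, invented for this sketch; not the real RBF API.
interface StateStore { void addEntry(String src, String dest); }
interface MountTableCache { void refresh(); }
interface RouterAdminClient { boolean refreshMountTableCache(); }

final class MountTableAdminSketch {
  private final StateStore store;
  private final MountTableCache localCache;
  private final List<RouterAdminClient> otherRouters;

  MountTableAdminSketch(StateStore store, MountTableCache localCache,
      List<RouterAdminClient> otherRouters) {
    this.store = store;
    this.localCache = localCache;
    this.otherRouters = new ArrayList<>(otherRouters);
  }

  /** Returns the number of remote routers that confirmed the refresh. */
  int addMountTableEntry(String src, String dest) {
    store.addEntry(src, dest);                  // persist in the state store
    int refreshed = 0;
    for (RouterAdminClient router : otherRouters) {
      try {
        if (router.refreshMountTableCache()) {  // ask R2, R3, ... to refresh
          refreshed++;
        }
      } catch (RuntimeException e) {
        // Assumed policy: a failed remote refresh is tolerated, since that
        // router still has its periodic cache update to fall back on.
      }
    }
    localCache.refresh();                       // refresh R1's own cache
    return refreshed;                           // then respond to the user
  }
}
```

Refreshing the local cache last follows the order given in the description; whether the remote calls should run sequentially or in parallel is left open here.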






[jira] [Updated] (HDDS-10) docker changes to test secure ozone cluster

2018-05-02 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDDS-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-10:
---
Summary: docker changes to test secure ozone cluster  (was: Create 
SecureMiniOzoneCluster to facilitate security related test cases in ozone)

> docker changes to test secure ozone cluster
> ---
>
> Key: HDDS-10
> URL: https://issues.apache.org/jira/browse/HDDS-10
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.3.0
>
>
> Create SecureMiniOzoneCluster to facilitate security related test cases in 
> ozone






[jira] [Updated] (HDDS-10) docker changes to test secure ozone cluster

2018-05-02 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDDS-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-10:
---
Description: Update docker compose and settings to test secure ozone 
cluster.  (was: Create SecureMiniOzoneCluster to facilitate security related 
test cases in ozone)

> docker changes to test secure ozone cluster
> ---
>
> Key: HDDS-10
> URL: https://issues.apache.org/jira/browse/HDDS-10
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.3.0
>
>
> Update docker compose and settings to test secure ozone cluster.






[jira] [Comment Edited] (HDFS-13286) Add haadmin commands to transition between standby and observer

2018-05-02 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461275#comment-16461275
 ] 

Erik Krogen edited comment on HDFS-13286 at 5/2/18 5:28 PM:


Hey [~csun], sorry for taking a while to get back to you on this. Your new 
{{HAServiceState}} changes inspired me to do a more comprehensive look at how 
those states are used and I noticed a few things:
* Take a look at {{BPServiceActor}} L905-909. I haven't looked at HDFS-9917 
yet, but judging from the comment it seems we probably want to change this 
if-condition to {{state == HAServiceState.STANDBY || state == 
HAServiceState.OBSERVER}}. Can you see if you agree?
* The constructor of {{StandbyState}} needs to be modified to 
{{super(isObserver ? HAServiceState.OBSERVER : HAServiceState.STANDBY);}} 
instead of just always passing {{STANDBY}}, else things which use 
{{HAState#getServiceState()}} will be wrong (see for example 
{{NameNode#getServiceStatus()}}).
* It looks like we also need to update the enums in 
{{FederationNamenodeServiceState}} and {{NNHAStatusHeartbeatProto}}, and their 
associated usages
* We should be able to remove the TODO on {{NameNode}} L1866?
* I think this section of {{FailoverController#preFailoverChecks()}} may need 
some work:
{code}
if (!toSvcStatus.getState().equals(HAServiceState.STANDBY)) {
  throw new FailoverFailedException(
      "Can't failover to an active service");
}

if (!toSvcStatus.isReadyToBecomeActive()) {
  String notReadyReason = toSvcStatus.getNotReadyReason();
  if (!forceActive) {
    throw new FailoverFailedException(
        target + " is not ready to become active: " +
        notReadyReason);
  } else {
    LOG.warn("Service is not ready to become active, but forcing: {}",
        notReadyReason);
  }
}
{code}
It seems the first if-condition is assuming there are only two possible states, 
so if the state is not STANDBY, it must be ACTIVE. I think we should update 
this to explicitly check for ACTIVE. Next, if the service is in OBSERVER state, 
{{isReadyToBecomeActive()}} will be false. In this case, 
{{FailoverController#preFailoverChecks()}} will still allow this operation if 
{{forceActive}} is true. I don't think we want to allow {{forceActive}} to 
attempt to failover an observer, right?
* For all three usages of {{FSNamesystem#isInStandbyState()}}, it actually seems 
to me that they should apply if it is in observer or standby state. Can you 
double check and, if so, update accordingly?
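
One way the pre-failover checks discussed above might be restructured is sketched below. It is not the actual {{FailoverController}} code: the enum and exception are local stand-ins for the Hadoop types, {{PreFailoverCheckSketch}} is a hypothetical name, and rejecting an observer target even with {{forceActive}} is the policy suggested in this comment, not a confirmed decision.

```java
// Stand-in types for this sketch; the real ones live in Hadoop's HA code.
enum HAServiceState { ACTIVE, STANDBY, OBSERVER }

class FailoverFailedException extends Exception {
  FailoverFailedException(String message) { super(message); }
}

final class PreFailoverCheckSketch {
  /**
   * Returns normally if failover to the target may proceed.
   * readyToBecomeActive and forceActive mirror the values used in the
   * snippet quoted in the comment.
   */
  static void check(HAServiceState targetState, boolean readyToBecomeActive,
      boolean forceActive) throws FailoverFailedException {
    // Check for ACTIVE explicitly instead of inferring it from "not STANDBY".
    if (targetState == HAServiceState.ACTIVE) {
      throw new FailoverFailedException("Can't failover to an active service");
    }
    // Assumed policy: an observer is never a failover target, forced or not.
    if (targetState == HAServiceState.OBSERVER) {
      throw new FailoverFailedException("Can't failover to an observer service");
    }
    if (!readyToBecomeActive && !forceActive) {
      throw new FailoverFailedException("Target is not ready to become active");
    }
    // Otherwise proceed; the real code would log a warning when forcing.
  }
}
```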


was (Author: xkrogen):
Hey [~csun], sorry for taking a while to get back to you on this. Your new 
{{HAServiceState}} changes inspired me to do a more comprehensive look at how 
those states are used and I noticed a few things:
* Take a look at {{BPServiceActor}} L905-909. I haven't looked at HDFS-9917 
yet, but judging from the comment it seems we probably want to change this 
if-condition to {{state == HAServiceState.STANDBY || state == 
HAServiceState.OBSERVER}}. Can you see if you agree?
* The constructor of {{StandbyState}} needs to be modified to 
{{super(isObserver ? HAServiceState.OBSERVER : HAServiceState.STANDBY);}} 
instead of just always passing {{STANDBY}}, else things which use 
{{HAState#getServiceState()}} will be wrong (see for example 
{{NameNode#getServiceStatus()}}).
* It looks like we also need to update the enums in 
{{FederationNamenodeServiceState}} and {{NNHAStatusHeartbeatProto}}, and their 
associated usages
* We should be able to remove the TODO on {{NameNode}} L1866?

> Add haadmin commands to transition between standby and observer
> ---
>
> Key: HDFS-13286
> URL: https://issues.apache.org/jira/browse/HDFS-13286
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13286-HDFS-12943.000.patch, 
> HDFS-13286-HDFS-12943.001.patch, HDFS-13286-HDFS-12943.002.patch, 
> HDFS-13286-HDFS-12943.003.patch, HDFS-13286-HDFS-12943.004.patch
>
>
> As discussed in HDFS-12975, we should allow explicit transition between 
> standby and observer through haadmin command, such as:
> {code}
> haadmin -transitionToObserver
> {code}
> Initially we should support transition from observer to standby, and standby 
> to observer.






[jira] [Commented] (HDFS-13245) RBF: State store DBMS implementation

2018-05-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-13245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461334#comment-16461334
 ] 

Íñigo Goiri commented on HDFS-13245:


Let's add TestStateStoreDisabledNameserviceStore in a separate JIRA.
In general, I want to make sure that if somebody adds a new store, unit tests 
will fail if they don't add it to the SQL impl.

> RBF: State store DBMS implementation
> 
>
> Key: HDFS-13245
> URL: https://issues.apache.org/jira/browse/HDFS-13245
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: maobaolong
>Assignee: Yiran Wu
>Priority: Major
> Attachments: HDFS-13245.001.patch, HDFS-13245.002.patch, 
> HDFS-13245.003.patch, HDFS-13245.004.patch, HDFS-13245.005.patch, 
> HDFS-13245.006.patch
>
>
> Add a DBMS implementation for the State Store.






[jira] [Updated] (HDDS-6) Enable SCM kerberos auth

2018-05-02 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDDS-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-6:
--
Attachment: HDDS-4-HDDS-6.00.patch

> Enable SCM kerberos auth
> 
>
> Key: HDDS-6
> URL: https://issues.apache.org/jira/browse/HDDS-6
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM, Security
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.3.0
>
> Attachments: HDDS-4-HDDS-6.00.patch
>
>
> Enable SCM kerberos auth






[jira] [Updated] (HDDS-6) Enable SCM kerberos auth

2018-05-02 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDDS-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-6:
--
Attachment: (was: HDDS-6.00.patch)

> Enable SCM kerberos auth
> 
>
> Key: HDDS-6
> URL: https://issues.apache.org/jira/browse/HDDS-6
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM, Security
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.3.0
>
>
> Enable SCM kerberos auth






[jira] [Commented] (HDFS-13507) RBF: Remove update functionality from routeradmin's add cmd

2018-05-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461317#comment-16461317
 ] 

Íñigo Goiri commented on HDFS-13507:


I would add the command usage of HDFSRouterFederation.md in a different JIRA 
and limit this one to the change that breaks compatibility.

> RBF: Remove update functionality from routeradmin's add cmd
> ---
>
> Key: HDFS-13507
> URL: https://issues.apache.org/jira/browse/HDFS-13507
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Wei Yan
>Assignee: Gang Li
>Priority: Minor
>  Labels: incompatible
> Attachments: HDFS-13507.000.patch, HDFS-13507.001.patch, 
> HDFS-13507.002.patch
>
>
> Follow up the discussion in HDFS-13326. We should remove the "update" 
> functionality from routeradmin's add cmd, to make it consistent with RPC 
> calls.
> Note that: this is an incompatible change.






[jira] [Commented] (HDDS-15) Add memory profiler support to Genesis

2018-05-02 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDDS-15?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461310#comment-16461310
 ] 

Anu Engineer commented on HDDS-15:
--

[~xyao] Thanks for the reviews. I had added the Lic and got a clean Jenkins 
build. I will commit this patch now.

> Add memory profiler support to Genesis
> --
>
> Key: HDDS-15
> URL: https://issues.apache.org/jira/browse/HDDS-15
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Tools
>Affects Versions: 0.2.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-15.001.patch, HDDS-15.002.patch, HDDS-15.003.patch
>
>
> Add the ability to sample max memory usage when running tests under Genesis.






[jira] [Commented] (HDFS-13488) RBF: Reject requests when a Router is overloaded

2018-05-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-13488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461297#comment-16461297
 ] 

Íñigo Goiri commented on HDFS-13488:


Thanks [~linyiqun] for committing.

> RBF: Reject requests when a Router is overloaded
> 
>
> Key: HDFS-13488
> URL: https://issues.apache.org/jira/browse/HDFS-13488
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.1
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.4
>
> Attachments: HDFS-13488.000.patch, HDFS-13488.001.patch, 
> HDFS-13488.002.patch, HDFS-13488.003.patch, HDFS-13488.004.patch
>
>
> A Router might be overloaded when handling special cases (e.g. a slow 
> subcluster). The Router could reject the requests and the client could try 
> with another Router. We should leverage the Standby mechanism for this. 






[jira] [Commented] (HDFS-13286) Add haadmin commands to transition between standby and observer

2018-05-02 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461275#comment-16461275
 ] 

Erik Krogen commented on HDFS-13286:


Hey [~csun], sorry for taking a while to get back to you on this. Your new 
{{HAServiceState}} changes inspired me to do a more comprehensive look at how 
those states are used and I noticed a few things:
* Take a look at {{BPServiceActor}} L905-909. I haven't looked at HDFS-9917 
yet, but judging from the comment it seems we probably want to change this 
if-condition to {{state == HAServiceState.STANDBY || state == 
HAServiceState.OBSERVER}}. Can you see if you agree?
* The constructor of {{StandbyState}} needs to be modified to 
{{super(isObserver ? HAServiceState.OBSERVER : HAServiceState.STANDBY);}} 
instead of just always passing {{STANDBY}}, else things which use 
{{HAState#getServiceState()}} will be wrong (see for example 
{{NameNode#getServiceStatus()}}).
* It looks like we also need to update the enums in 
{{FederationNamenodeServiceState}} and {{NNHAStatusHeartbeatProto}}, and their 
associated usages
* We should be able to remove the TODO on {{NameNode}} L1866?
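
The first two points could be sketched roughly as follows. The types here are simplified stand-ins defined only for illustration ({{HAStateSketch}}, {{StandbyStateSketch}}, and {{isStandbyOrObserver}} are hypothetical names); the real {{HAState}} and {{StandbyState}} classes carry much more behavior.

```java
// Stand-in enum for this sketch; the real one is Hadoop's HAServiceState.
enum HAServiceState { ACTIVE, STANDBY, OBSERVER }

// Simplified stand-in for the HAState base class.
abstract class HAStateSketch {
  private final HAServiceState state;

  protected HAStateSketch(HAServiceState state) {
    this.state = state;
  }

  HAServiceState getServiceState() {
    return state;
  }
}

final class StandbyStateSketch extends HAStateSketch {
  StandbyStateSketch(boolean isObserver) {
    // Report OBSERVER when acting as an observer instead of always STANDBY.
    super(isObserver ? HAServiceState.OBSERVER : HAServiceState.STANDBY);
  }

  // The kind of condition suggested for the BPServiceActor check.
  static boolean isStandbyOrObserver(HAServiceState state) {
    return state == HAServiceState.STANDBY || state == HAServiceState.OBSERVER;
  }
}
```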

> Add haadmin commands to transition between standby and observer
> ---
>
> Key: HDFS-13286
> URL: https://issues.apache.org/jira/browse/HDFS-13286
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13286-HDFS-12943.000.patch, 
> HDFS-13286-HDFS-12943.001.patch, HDFS-13286-HDFS-12943.002.patch, 
> HDFS-13286-HDFS-12943.003.patch, HDFS-13286-HDFS-12943.004.patch
>
>
> As discussed in HDFS-12975, we should allow explicit transition between 
> standby and observer through haadmin command, such as:
> {code}
> haadmin -transitionToObserver
> {code}
> Initially we should support transition from observer to standby, and standby 
> to observer.






[jira] [Commented] (HDFS-11807) libhdfs++: Get minidfscluster tests running under valgrind

2018-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461243#comment-16461243
 ] 

Hudson commented on HDFS-11807:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14110 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14110/])
HDFS-11807. libhdfs++: Get minidfscluster tests running under valgrind.  (jhc: 
rev 19ae588fde9930c042cdb2848b8a1a0ff514b575)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/expect.h
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/CMakeLists.txt
* (add) 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/tests/memcheck.supp
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/tests/CMakeLists.txt
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_mini_stress.c


> libhdfs++: Get minidfscluster tests running under valgrind
> --
>
> Key: HDFS-11807
> URL: https://issues.apache.org/jira/browse/HDFS-11807
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Anatoli Shein
>Priority: Major
> Attachments: HDFS-11807.HDFS-8707.000.patch, 
> HDFS-11807.HDFS-8707.001.patch, HDFS-11807.HDFS-8707.002.patch, 
> HDFS-11807.HDFS-8707.003.patch, HDFS-11807.HDFS-8707.004.patch, 
> HDFS-11807.HDFS-8707.005.patch, HDFS-11807.HDFS-8707.006.patch, 
> HDFS-11807.HDFS-8707.007.patch, HDFS-11807.HDFS-8707.008.patch, 
> HDFS-11807.HDFS-8707.009.patch
>
>
> The gmock based unit tests generally don't expose race conditions and memory 
> stomps.  A good way to expose these is running libhdfs++ stress tests and 
> tools under valgrind and pointing them at a real cluster.  Right now the CI 
> tools don't do that so bugs occasionally slip in and aren't caught until they 
> cause trouble in applications that use libhdfs++ for HDFS access.
> The reason the minidfscluster tests don't run under valgrind is because the 
> GC and JIT compiler in the embedded JVM do things that look like errors to 
> valgrind.  I'd like to have these tests do some basic setup and then fork 
> into two processes: one for the minidfscluster stuff and one for the 
> libhdfs++ client test.  A small amount of shared memory can be used to 
> provide a place for the minidfscluster to stick the hdfsBuilder object that 
> the client needs to get info about which port to connect to.  Can also stick 
> a condition variable there to let the minidfscluster know when it can shut 
> down.






[jira] [Updated] (HDFS-11807) libhdfs++: Get minidfscluster tests running under valgrind

2018-05-02 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-11807:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> libhdfs++: Get minidfscluster tests running under valgrind
> --
>
> Key: HDFS-11807
> URL: https://issues.apache.org/jira/browse/HDFS-11807
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Anatoli Shein
>Priority: Major
> Attachments: HDFS-11807.HDFS-8707.000.patch, 
> HDFS-11807.HDFS-8707.001.patch, HDFS-11807.HDFS-8707.002.patch, 
> HDFS-11807.HDFS-8707.003.patch, HDFS-11807.HDFS-8707.004.patch, 
> HDFS-11807.HDFS-8707.005.patch, HDFS-11807.HDFS-8707.006.patch, 
> HDFS-11807.HDFS-8707.007.patch, HDFS-11807.HDFS-8707.008.patch, 
> HDFS-11807.HDFS-8707.009.patch
>
>
> The gmock based unit tests generally don't expose race conditions and memory 
> stomps.  A good way to expose these is running libhdfs++ stress tests and 
> tools under valgrind and pointing them at a real cluster.  Right now the CI 
> tools don't do that so bugs occasionally slip in and aren't caught until they 
> cause trouble in applications that use libhdfs++ for HDFS access.
> The reason the minidfscluster tests don't run under valgrind is because the 
> GC and JIT compiler in the embedded JVM do things that look like errors to 
> valgrind.  I'd like to have these tests do some basic setup and then fork 
> into two processes: one for the minidfscluster stuff and one for the 
> libhdfs++ client test.  A small amount of shared memory can be used to 
> provide a place for the minidfscluster to stick the hdfsBuilder object that 
> the client needs to get info about which port to connect to.  Can also stick 
> a condition variable there to let the minidfscluster know when it can shut 
> down.






[jira] [Commented] (HDFS-11807) libhdfs++: Get minidfscluster tests running under valgrind

2018-05-02 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461217#comment-16461217
 ] 

James Clampffer commented on HDFS-11807:


+1, I just committed this to trunk.  Sorry it took so long to get around to it.

Some tests I did:
- Built with gcc and clang in the dev-support docker container and ran the tests
- Added memory leaks to make sure they were caught
- Added invalid reads and writes to make sure they were caught

> libhdfs++: Get minidfscluster tests running under valgrind
> --
>
> Key: HDFS-11807
> URL: https://issues.apache.org/jira/browse/HDFS-11807
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Anatoli Shein
>Priority: Major
> Attachments: HDFS-11807.HDFS-8707.000.patch, 
> HDFS-11807.HDFS-8707.001.patch, HDFS-11807.HDFS-8707.002.patch, 
> HDFS-11807.HDFS-8707.003.patch, HDFS-11807.HDFS-8707.004.patch, 
> HDFS-11807.HDFS-8707.005.patch, HDFS-11807.HDFS-8707.006.patch, 
> HDFS-11807.HDFS-8707.007.patch, HDFS-11807.HDFS-8707.008.patch, 
> HDFS-11807.HDFS-8707.009.patch
>
>
> The gmock based unit tests generally don't expose race conditions and memory 
> stomps.  A good way to expose these is running libhdfs++ stress tests and 
> tools under valgrind and pointing them at a real cluster.  Right now the CI 
> tools don't do that so bugs occasionally slip in and aren't caught until they 
> cause trouble in applications that use libhdfs++ for HDFS access.
> The reason the minidfscluster tests don't run under valgrind is because the 
> GC and JIT compiler in the embedded JVM do things that look like errors to 
> valgrind.  I'd like to have these tests do some basic setup and then fork 
> into two processes: one for the minidfscluster stuff and one for the 
> libhdfs++ client test.  A small amount of shared memory can be used to 
> provide a place for the minidfscluster to stick the hdfsBuilder object that 
> the client needs to get info about which port to connect to.  Can also stick 
> a condition variable there to let the minidfscluster know when it can shut 
> down.






[jira] [Updated] (HDFS-12981) HDFS renameSnapshot to Itself for Non Existent snapshot should throw error

2018-05-02 Thread Kitti Nanasi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-12981:

Attachment: HDFS-12981-branch-2.6.0.001.patch

> HDFS  renameSnapshot to Itself for Non Existent snapshot should throw error
> ---
>
> Key: HDFS-12981
> URL: https://issues.apache.org/jira/browse/HDFS-12981
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 2.6.0
>Reporter: Sailesh Patel
>Assignee: Kitti Nanasi
>Priority: Minor
> Attachments: HDFS-12981-branch-2.6.0.001.patch, HDFS-12981.001.patch, 
> HDFS-12981.002.patch
>
>
> When trying to rename a non-existent HDFS snapshot to ITSELF, no error is 
> reported and the command exits with a success code.
> The steps to reproduce this issue are:
> hdfs dfs -mkdir /tmp/dir1
> hdfs dfsadmin -allowSnapshot /tmp/dir1
> hdfs dfs  -createSnapshot /tmp/dir1  snap1_dir
> Rename from non-existent to another_non-existent : errors and return code 1.  
> This is correct.
>   hdfs dfs -renameSnapshot /tmp/dir1 nonexist another_nonexist  : 
>   echo $?
>
>   renameSnapshot: The snapshot nonexist does not exist for directory /tmp/dir1
> Rename from non-existent to non-existent : no errors and return code 0  
> instead of Error and return code 1.
>   hdfs dfs -renameSnapshot /tmp/dir1 nonexist nonexist  ;  echo $?
> Current behavior:   No error and return code 0.
> Expected behavior:  An error returned and return code 1.
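
The expected behavior can be illustrated with a small model. This is a sketch, not HDFS code: {{SnapshotDirSketch}} is a hypothetical class, and it assumes (the issue text does not confirm the root cause) that the fix amounts to checking that the source snapshot exists before any shortcut for identical names.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of a snapshottable directory; not HDFS code.
final class SnapshotDirSketch {
  private final String path;
  private final Set<String> snapshots = new HashSet<>();

  SnapshotDirSketch(String path) {
    this.path = path;
  }

  void createSnapshot(String name) {
    snapshots.add(name);
  }

  boolean hasSnapshot(String name) {
    return snapshots.contains(name);
  }

  void renameSnapshot(String oldName, String newName) {
    // Validate that the source exists before any oldName.equals(newName)
    // shortcut, so renaming a missing snapshot to itself still fails.
    if (!snapshots.contains(oldName)) {
      throw new IllegalArgumentException("The snapshot " + oldName
          + " does not exist for directory " + path);
    }
    if (oldName.equals(newName)) {
      return; // Renaming an existing snapshot to itself is a no-op.
    }
    if (snapshots.contains(newName)) {
      throw new IllegalArgumentException("The snapshot " + newName
          + " already exists for directory " + path);
    }
    snapshots.remove(oldName);
    snapshots.add(newName);
  }
}
```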






[jira] [Updated] (HDFS-13398) Hdfs recursive listing operation is very slow

2018-05-02 Thread Ajay Sachdev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Sachdev updated HDFS-13398:

Attachment: HDFS-13398.001.patch

> Hdfs recursive listing operation is very slow
> -
>
> Key: HDFS-13398
> URL: https://issues.apache.org/jira/browse/HDFS-13398
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.7.1
> Environment: HCFS file system where HDP 2.6.1 is connected to ECS 
> (Object Store).
>Reporter: Ajay Sachdev
>Assignee: Ajay Sachdev
>Priority: Major
> Fix For: 2.7.1
>
> Attachments: HDFS-13398.001.patch, parallelfsPatch
>
>
> The hdfs dfs -ls -R command is sequential in nature and is very slow for an 
> HCFS system. We have seen it take around 6 mins for a 40K directory/file structure.
> The proposal is to use a multithreaded approach to speed up recursive list, du 
> and count operations.
> We have tried a ForkJoinPool implementation to improve performance for the 
> recursive listing operation.
> [https://github.com/jasoncwik/hadoop-release/tree/parallel-fs-cli]
> commit id : 
> 82387c8cd76c2e2761bd7f651122f83d45ae8876
> Another implementation uses a Java ExecutorService to run the listing 
> operation in multiple threads in parallel. This has significantly reduced the 
> time from 6 mins to 40 secs.
>  
>  
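
The ForkJoinPool idea mentioned in the description can be sketched as below. This is not the attached patch: it walks a local directory tree with {{java.nio.file}} rather than a Hadoop {{FileSystem}}, and {{ParallelListTask}} and {{listRecursively}} are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Lists a directory tree in parallel: each subdirectory becomes its own task.
final class ParallelListTask extends RecursiveTask<List<Path>> {
  private final Path dir;

  ParallelListTask(Path dir) {
    this.dir = dir;
  }

  @Override
  protected List<Path> compute() {
    List<Path> found = new ArrayList<>();
    List<ParallelListTask> subTasks = new ArrayList<>();
    try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
      for (Path child : children) {
        found.add(child);
        // Do not follow symbolic links, to avoid cycles.
        if (Files.isDirectory(child, LinkOption.NOFOLLOW_LINKS)) {
          ParallelListTask task = new ParallelListTask(child);
          task.fork();               // list the subdirectory asynchronously
          subTasks.add(task);
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    for (ParallelListTask task : subTasks) {
      found.addAll(task.join());     // collect results from the subtasks
    }
    return found;
  }

  static List<Path> listRecursively(Path root, int parallelism) {
    ForkJoinPool pool = new ForkJoinPool(parallelism);
    try {
      return pool.invoke(new ParallelListTask(root));
    } finally {
      pool.shutdown();
    }
  }
}
```

The result order is not the order a sequential listing would print, so a real command would still need to sort or stream the output deterministically. Each task also blocks on directory I/O, which is likely where a remote object store spends most of its time, so the pool size would need tuning for that case.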






[jira] [Commented] (HDFS-13398) Hdfs recursive listing operation is very slow

2018-05-02 Thread Ajay Sachdev (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461122#comment-16461122
 ] 

Ajay Sachdev commented on HDFS-13398:
-

Hi Mukul/Arpit,

Sorry for the late response. We would like to get a formal review of this 
patch before we deploy it at a customer site. I will work on items #3 and #4 above. 
Also, I was unable to find a trunk branch in 

[https://github.com/hortonworks/hadoop-release], so we have used the HDP tag

HDP-2.6.2.0-205-tag

I have also attached the patch against this tag. I would appreciate it if you 
could take a look at the code and review it as well.

 

Thanks

Ajay[^HDFS-13398.001.patch]

> Hdfs recursive listing operation is very slow
> -
>
> Key: HDFS-13398
> URL: https://issues.apache.org/jira/browse/HDFS-13398
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.7.1
> Environment: HCFS file system where HDP 2.6.1 is connected to ECS 
> (Object Store).
>Reporter: Ajay Sachdev
>Assignee: Ajay Sachdev
>Priority: Major
> Fix For: 2.7.1
>
> Attachments: HDFS-13398.001.patch, parallelfsPatch
>
>
> The hdfs dfs -ls -R command is sequential in nature and is very slow for an 
> HCFS system. We have seen it take around 6 mins for a 40K directory/file structure.
> The proposal is to use a multithreaded approach to speed up recursive list, du 
> and count operations.
> We have tried a ForkJoinPool implementation to improve performance for the 
> recursive listing operation.
> [https://github.com/jasoncwik/hadoop-release/tree/parallel-fs-cli]
> commit id : 
> 82387c8cd76c2e2761bd7f651122f83d45ae8876
> Another implementation uses a Java ExecutorService to run the listing 
> operation in multiple threads in parallel. This has significantly reduced the 
> time from 6 mins to 40 secs.
>  
>  






[jira] [Updated] (HDFS-12981) HDFS renameSnapshot to Itself for Non Existent snapshot should throw error

2018-05-02 Thread Kitti Nanasi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-12981:

Attachment: HDFS-12981.002.patch

> HDFS  renameSnapshot to Itself for Non Existent snapshot should throw error
> ---
>
> Key: HDFS-12981
> URL: https://issues.apache.org/jira/browse/HDFS-12981
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 2.6.0
>Reporter: Sailesh Patel
>Assignee: Kitti Nanasi
>Priority: Minor
> Attachments: HDFS-12981.001.patch, HDFS-12981.002.patch
>
>
> When trying to rename a non-existent HDFS snapshot to ITSELF, no error is 
> reported and the command exits with a success code.
> The steps to reproduce this issue are:
> hdfs dfs -mkdir /tmp/dir1
> hdfs dfsadmin -allowSnapshot /tmp/dir1
> hdfs dfs  -createSnapshot /tmp/dir1  snap1_dir
> Rename from non-existent to another_non-existent : errors and return code 1.  
> This is correct.
>   hdfs dfs -renameSnapshot /tmp/dir1 nonexist another_nonexist  : 
>   echo $?
>
>   renameSnapshot: The snapshot nonexist does not exist for directory /tmp/dir1
> Rename from non-existent to non-existent : no errors and return code 0  
> instead of Error and return code 1.
>   hdfs dfs -renameSnapshot /tmp/dir1 nonexist nonexist  ;  echo $?
> Current behavior:   No error and return code 0.
> Expected behavior:  An error returned and return code 1.






[jira] [Commented] (HDDS-16) Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.

2018-05-02 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDDS-16?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16460971#comment-16460971
 ] 

genericqa commented on HDDS-16:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 55s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 10s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  1m 10s{color} | {color:red} root in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 23s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 11s{color} | {color:red} common in trunk failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 24s{color} | {color:red} container-service in trunk failed. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 51s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 45s{color} | {color:red} hadoop-hdds/common in trunk has 1 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 32s{color} | {color:red} hadoop-ozone/tools in trunk has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m  0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 28m  0s{color} | {color:red} root generated 9 new + 2 unchanged - 0 fixed = 11 total (was 2) {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 28m  0s{color} | {color:red} root generated 1217 new + 260 unchanged - 0 fixed = 1477 total (was 260) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  9m 50s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  8s{color} | {color:green} common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 32s{color} | {color:green} container-service in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 27s{color} | {color:green} client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 30s{color} | {color:green} 

[jira] [Commented] (HDFS-13443) RBF: Update mount table cache immediately after changing (add/update/remove) mount table entries.

2018-05-02 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16460955#comment-16460955
 ] 

genericqa commented on HDFS-13443:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m  2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 56s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 16m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 13s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 59s{color} | {color:orange} hadoop-hdfs-project: The patch generated 2 new + 1 unchanged - 0 fixed = 3 total (was 1) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 11s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}101m  6s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 19s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 30s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}215m 37s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
|   | hadoop.hdfs.server.namenode.TestReencryptionWithKMS |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HDFS-13443 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12921557/HDFS-13443.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit 

[jira] [Commented] (HDFS-13174) hdfs mover -p /path times out after 20 min

2018-05-02 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16460906#comment-16460906
 ] 

genericqa commented on HDFS-13174:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 47s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 48s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs generated 2 new + 531 unchanged - 0 fixed = 533 total (was 531) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 51s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 5 new + 752 unchanged - 1 fixed = 757 total (was 753) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  9m 52s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}111m 31s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 14s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.client.impl.TestBlockReaderLocal |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HDFS-13174 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12921555/HDFS-13174.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 98292cd14a30 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e07156e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| javac | 

[jira] [Updated] (HDDS-16) Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.

2018-05-02 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDDS-16?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDDS-16:
--
Attachment: HDDS-16.001.patch

> Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.
> 
>
> Key: HDDS-16
> URL: https://issues.apache.org/jira/browse/HDDS-16
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Native, Ozone Datanode
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-16.001.patch
>
>
> The current Ozone code passes pipeline information to datanodes as well. 
> However datanodes do not use this information.
> Hence Pipeline should be removed from ozone datanode commands.






[jira] [Updated] (HDDS-16) Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.

2018-05-02 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDDS-16?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDDS-16:
--
Status: Patch Available  (was: Open)

> Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.
> 
>
> Key: HDDS-16
> URL: https://issues.apache.org/jira/browse/HDDS-16
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Native, Ozone Datanode
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-16.001.patch
>
>
> The current Ozone code passes pipeline information to datanodes as well. 
> However datanodes do not use this information.
> Hence Pipeline should be removed from ozone datanode commands.






[jira] [Updated] (HDDS-16) Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.

2018-05-02 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDDS-16?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDDS-16:
--
Attachment: (was: HDFS-12841-HDFS-7240.002.patch)

> Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.
> 
>
> Key: HDDS-16
> URL: https://issues.apache.org/jira/browse/HDDS-16
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Native, Ozone Datanode
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-16.001.patch
>
>
> The current Ozone code passes pipeline information to datanodes as well. 
> However datanodes do not use this information.
> Hence Pipeline should be removed from ozone datanode commands.






[jira] [Updated] (HDDS-16) Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.

2018-05-02 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDDS-16?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDDS-16:
--
Attachment: (was: HDFS-12841-HDFS-7240.003.patch)

> Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.
> 
>
> Key: HDDS-16
> URL: https://issues.apache.org/jira/browse/HDDS-16
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Native, Ozone Datanode
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-16.001.patch
>
>
> The current Ozone code passes pipeline information to datanodes as well. 
> However datanodes do not use this information.
> Hence Pipeline should be removed from ozone datanode commands.






[jira] [Updated] (HDDS-16) Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.

2018-05-02 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDDS-16?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDDS-16:
--
Attachment: (was: HDFS-12841-HDFS-7240.001.patch)

> Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.
> 
>
> Key: HDDS-16
> URL: https://issues.apache.org/jira/browse/HDDS-16
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Native, Ozone Datanode
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-16.001.patch
>
>
> The current Ozone code passes pipeline information to datanodes as well. 
> However datanodes do not use this information.
> Hence Pipeline should be removed from ozone datanode commands.






[jira] [Updated] (HDDS-16) Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.

2018-05-02 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDDS-16?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDDS-16:
--
Attachment: (was: HDFS-12841-HDFS-7240.004.patch)

> Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.
> 
>
> Key: HDDS-16
> URL: https://issues.apache.org/jira/browse/HDDS-16
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Native, Ozone Datanode
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-16.001.patch
>
>
> The current Ozone code passes pipeline information to datanodes as well. 
> However datanodes do not use this information.
> Hence Pipeline should be removed from ozone datanode commands.






[jira] [Moved] (HDDS-16) Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.

2018-05-02 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDDS-16?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh moved HDFS-12841 to HDDS-16:
--

    Fix Version/s: 0.2.1  (was: HDFS-7240)
Affects Version/s: 0.2.1  (was: HDFS-7240)
 Target Version/s:   (was: HDFS-7240)
      Component/s: Ozone Datanode, Native  (was: ozone)
         Workflow: patch-available, re-open possible  (was: no-reopen-closed, patch-avail)
              Key: HDDS-16  (was: HDFS-12841)
          Project: Hadoop Distributed Data Store  (was: Hadoop HDFS)

> Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.
> 
>
> Key: HDDS-16
> URL: https://issues.apache.org/jira/browse/HDDS-16
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Native, Ozone Datanode
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDFS-12841-HDFS-7240.001.patch, 
> HDFS-12841-HDFS-7240.002.patch, HDFS-12841-HDFS-7240.003.patch, 
> HDFS-12841-HDFS-7240.004.patch
>
>
> The current Ozone code passes pipeline information to datanodes as well. 
> However datanodes do not use this information.
> Hence Pipeline should be removed from ozone datanode commands.






[jira] [Updated] (HDFS-12841) Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.

2018-05-02 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-12841:
-
Issue Type: Improvement  (was: Sub-task)
Parent: (was: HDFS-7240)

> Ozone: Remove Pipeline from Datanode Container Protocol protobuf definition.
> 
>
> Key: HDFS-12841
> URL: https://issues.apache.org/jira/browse/HDFS-12841
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: HDFS-7240
>
> Attachments: HDFS-12841-HDFS-7240.001.patch, 
> HDFS-12841-HDFS-7240.002.patch, HDFS-12841-HDFS-7240.003.patch, 
> HDFS-12841-HDFS-7240.004.patch
>
>
> The current Ozone code passes pipeline information to datanodes as well. 
> However datanodes do not use this information.
> Hence Pipeline should be removed from ozone datanode commands.






[jira] [Commented] (HDFS-13245) RBF: State store DBMS implementation

2018-05-02 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16460792#comment-16460792
 ] 

Yiqun Lin commented on HDFS-13245:
--

Some review comments from me:

* {{TestStateStoreDisabledNameserviceStore}} is missing from the current unit 
tests. We can add it in another JIRA or just add it here.
* {{ENCODE_UTF8 = "UTF-8";}} can be replaced by 
{{StandardCharsets.UTF_8.name();}}.
* {{testMetrics}} can be included in {{TestStateStore*SQLDB}}.
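
The second review point can be sketched as follows. The class name `Utf8ConstantSketch` and the `encode` helper are hypothetical and are not taken from the patch; only `StandardCharsets.UTF_8` and its `name()` method are standard JDK API. The sketch assumes the constant is needed as a `String` in some places and as a `Charset` in others.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical holder class; the real patch may organise its constants differently.
class Utf8ConstantSketch {
  // Before: a hand-written literal that the compiler cannot check.
  static final String ENCODE_UTF8_LITERAL = "UTF-8";

  // After: derived from the JDK constant, as suggested in the review.
  static final String ENCODE_UTF8 = StandardCharsets.UTF_8.name();

  // Where an API accepts a Charset, the constant can be passed directly,
  // which also avoids the checked UnsupportedEncodingException.
  static byte[] encode(String value) {
    return value.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    System.out.println(ENCODE_UTF8.equals(ENCODE_UTF8_LITERAL));
    System.out.println(encode("mount").length);
  }
}
```

Both constants hold the same string, so the replacement does not change behavior; it only removes a literal that could be mistyped.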

> RBF: State store DBMS implementation
> 
>
> Key: HDFS-13245
> URL: https://issues.apache.org/jira/browse/HDFS-13245
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: maobaolong
>Assignee: Yiran Wu
>Priority: Major
> Attachments: HDFS-13245.001.patch, HDFS-13245.002.patch, 
> HDFS-13245.003.patch, HDFS-13245.004.patch, HDFS-13245.005.patch, 
> HDFS-13245.006.patch
>
>
> Add a DBMS implementation for the State Store.






[jira] [Issue Comment Deleted] (HDFS-13369) FSCK Report broken with RequestHedgingProxyProvider

2018-05-02 Thread Ranith Sardar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ranith Sardar updated HDFS-13369:
-
Comment: was deleted

(was: h4. Hi [~eddyxu] , [~andrew.wang] . Can you please review this patch?

It is related to RequestHedgingProxyProvider and client.)

> FSCK Report broken with RequestHedgingProxyProvider 
> 
>
> Key: HDFS-13369
> URL: https://issues.apache.org/jira/browse/HDFS-13369
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.3
>Reporter: Harshakiran Reddy
>Assignee: Ranith Sardar
>Priority: Major
> Attachments: HDFS-13369.001.patch, HDFS-13369.002.patch, 
> HDFS-13369.003.patch, HDFS-13369.004.patch
>
>
> Scenario:
> 1. Configure the RequestHedgingProxy
> 2. Write some files in the file system
> 3. Take an FSCK report for the above files
>  
> {noformat}
> bin> hdfs fsck /file1 -locations -files -blocks
> Exception in thread "main" java.lang.ClassCastException: org.apache.hadoop.hdfs.server.namenode.ha.RequestHedgingProxyProvider$RequestHedgingInvocationHandler cannot be cast to org.apache.hadoop.ipc.RpcInvocationHandler
> at org.apache.hadoop.ipc.RPC.getConnectionIdForProxy(RPC.java:626)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.getConnectionId(RetryInvocationHandler.java:438)
> at org.apache.hadoop.ipc.RPC.getConnectionIdForProxy(RPC.java:628)
> at org.apache.hadoop.ipc.RPC.getServerAddress(RPC.java:611)
> at org.apache.hadoop.hdfs.HAUtil.getAddressOfActive(HAUtil.java:263)
> at org.apache.hadoop.hdfs.tools.DFSck.getCurrentNamenodeAddress(DFSck.java:257)
> at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:319)
> at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72)
> at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:156)
> at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:153)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
> at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:152)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:385){noformat}
>  
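
The stack trace shows a dynamic proxy whose invocation handler is cast to a more specific handler type that it does not implement. The sketch below reproduces that failure mode with plain JDK proxies. It is not the Hadoop code: `Greeter`, `ConnectionAwareHandler` (standing in for a handler type that can report its connection), and both helper methods are hypothetical names, and the `instanceof` guard is one possible defensive pattern rather than the fix adopted for this issue.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

class ProxyHandlerCastSketch {
  // Hypothetical proxied protocol.
  interface Greeter {
    String greet();
  }

  // Hypothetical handler type that can report which connection it uses.
  interface ConnectionAwareHandler extends InvocationHandler {
    String connectionId();
  }

  static Greeter proxyFor(InvocationHandler handler) {
    return (Greeter) Proxy.newProxyInstance(
        Greeter.class.getClassLoader(), new Class<?>[] {Greeter.class}, handler);
  }

  // Unchecked cast: throws ClassCastException for any other handler type.
  static String connectionIdUnchecked(Object proxy) {
    return ((ConnectionAwareHandler) Proxy.getInvocationHandler(proxy)).connectionId();
  }

  // Guarded variant: returns null when the handler cannot report a connection.
  static String connectionIdChecked(Object proxy) {
    InvocationHandler handler = Proxy.getInvocationHandler(proxy);
    if (handler instanceof ConnectionAwareHandler) {
      return ((ConnectionAwareHandler) handler).connectionId();
    }
    return null;
  }

  public static void main(String[] args) {
    // A plain handler, analogous to a wrapper that does not implement the expected type.
    Greeter plain = proxyFor((proxy, method, methodArgs) -> "hello");
    System.out.println(connectionIdChecked(plain));
    try {
      connectionIdUnchecked(plain);
    } catch (ClassCastException e) {
      System.out.println("ClassCastException");
    }

    Greeter aware = proxyFor(new ConnectionAwareHandler() {
      @Override
      public Object invoke(Object proxy, Method method, Object[] methodArgs) {
        return "hello";
      }

      @Override
      public String connectionId() {
        return "nn1";
      }
    });
    System.out.println(connectionIdChecked(aware));
  }
}
```

Whether a guard, a delegating implementation in the wrapper handler, or something else is appropriate depends on what the caller needs when no single connection exists, which the report does not settle.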






[jira] [Commented] (HDFS-11934) Add assertion to TestDefaultNameNodePort#testGetAddressFromConf

2018-05-02 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16460726#comment-16460726
 ] 

Akira Ajisaka commented on HDFS-11934:
--

Thanks [~legend-hua] for the patch.
The change looks good to me. Would you reverse the order of the arguments so 
that the calls in the class follow assertEquals(expected, actual)? I'm +1 once 
that is addressed.
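
Why the argument order matters can be shown without JUnit on the classpath. The sketch below uses a hypothetical local `assertEquals` helper whose failure message is written in the "expected ... but was ..." style; the port values 555 and 556 are illustrative only, with 555 borrowed from the issue description.

```java
import java.util.Objects;

// Minimal local stand-in for a JUnit-style assertion; hypothetical, for illustration only.
class AssertOrderSketch {
  static String failureMessage(Object expected, Object actual) {
    return "expected:<" + expected + "> but was:<" + actual + ">";
  }

  static void assertEquals(Object expected, Object actual) {
    if (!Objects.equals(expected, actual)) {
      throw new AssertionError(failureMessage(expected, actual));
    }
  }

  public static void main(String[] args) {
    int portFromConf = 556; // pretend the code under test returned a wrong port

    // Correct order: the message names 555 as the value the test wanted.
    try {
      assertEquals(555, portFromConf);
    } catch (AssertionError e) {
      System.out.println(e.getMessage());
    }

    // Swapped order: the assertion still fails, but the message blames the wrong side.
    try {
      assertEquals(portFromConf, 555);
    } catch (AssertionError e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The pass/fail result is the same either way; only the diagnostic is affected, which is why the request is a readability fix rather than a correctness fix.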

> Add assertion to TestDefaultNameNodePort#testGetAddressFromConf
> ---
>
> Key: HDFS-11934
> URL: https://issues.apache.org/jira/browse/HDFS-11934
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-alpha4
>Reporter: legend
>Priority: Minor
> Attachments: HDFS-11934.patch
>
>
> Add an additional assertion to TestDefaultNameNodePort: verify that 
> testGetAddressFromConf returns 555 if setDefaultUri(conf, "foo:555") is called.






[jira] [Updated] (HDFS-13443) RBF: Update mount table cache immediately after changing (add/update/remove) mount table entries.

2018-05-02 Thread Mohammad Arshad (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Arshad updated HDFS-13443:
---
Attachment: HDFS-13443.006.patch

> RBF: Update mount table cache immediately after changing (add/update/remove) 
> mount table entries.
> -
>
> Key: HDFS-13443
> URL: https://issues.apache.org/jira/browse/HDFS-13443
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Mohammad Arshad
>Assignee: Mohammad Arshad
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-13443-branch-2.001.patch, 
> HDFS-13443-branch-2.002.patch, HDFS-13443.001.patch, HDFS-13443.002.patch, 
> HDFS-13443.003.patch, HDFS-13443.004.patch, HDFS-13443.005.patch, 
> HDFS-13443.006.patch
>
>
> Currently the mount table cache is updated periodically; by default the cache 
> is updated every minute. After a change in the mount table, user operations 
> may still use the old mount table, which is incorrect.
> To update the mount table cache, maybe we can do the following:
>  * *Add a refresh API in MountTableManager which will update the mount table cache.*
>  * *When there is a change in mount table entries, the router admin server can 
> update its cache and ask other routers to update their caches*. For example, 
> if there are three routers R1, R2, R3 in a cluster, then the add mount table 
> entry API, at the admin server side, will perform the following sequence of 
> actions:
>  ## The user submits an add mount table entry request on R1
>  ## R1 adds the mount table entry to the state store
>  ## R1 calls the refresh API on R2
>  ## R1 calls the refresh API on R3
>  ## R1 directly refreshes its own cache
>  ## The add mount table entry response is sent back to the user.
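
The sequence proposed above can be sketched in a few lines. This is an in-memory model, not the RBF implementation: `StateStore`, `Router`, `refreshCache`, and `addMountEntry` are hypothetical names, the refresh is a direct method call rather than an RPC, and failure handling for unreachable routers is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MountTableRefreshSketch {
  // Hypothetical shared state store holding mount point -> destination.
  static class StateStore {
    private final Map<String, String> entries = new HashMap<>();

    synchronized void put(String src, String dest) {
      entries.put(src, dest);
    }

    synchronized Map<String, String> snapshot() {
      return new HashMap<>(entries);
    }
  }

  // Hypothetical router with a local cache of the mount table.
  static class Router {
    private final StateStore store;
    private final List<Router> peers = new ArrayList<>();
    private volatile Map<String, String> cache = Collections.emptyMap();

    Router(StateStore store) {
      this.store = store;
    }

    void addPeer(Router peer) {
      peers.add(peer);
    }

    // The "refresh API": reload the cache from the state store.
    void refreshCache() {
      cache = store.snapshot();
    }

    String resolve(String src) {
      return cache.get(src);
    }

    // Admin-side add: write to the store, refresh peers, refresh self, then respond.
    boolean addMountEntry(String src, String dest) {
      store.put(src, dest);
      for (Router peer : peers) {
        peer.refreshCache();
      }
      refreshCache();
      return true;
    }
  }

  public static void main(String[] args) {
    StateStore store = new StateStore();
    Router r1 = new Router(store);
    Router r2 = new Router(store);
    Router r3 = new Router(store);
    r1.addPeer(r2);
    r1.addPeer(r3);

    r1.addMountEntry("/data", "ns1->/data");
    System.out.println(r2.resolve("/data"));
    System.out.println(r3.resolve("/data"));
  }
}
```

In this model every router sees the new entry by the time the admin call returns, instead of waiting for the next periodic refresh.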






[jira] [Commented] (HDFS-13443) RBF: Update mount table cache immediately after changing (add/update/remove) mount table entries.

2018-05-02 Thread Mohammad Arshad (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16460723#comment-16460723
 ] 

Mohammad Arshad commented on HDFS-13443:


Addressed all the comments except the above two, and submitted a new patch, 
HDFS-13443.006.patch.
bq. I would probably call it TestRemoteRouterMountTableRefresh or similar; no 
need for the ZK suffix either
Changed to TestRouterMountTableCacheRefresh.
bq. For stopping one Router, just pick one, no need to go through the list.
We need to go through the list to pick a router that is not selected for other 
admin operations.

> RBF: Update mount table cache immediately after changing (add/update/remove) 
> mount table entries.
> -
>
> Key: HDFS-13443
> URL: https://issues.apache.org/jira/browse/HDFS-13443
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Mohammad Arshad
>Assignee: Mohammad Arshad
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-13443-branch-2.001.patch, 
> HDFS-13443-branch-2.002.patch, HDFS-13443.001.patch, HDFS-13443.002.patch, 
> HDFS-13443.003.patch, HDFS-13443.004.patch, HDFS-13443.005.patch
>
>
> Currently the mount table cache is updated periodically; by default it is 
> updated every minute. After a change in the mount table, user operations may 
> still use the old mount table, which is wrong.
> To update the mount table cache, maybe we can do the following:
>  * *Add a refresh API in MountTableManager which will update the mount table cache.*
>  * *When there is a change in mount table entries, the router admin server can 
> update its cache and ask other routers to update their caches*. For example, if 
> there are three routers R1, R2, R3 in a cluster, then the add mount table entry 
> API, at the admin server side, will perform the following sequence of actions:
>  ## The user submits an add mount table entry request on R1
>  ## R1 adds the mount table entry to the state store
>  ## R1 calls the refresh API on R2
>  ## R1 calls the refresh API on R3
>  ## R1 directly refreshes its own cache
>  ## The add mount table entry response is sent back to the user.






[jira] [Updated] (HDFS-13174) hdfs mover -p /path times out after 20 min

2018-05-02 Thread Istvan Fajth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Fajth updated HDFS-13174:

Release Note: 
The Mover could fail after 20+ minutes if a block move between two DataNodes 
was enqueued for that long, due to an internal constant that was introduced for 
the Balancer but affected the Mover as well.
After the patch, the internal constant can be configured with the 
dfs.balancer.max-iteration-time parameter and affects only the Balancer. The 
default is 20 minutes.
  Status: Patch Available  (was: Open)

> hdfs mover -p /path times out after 20 min
> --
>
> Key: HDFS-13174
> URL: https://issues.apache.org/jira/browse/HDFS-13174
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.0.0-alpha2, 2.7.4, 2.8.0
>Reporter: Istvan Fajth
>Assignee: Istvan Fajth
>Priority: Major
> Attachments: HDFS-13174.001.patch
>
>
> In HDFS-11015 an iteration timeout was introduced in the Dispatcher.Source 
> class; it is checked while dispatching the moves that the Balancer and the 
> Mover do. This timeout is hardwired to 20 minutes.
> In the Balancer we have iterations, and even if an iteration times out, 
> the Balancer keeps running and does another iteration; it fails only if 
> no moves happened in a few iterations.
> The Mover, on the other hand, does not have iterations, so if moving a path 
> runs for more than 20 minutes, and there are moves decided and enqueued 
> between two DataNodes, after 20 minutes the Mover will stop with the following 
> exception reported to the console (line numbers might differ as this exception 
> came from a CDH5.12.1 installation).
>  java.io.IOException: Block move timed out
>  at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:382)
>  at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:328)
>  at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2500(Dispatcher.java:186)
>  at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:956)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
>  
> Note that this issue does not come up if all blocks can be moved inside the 
> DataNodes without having to move a block to another DataNode.






[jira] [Updated] (HDFS-13174) hdfs mover -p /path times out after 20 min

2018-05-02 Thread Istvan Fajth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Fajth updated HDFS-13174:

Description: 
In HDFS-11015 an iteration timeout was introduced in the Dispatcher.Source 
class; it is checked while dispatching the moves that the Balancer and the 
Mover do. This timeout is hardwired to 20 minutes.

In the Balancer we have iterations, and even if an iteration times out, the 
Balancer keeps running and does another iteration; it fails only if no moves 
happened in a few iterations.

The Mover, on the other hand, does not have iterations, so if moving a path runs 
for more than 20 minutes, and there are moves decided and enqueued between two 
DataNodes, after 20 minutes the Mover will stop with the following exception 
reported to the console (line numbers might differ as this exception came from a 
CDH5.12.1 installation).
 java.io.IOException: Block move timed out
 at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:382)
 at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:328)
 at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2500(Dispatcher.java:186)
 at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:956)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)

Note that this issue does not come up if all blocks can be moved inside the 
DataNodes without having to move a block to another DataNode.

  was:
In HDFS-11015 there is an iteration timeout introduced in Dispatcher.Source 
class, that is checked during dispatching the moves that the Balancer and the 
Mover does. This timeout is hardwired to 20 minutes.

In the Balancer we have iterations, and even if an iteration is timing out the 
Balancer runs further and does an other iteration before it fails if there were 
no moves happened in a few iterations.

The Mover on the other hand does not have iterations, so if moving a path runs 
for more than 20 minutes, after 20 minutes Mover will stop with the following 
exception reported to the console (lines might differ as this exception came 
from a CDH5.12.1 installation):
java.io.IOException: Block move timed out
at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:382)
at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:328)
at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2500(Dispatcher.java:186)
at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:956)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


> hdfs mover -p /path times out after 20 min
> --
>
> Key: HDFS-13174
> URL: https://issues.apache.org/jira/browse/HDFS-13174
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.8.0, 2.7.4, 3.0.0-alpha2
>Reporter: Istvan Fajth
>Assignee: Istvan Fajth
>Priority: Major
> Attachments: HDFS-13174.001.patch
>
>
> In HDFS-11015 an iteration timeout was introduced in the Dispatcher.Source 
> class; it is checked while dispatching the moves that the Balancer and the 
> Mover do. This timeout is hardwired to 20 minutes.
> In the Balancer we have iterations, and even if an iteration times out, 
> the Balancer keeps running and does another iteration; it fails only if 
> no moves happened in a few iterations.
> The Mover, on the other hand, does not have iterations, so if moving a path 
> runs for more than 20 minutes, and there are moves decided and enqueued 
> between two DataNodes, after 20 minutes the Mover will stop with the following 
> exception reported to the console (line numbers might differ as this exception 
> came from a CDH5.12.1 installation).
>  java.io.IOException: Block move timed out
>  at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:382)
>  at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:328)
>  at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2500(Dispatcher.java:186)
>  at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:956)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at 

[jira] [Commented] (HDFS-13174) hdfs mover -p /path times out after 20 min

2018-05-02 Thread Istvan Fajth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16460711#comment-16460711
 ] 

Istvan Fajth commented on HDFS-13174:
-

Attaching a patch for review.

The patch contains some refactoring to make the iteration time configurable. I 
have added a configuration for the Balancer to control the maximum iteration 
time. That seemed reasonable, although it might not need to be exposed; in this 
initial patch I have exposed it.

Added a test for the Balancer to check that the max iteration time is 
respected. To make the test run in a reasonable timeframe with a reasonable 
amount of resources, I had to use the deprecated 
DFSConfigKeys.DFS_CLIENT_SOCKET_TIMEOUT_KEY. If there is a better way to 
control how often the DN gets back to the client to keep the connection alive, 
I would be glad to know it. This was the only way I found to affect that: the 
newly introduced HdfsClientConfigKeys.DFS_CLIENT_SOCKET_TIMEOUT_KEY is not 
visible in the test package, and I did not find a way to tune the same in the 
DN.

Added a test for Balancer: if in Dispatcher you set the newly added constructor 
parameter to a value higher than 0, for example 200L, the test fails because 
no blocks are moved, as the block moves time out. This was the case with the 
previous constant.
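
To illustrate the idea of passing the iteration limit in through a constructor, here is a minimal sketch. It is not the Dispatcher from the patch: the class name IterationTimeoutSketch, the millisecond unit, and the convention that a value of 0 or less disables the check are all assumptions made for this example.

```java
import java.io.IOException;

/** Hypothetical sketch of a configurable iteration timeout; not the real Dispatcher. */
class IterationTimeoutSketch {
  /** 20 minutes in milliseconds, matching the previously hardwired limit. */
  static final long TWENTY_MINUTES_MS = 20L * 60L * 1000L;

  private final long maxIterationTimeMs;

  /**
   * @param maxIterationTimeMs limit for one iteration; this sketch assumes
   *                           that a value of 0 or less disables the check
   */
  IterationTimeoutSketch(long maxIterationTimeMs) {
    this.maxIterationTimeMs = maxIterationTimeMs;
  }

  /** Returns true only when a limit is set and the iteration ran past it. */
  boolean isTimedOut(long iterationStartMs, long nowMs) {
    return maxIterationTimeMs > 0
        && nowMs - iterationStartMs > maxIterationTimeMs;
  }

  /** Fails a pending move with the message seen in the reported stack trace. */
  void checkPendingMove(long iterationStartMs, long nowMs) throws IOException {
    if (isTimedOut(iterationStartMs, nowMs)) {
      throw new IOException("Block move timed out");
    }
  }

  public static void main(String[] args) {
    IterationTimeoutSketch balancerLike =
        new IterationTimeoutSketch(TWENTY_MINUTES_MS);
    IterationTimeoutSketch moverLike = new IterationTimeoutSketch(0L);
    long start = 0L;
    long after21Minutes = 21L * 60L * 1000L;
    System.out.println(balancerLike.isTimedOut(start, after21Minutes)); // true
    System.out.println(moverLike.isTimedOut(start, after21Minutes));    // false
  }
}
```

With a small positive limit such as 200L, any move that is still pending after 200 ms would be reported as timed out in this model.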

Updating the Jira description as well, as I learned a few things about the issue.

> hdfs mover -p /path times out after 20 min
> --
>
> Key: HDFS-13174
> URL: https://issues.apache.org/jira/browse/HDFS-13174
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.8.0, 2.7.4, 3.0.0-alpha2
>Reporter: Istvan Fajth
>Assignee: Istvan Fajth
>Priority: Major
> Attachments: HDFS-13174.001.patch
>
>
> In HDFS-11015 an iteration timeout was introduced in the Dispatcher.Source 
> class; it is checked while dispatching the moves that the Balancer and the 
> Mover do. This timeout is hardwired to 20 minutes.
> In the Balancer we have iterations, and even if an iteration times out, 
> the Balancer keeps running and does another iteration; it fails only if 
> no moves happened in a few iterations.
> The Mover, on the other hand, does not have iterations, so if moving a path 
> runs for more than 20 minutes, after 20 minutes the Mover will stop with the 
> following exception reported to the console (line numbers might differ as this 
> exception came from a CDH5.12.1 installation):
> java.io.IOException: Block move timed out
> at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:382)
> at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:328)
> at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2500(Dispatcher.java:186)
> at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:956)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)






[jira] [Updated] (HDFS-13174) hdfs mover -p /path times out after 20 min

2018-05-02 Thread Istvan Fajth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Fajth updated HDFS-13174:

Attachment: HDFS-13174.001.patch







[jira] [Commented] (HDFS-13507) RBF: Remove update functionality from routeradmin's add cmd

2018-05-02 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16460693#comment-16460693
 ] 

Yiqun Lin commented on HDFS-13507:
--

One comment: Can we add the update command usage to 
{{HDFSRouterFederation.md}}? This is an incompatible change, so we had better 
update the related documentation.
The rest looks good to me.

> RBF: Remove update functionality from routeradmin's add cmd
> ---
>
> Key: HDFS-13507
> URL: https://issues.apache.org/jira/browse/HDFS-13507
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Wei Yan
>Assignee: Gang Li
>Priority: Minor
>  Labels: incompatible
> Attachments: HDFS-13507.000.patch, HDFS-13507.001.patch, 
> HDFS-13507.002.patch
>
>
> Following up on the discussion in HDFS-13326, we should remove the "update" 
> functionality from routeradmin's add cmd, to make it consistent with RPC 
> calls.
> Note that: this is an incompatible change.
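
The intended split between adding and updating can be sketched as follows. This is a hypothetical model only: the class AddVsUpdateSketch and its add, update and get methods are invented for illustration and do not reflect the routeradmin implementation or its RPC calls.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of separate add and update operations; not routeradmin code. */
class AddVsUpdateSketch {
  private final Map<String, String> mountTable = new HashMap<>();

  /** Creates an entry; returns false and changes nothing if it already exists. */
  boolean add(String mountPoint, String target) {
    return mountTable.putIfAbsent(mountPoint, target) == null;
  }

  /** Modifies an existing entry; returns false if there is nothing to update. */
  boolean update(String mountPoint, String target) {
    return mountTable.replace(mountPoint, target) != null;
  }

  String get(String mountPoint) {
    return mountTable.get(mountPoint);
  }

  public static void main(String[] args) {
    AddVsUpdateSketch table = new AddVsUpdateSketch();
    System.out.println(table.add("/data", "ns1:/data"));    // true
    System.out.println(table.add("/data", "ns2:/data"));    // false, no silent update
    System.out.println(table.update("/data", "ns2:/data")); // true
    System.out.println(table.get("/data"));                 // ns2:/data
  }
}
```

A caller that previously relied on add to overwrite an entry would have to switch to update in this model, which is why the change is incompatible.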






[jira] [Updated] (HDFS-13507) RBF: Remove update functionality from routeradmin's add cmd

2018-05-02 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-13507:
-
Labels: incompatible  (was: )







[jira] [Updated] (HDFS-13488) RBF: Reject requests when a Router is overloaded

2018-05-02 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-13488:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> RBF: Reject requests when a Router is overloaded
> 
>
> Key: HDFS-13488
> URL: https://issues.apache.org/jira/browse/HDFS-13488
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.1
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.4
>
> Attachments: HDFS-13488.000.patch, HDFS-13488.001.patch, 
> HDFS-13488.002.patch, HDFS-13488.003.patch, HDFS-13488.004.patch
>
>
> A Router might be overloaded when handling special cases (e.g. a slow 
> subcluster). The Router could reject the requests and the client could try 
> with another Router. We should leverage the Standby mechanism for this. 
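
The reject-and-retry behaviour described above can be sketched as follows. This is an illustration under assumptions, not the committed change: the names RouterOverloadSketch, RouterOverloadedException, occupyHandler and callWithFailover are hypothetical, and the in-flight counter is a simple stand-in for whatever overload signal the Router actually uses.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model of overload rejection with client failover; not Hadoop code. */
class RouterOverloadSketch {

  /** Hypothetical signal that a router refuses new work. */
  static class RouterOverloadedException extends RuntimeException {
    RouterOverloadedException(String routerName) {
      super(routerName + " is overloaded");
    }
  }

  /** Stand-in for a router with a bounded number of in-flight requests. */
  static class Router {
    private final String name;
    private final int maxInFlight;
    private final AtomicInteger inFlight = new AtomicInteger();

    Router(String name, int maxInFlight) {
      this.name = name;
      this.maxInFlight = maxInFlight;
    }

    /** Simulates a stuck request, for example one waiting on a slow subcluster. */
    void occupyHandler() {
      inFlight.incrementAndGet();
    }

    String handle(String request) {
      if (inFlight.incrementAndGet() > maxInFlight) {
        inFlight.decrementAndGet();
        throw new RouterOverloadedException(name); // reject instead of queueing
      }
      try {
        return name + " served " + request;
      } finally {
        inFlight.decrementAndGet();
      }
    }
  }

  /** Client side: try each router in order until one accepts the request. */
  static String callWithFailover(List<Router> routers, String request) {
    RouterOverloadedException last = null;
    for (Router router : routers) {
      try {
        return router.handle(request);
      } catch (RouterOverloadedException e) {
        last = e; // remember the rejection and move on to the next router
      }
    }
    throw new IllegalStateException("All routers rejected the request", last);
  }

  public static void main(String[] args) {
    Router r1 = new Router("R1", 1);
    Router r2 = new Router("R2", 1);
    r1.occupyHandler(); // R1 is now at capacity
    System.out.println(callWithFailover(Arrays.asList(r1, r2), "getFileInfo"));
  }
}
```

Rejecting early keeps the overloaded router from accumulating more blocked requests, while the client still gets an answer from another router.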






[jira] [Updated] (HDFS-13488) RBF: Reject requests when a Router is overloaded

2018-05-02 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-13488:
-
Affects Version/s: 3.0.1
 Hadoop Flags: Reviewed
 Target Version/s: 3.2.0, 3.1.1
Fix Version/s: 3.0.4
   2.9.2
   3.1.1
   3.2.0
   2.10.0

Committed this to trunk, branch-3.1, branch-3.0, branch-2 and branch-2.9.
Thanks [~elgoiri] for the contribution.







[jira] [Commented] (HDFS-13488) RBF: Reject requests when a Router is overloaded

2018-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16460618#comment-16460618
 ] 

Hudson commented on HDFS-13488:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14102 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14102/])
HDFS-13488. RBF: Reject requests when a Router is overloaded. (yqlin: rev 
37269261d1232bc71708f30c76193188258ef4bd)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/FederationTestUtils.java
* (delete) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterSafeModeException.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcClient.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/metrics/FederationRPCPerformanceMonitor.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterRPCClientRetries.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcMonitor.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/metrics/FederationRPCMetrics.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/StateStoreDFSCluster.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterSafemode.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RBFConfigKeys.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/resources/hdfs-rbf-default.xml
* (add) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterClientRejectOverload.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/metrics/FederationRPCMBean.java


> RBF: Reject requests when a Router is overloaded
> 
>
> Key: HDFS-13488
> URL: https://issues.apache.org/jira/browse/HDFS-13488
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-13488.000.patch, HDFS-13488.001.patch, 
> HDFS-13488.002.patch, HDFS-13488.003.patch, HDFS-13488.004.patch
>
>
> A Router might be overloaded when handling special cases (e.g. a slow 
> subcluster). The Router could reject the requests and the client could try 
> with another Router. We should leverage the Standby mechanism for this. 






[jira] [Commented] (HDFS-13488) RBF: Reject requests when a Router is overloaded

2018-05-02 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16460587#comment-16460587
 ] 

Yiqun Lin commented on HDFS-13488:
--

LGTM, +1. Committing.



