[jira] [Commented] (HDFS-13933) [JDK 11] SWebhdfsFileSystem related tests fail with hostname verification problems for "localhost"

2019-04-30 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830266#comment-16830266
 ] 

Kitti Nanasi commented on HDFS-13933:
-

I was not correct in my previous comment. Looking into it a bit more, these 
tests fail because of a "javax.net.ssl.SSLPeerUnverifiedException: peer not 
authenticated" exception thrown where sslSession.getPeerCertificates() is 
invoked (it is called in three different places in our code).

I think it is caused by the following bugs in OpenJDK:

[https://bugs.openjdk.java.net/browse/JDK-8212885]

[https://bugs.openjdk.java.net/browse/JDK-8220723]

The issue affects OpenJDK 11.0.2, and it seems the fix was backported to OpenJDK 
11.0.3 and OpenJDK 12.0.1. I verified that these tests pass with OpenJDK 12.0.1.
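For anyone reproducing this: HttpsURLConnection consults a javax.net.ssl.HostnameVerifier after the TLS handshake, and a common test-only workaround for this class of failure is to install a verifier that accepts the test host explicitly. The sketch below is a hypothetical illustration of that hook, not the fix applied for this issue:

```java
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLSession;

// Hypothetical test-only verifier: accept "localhost" regardless of what
// the self-signed certificate says. Never use this outside of tests.
public class LocalhostVerifier implements HostnameVerifier {
    @Override
    public boolean verify(String hostname, SSLSession session) {
        return "localhost".equals(hostname);
    }

    // Installs the verifier as the JVM-wide default for HttpsURLConnection.
    public static void install() {
        HttpsURLConnection.setDefaultHostnameVerifier(new LocalhostVerifier());
    }
}
```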

 

> [JDK 11] SWebhdfsFileSystem related tests fail with hostname verification 
> problems for "localhost"
> --
>
> Key: HDFS-13933
> URL: https://issues.apache.org/jira/browse/HDFS-13933
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Andrew Purtell
>Priority: Minor
>
> Tests with issues:
> * TestHttpFSFWithSWebhdfsFileSystem
> * TestWebHdfsTokens
> * TestSWebHdfsFileContextMainOperations
> Possibly others. Failure looks like 
> {noformat}
> java.io.IOException: localhost:50260: HTTPS hostname wrong:  should be 
> 
> {noformat}
> These tests set up a trust store and use HTTPS connections, and with Java 11 
> the client validation of the server name in the generated self-signed 
> certificate is failing. Exceptions originate in the JRE's HTTP client 
> library. How everything hooks together uses static initializers, static 
> methods, JUnit MethodRules... There's a lot to unpack, not sure how to fix. 
> This is Java 11+28



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13933) [JDK 11] SWebhdfsFileSystem related tests fail with hostname verification problems for "localhost"

2019-04-26 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826978#comment-16826978
 ] 

Kitti Nanasi commented on HDFS-13933:
-

The affected tests all use the HttpsURLConnection and HttpURLConnection 
classes, which have a better alternative in JDK 11. We might need to use the 
new HttpClient instead, but let's see if we can fix the current implementation 
first.

Related article: 
[https://dzone.com/articles/java-11-standardized-http-client-api]
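For context, usage of the JDK 11 replacement looks roughly like the sketch below. This is a hypothetical example of the standard java.net.http API, not code from any patch; the URL is a placeholder. Unlike HttpsURLConnection, HttpClient does hostname verification through the standard SSLContext/SSLParameters endpoint identification rather than a HostnameVerifier.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

public class HttpClientSketch {
    // Builds an immutable GET request with the JDK 11 java.net.http API.
    public static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder()
                .uri(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
    }

    // Creates a reusable client instance.
    public static HttpClient newClient() {
        return HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(10))
                .build();
    }
    // To actually issue the call (network I/O, so not done here):
    // newClient().send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
}
```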







[jira] [Commented] (HDFS-13933) [JDK 11] SWebhdfsFileSystem related tests fail with hostname verification problems for "localhost"

2019-04-26 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826866#comment-16826866
 ] 

Kitti Nanasi commented on HDFS-13933:
-

Thanks [~apurtell] for reporting this issue and [~smeng] for the further 
details! It seems like all three tests fail with OpenJDK 11, but they succeed 
with Zulu JDK 11.







[jira] [Commented] (HDDS-1192) Support -conf command line argument in GenericCli

2019-04-15 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16817916#comment-16817916
 ] 

Kitti Nanasi commented on HDDS-1192:


[~elek], could you review this?

> Support -conf command line argument in GenericCli
> -
>
> Key: HDDS-1192
> URL: https://issues.apache.org/jira/browse/HDDS-1192
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Kitti Nanasi
>Priority: Major
>  Labels: newbie, pull-request-available
> Attachments: HDDS-1192.001.patch, HDDS-1192.002.patch, 
> HDDS-1192.003.patch, HDDS-1192.004.patch, HDDS-1192.005.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> org.apache.hadoop.hdds.GenericCli is the base class for all of the Ozone-related 
> command line applications. It supports defining custom configuration 
> variables (-D or --set) but doesn't support the '--conf ozone-site.xml' 
> argument for loading an external xml file into the configuration.
> The Configuration and OzoneConfiguration classes load ozone-site.xml from the 
> classpath, but that makes it very hard to start Ozone components in an IDE, as 
> we can't modify the classpath easily. 
> One option here is to support --conf everywhere to make it possible to 
> start an ozone cluster in the IDE. 
> Note: it's a nice-to-have for 0.4.0. I marked it as 0.5.0, but it is safe to 
> commit to 0.4.0 at any time.
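As background, a Hadoop-style configuration file is just a list of property name/value pairs, and a --conf implementation ultimately feeds such pairs into OzoneConfiguration. The standard-library sketch below only illustrates that shape (the real code would call Hadoop's Configuration.addResource instead, and the property key shown is a placeholder):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ConfXmlSketch {
    // Parses the name/value pairs of a Hadoop-style *-site.xml document.
    public static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList properties = doc.getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                String name = property.getElementsByTagName("name")
                        .item(0).getTextContent().trim();
                String value = property.getElementsByTagName("value")
                        .item(0).getTextContent().trim();
                props.put(name, value);
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a valid configuration file", e);
        }
    }
}
```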






[jira] [Updated] (HDDS-1192) Support -conf command line argument in GenericCli

2019-04-15 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDDS-1192:
---
Attachment: HDDS-1192.005.patch







[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.

2019-04-15 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16817679#comment-16817679
 ] 

Kitti Nanasi commented on HDFS-14353:
-

Thanks [~maobaolong] for reporting this issue and providing a fix!

Could you add a unit test? I think TestReconstructStripedFile could be extended 
with this check.

> Erasure Coding: metrics xmitsInProgress become to negative.
> ---
>
> Key: HDFS-14353
> URL: https://issues.apache.org/jira/browse/HDFS-14353
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, erasure-coding
>Affects Versions: 3.3.0
>Reporter: maobaolong
>Assignee: maobaolong
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14353.001.patch, screenshot-1.png
>
>







[jira] [Updated] (HDDS-1192) Support -conf command line argument in GenericCli

2019-04-12 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDDS-1192:
---
Attachment: HDDS-1192.004.patch







[jira] [Updated] (HDDS-1192) Support -conf command line argument in GenericCli

2019-04-12 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDDS-1192:
---
Attachment: HDDS-1192.003.patch







[jira] [Commented] (HDDS-1192) Support -conf command line argument in GenericCli

2019-04-11 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815660#comment-16815660
 ] 

Kitti Nanasi commented on HDDS-1192:


I fixed the findbugs issue in patch v002. The test failures are not related.







[jira] [Updated] (HDDS-1192) Support -conf command line argument in GenericCli

2019-04-11 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDDS-1192:
---
Attachment: HDDS-1192.002.patch







[jira] [Assigned] (HDFS-14060) HDFS fetchdt command to return error codes on success/failure

2019-04-11 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi reassigned HDFS-14060:
---

Assignee: (was: Kitti Nanasi)

> HDFS fetchdt command to return error codes on success/failure
> -
>
> Key: HDFS-14060
> URL: https://issues.apache.org/jira/browse/HDFS-14060
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Priority: Major
>
> The {{hdfs fetchdt}} command always returns 0, even when there's been an 
> error (no token issued, no file to load, usage, etc). This means it's not that 
> useful as a command line tool for testing or in scripts.
> Proposed: exit non-zero for errors; reuse LauncherExitCodes for these.
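The proposal amounts to a small outcome-to-exit-code mapping. The sketch below uses invented code values and method names purely for illustration; an actual change would reuse the constants in Hadoop's LauncherExitCodes:

```java
public class FetchdtExitSketch {
    // Hypothetical exit codes; a real change would reuse the constants
    // defined in org.apache.hadoop.service.launcher.LauncherExitCodes.
    public static final int EXIT_SUCCESS = 0;
    public static final int EXIT_USAGE = 1;     // bad or missing arguments
    public static final int EXIT_NO_TOKEN = 2;  // no token was issued

    // Maps the outcome of a fetchdt run to a process exit code instead of
    // always returning 0.
    public static int exitCode(boolean usageError, boolean tokenIssued) {
        if (usageError) {
            return EXIT_USAGE;
        }
        return tokenIssued ? EXIT_SUCCESS : EXIT_NO_TOKEN;
    }
}
```

With codes like these, scripts can branch on `$?` instead of parsing the command's output.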






[jira] [Assigned] (HDFS-14115) TestNamenodeCapacityReport#testXceiverCount is flaky

2019-04-11 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi reassigned HDFS-14115:
---

Assignee: (was: Kitti Nanasi)

> TestNamenodeCapacityReport#testXceiverCount is flaky
> 
>
> Key: HDFS-14115
> URL: https://issues.apache.org/jira/browse/HDFS-14115
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Priority: Major
>
> TestNamenodeCapacityReport#testXceiverCount sometimes fails with the 
> following error:
> {code}
> 2018-11-28 17:33:45,816 INFO  DataNode - PacketResponder: 
> BP-645736292-172.17.0.2-1543426416580:blk_1073741828_1004, 
> type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=2:[127.0.0.1:37115, 
> 127.0.0.1:35107] terminating
> 2018-11-28 17:33:45,817 INFO  StateChange - DIR* completeFile: /f3 is closed 
> by DFSClient_NONMAPREDUCE_1933849415_1
> 2018-11-28 17:33:45,817 INFO  ExitUtil - Exiting with status 1: Block report 
> processor encountered fatal exception: java.lang.AssertionError: Negative 
> replicas!
> 2018-11-28 17:33:45,818 ERROR ExitUtil - Terminate called
> 1: Block report processor encountered fatal exception: 
> java.lang.AssertionError: Negative replicas!
>   at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:4807)
> Exception in thread "Block report processor" 1: Block report processor 
> encountered fatal exception: java.lang.AssertionError: Negative replicas!
>   at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:4807)
> {code}






[jira] [Commented] (HDDS-1192) Support -conf command line argument in GenericCli

2019-04-09 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813412#comment-16813412
 ] 

Kitti Nanasi commented on HDDS-1192:


Patch v001 contains the following modifications; I also did some additional 
refactoring so that the unit tests pass:
 * GenericCli now accepts the -conf argument.
 * The usage field is removed from MissingSubcommandException, because it now 
derives from CommandLine.ParameterException, which means that when the commands 
are run with the default exception handler (the default exception handler is 
used everywhere except in the tests), picocli will print the usage.
 * The startup message is written inside the ozone 'datanode' command and not 
in the constructor, so if the command is invalid, it won't print the startup 
message.
 * I changed it so that if the 'ozone datanode' command is run by itself, it 
will not throw an invalid command exception; it will only fail if it is used 
with an invalid argument (like 'ozone datanode -invalidArg'). Let me know if 
that is not ok.

Note that the subcommands, like volume and bucket, do not derive from 
GenericCli, so the -conf parameter has to come before those commands, like this: 
'ozone sh -conf conf volume...'
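The ordering constraint in that note can be illustrated with a stand-alone sketch. This is plain argument scanning with invented names, not the patch itself (the actual change wires -conf through a picocli option on GenericCli):

```java
public class ConfArgSketch {
    // Returns the value following the first "-conf" flag, mimicking how a
    // global option is only recognized before the first subcommand.
    public static String findConfPath(String[] args) {
        for (int i = 0; i + 1 < args.length; i++) {
            if ("-conf".equals(args[i])) {
                return args[i + 1];
            }
            if (!args[i].startsWith("-")) {
                // First subcommand reached: global options must come earlier.
                return null;
            }
        }
        return null;
    }
}
```

So `-conf ozone-site.xml volume list` is picked up, while `volume -conf ozone-site.xml` is not, matching the behavior described above.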







[jira] [Updated] (HDDS-1192) Support -conf command line argument in GenericCli

2019-04-09 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDDS-1192:
---
Status: Patch Available  (was: Open)







[jira] [Updated] (HDDS-1192) Support -conf command line argument in GenericCli

2019-04-09 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDDS-1192:
---
Attachment: HDDS-1192.001.patch







[jira] [Updated] (HDFS-14404) Reduce KMS error logging severity from WARN to INFO

2019-04-02 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14404:

Status: Patch Available  (was: Open)

> Reduce KMS error logging severity from WARN to INFO
> ---
>
> Key: HDFS-14404
> URL: https://issues.apache.org/jira/browse/HDFS-14404
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 3.2.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Trivial
> Attachments: HDFS-14404.001.patch
>
>
> When the KMS is deployed as an HA service and a failure occurs, the current 
> error severity in the client code appears to be WARN. This can result in 
> excessive errors despite the fact that another instance may succeed.
> Maybe this log level can be adjusted in only the load balancing provider.
> {code}
> 19/02/27 05:10:10 WARN kms.LoadBalancingKMSClientProvider: KMS provider at 
> [https://example.com:16000/kms/v1/] threw an IOException 
> [java.net.ConnectException: Connection refused (Connection refused)]!!
> 19/02/12 20:50:09 WARN kms.LoadBalancingKMSClientProvider: KMS provider at 
> [https://example.com:16000/kms/v1/] threw an IOException:
> java.io.IOException: java.lang.reflect.UndeclaredThrowableException
> {code}
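The requested severity split might look like the following sketch. It is an illustration using java.util.logging with invented names; the real LoadBalancingKMSClientProvider uses SLF4J and its own retry and failover logic:

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

public class FailoverLoggingSketch {
    private static final Logger LOG =
            Logger.getLogger(FailoverLoggingSketch.class.getName());

    // Tries each provider in turn. A single provider failing is expected in
    // an HA deployment, so it is logged at INFO; only when every provider
    // has failed do we escalate to WARNING and rethrow.
    public static <T> T doOp(List<Callable<T>> providers) {
        Exception last = null;
        for (Callable<T> provider : providers) {
            try {
                return provider.call();
            } catch (Exception e) {
                LOG.log(Level.INFO, "Provider failed, trying the next one: {0}",
                        e.toString());
                last = e;
            }
        }
        LOG.log(Level.WARNING, "All providers failed");
        throw new RuntimeException("All providers failed", last);
    }
}
```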






[jira] [Updated] (HDFS-14404) Reduce KMS error logging severity from WARN to INFO

2019-04-02 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14404:

Attachment: HDFS-14404.001.patch







[jira] [Created] (HDFS-14404) Reduce KMS error logging severity from WARN to INFO

2019-04-02 Thread Kitti Nanasi (JIRA)
Kitti Nanasi created HDFS-14404:
---

 Summary: Reduce KMS error logging severity from WARN to INFO
 Key: HDFS-14404
 URL: https://issues.apache.org/jira/browse/HDFS-14404
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: kms
Affects Versions: 3.2.0
Reporter: Kitti Nanasi
Assignee: Kitti Nanasi


When the KMS is deployed as an HA service and a failure occurs, the current 
error severity in the client code appears to be WARN. This can result in 
excessive errors despite the fact that another instance may succeed.

Maybe this log level can be adjusted in only the load balancing provider.

{code}
19/02/27 05:10:10 WARN kms.LoadBalancingKMSClientProvider: KMS provider at 
[https://example.com:16000/kms/v1/] threw an IOException 
[java.net.ConnectException: Connection refused (Connection refused)]!!

19/02/12 20:50:09 WARN kms.LoadBalancingKMSClientProvider: KMS provider at 
[https://example.com:16000/kms/v1/] threw an IOException:
java.io.IOException: java.lang.reflect.UndeclaredThrowableException
{code}






[jira] [Updated] (HDDS-1153) Make tracing instrumentation configurable

2019-04-01 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDDS-1153:
---
Attachment: HDDS-1153.001.patch

> Make tracing instrumentation configurable
> -
>
> Key: HDDS-1153
> URL: https://issues.apache.org/jira/browse/HDDS-1153
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Kitti Nanasi
>Priority: Major
>  Labels: newbie, pull-request-available
> Attachments: HDDS-1153.001.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> TracingUtil.createProxy is a helper method that creates a proxy instance with 
> tracing support.
> The proxy instance implements the same interface as the original class and 
> delegates all method calls to the original instance, but it also sends 
> tracing information to the tracing server.
> By default it's not a big overhead, as the tracing libraries can be configured 
> to send tracing only with some low probability.
> But to make it safer we can make it optional. With a global 
> 'hdds.tracing.enabled' configuration variable (true by default) we can 
> adjust the behavior of TracingUtil.createProxy.
> If tracing is disabled, TracingUtil.createProxy should return the 
> 'delegate' parameter instead of a proxy.
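The described guard amounts to a one-line check in the proxy factory. A minimal sketch with a JDK dynamic proxy (a simplified stand-in for TracingUtil.createProxy; the tracing-span bookkeeping is reduced to a comment):

```java
import java.lang.reflect.Proxy;

public class TracingProxySketch {
    // Returns the delegate untouched when tracing is disabled; otherwise
    // wraps it in a dynamic proxy that forwards every call (a real
    // implementation would open and close a tracing span around the call).
    public static Object createProxy(Object delegate, Class<?> iface,
                                     boolean tracingEnabled) {
        if (!tracingEnabled) {
            return delegate;  // zero overhead: no proxy is created at all
        }
        return Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[]{iface},
                (proxy, method, args) -> method.invoke(delegate, args));
    }
}
```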






[jira] [Updated] (HDDS-1153) Make tracing instrumentation configurable

2019-04-01 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDDS-1153:
---
Status: Patch Available  (was: Open)

> Make tracing instrumentation configurable
> -
>
> Key: HDDS-1153
> URL: https://issues.apache.org/jira/browse/HDDS-1153
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Kitti Nanasi
>Priority: Major
>  Labels: newbie, pull-request-available
> Attachments: HDDS-1153.001.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> TracingUtil.createProxy is a helper method to create a proxy instance with 
> tracing support.
> The proxy instance implements the same interface as the original class and 
> delegates all the method calls to the original instance, but it also sends 
> tracing information to the tracing server.
> By default this is not a big overhead, as the tracing libraries can be configured 
> to send traces only with some low probability.
> But to make it safer, we can make it optional. With a global 
> 'hdds.tracing.enabled' configuration variable (which can be true by default) we can 
> adjust the behavior of TracingUtil.createProxy.
> If tracing is disabled, TracingUtil.createProxy should return the 
> 'delegate' parameter instead of a proxy.






[jira] [Commented] (HDDS-1153) Make tracing instrumentation configurable

2019-04-01 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806684#comment-16806684
 ] 

Kitti Nanasi commented on HDDS-1153:


I think that's a good idea, [~jojochuang]; I created HDDS-1364 for it.

> Make tracing instrumentation configurable
> -
>
> Key: HDDS-1153
> URL: https://issues.apache.org/jira/browse/HDDS-1153
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Kitti Nanasi
>Priority: Major
>  Labels: newbie
>
> TracingUtil.createProxy is a helper method to create a proxy instance with 
> tracing support.
> The proxy instance implements the same interface as the original class and 
> delegates all the method calls to the original instance, but it also sends 
> tracing information to the tracing server.
> By default this is not a big overhead, as the tracing libraries can be configured 
> to send traces only with some low probability.
> But to make it safer, we can make it optional. With a global 
> 'hdds.tracing.enabled' configuration variable (which can be true by default) we can 
> adjust the behavior of TracingUtil.createProxy.
> If tracing is disabled, TracingUtil.createProxy should return the 
> 'delegate' parameter instead of a proxy.






[jira] [Created] (HDDS-1364) Make OpenTracing implementation configurable

2019-04-01 Thread Kitti Nanasi (JIRA)
Kitti Nanasi created HDDS-1364:
--

 Summary: Make OpenTracing implementation configurable
 Key: HDDS-1364
 URL: https://issues.apache.org/jira/browse/HDDS-1364
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Affects Versions: 0.4.0
Reporter: Kitti Nanasi


This issue comes from 
[HDDS-1153|https://issues.apache.org/jira/browse/HDDS-1153]. Currently the Jaeger 
OpenTracing implementation is used. It would be nice if we could configure the 
OpenTracing implementation.






[jira] [Assigned] (HDDS-1192) Support -conf command line argument in GenericCli

2019-03-11 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi reassigned HDDS-1192:
--

Assignee: Kitti Nanasi

> Support -conf command line argument in GenericCli
> -
>
> Key: HDDS-1192
> URL: https://issues.apache.org/jira/browse/HDDS-1192
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Kitti Nanasi
>Priority: Major
>  Labels: newbie
>
> org.apache.hadoop.hdds.GenericCli is the base class for all the Ozone 
> related command line applications. It supports defining custom configuration 
> variables (-D or --set) but doesn't support the '--conf ozone-site.xml' 
> argument to load an external xml file into the configuration.
> The Configuration and OzoneConfiguration classes load the ozone-site.xml from the 
> classpath, but that makes it very hard to start Ozone components in an IDE, as we 
> can't modify the classpath easily. 
> One option here is to support --conf everywhere to make it possible to 
> start an Ozone cluster in the IDE. 
> Note: it's a nice-to-have for 0.4.0. I marked it as 0.5.0, but it is safe to 
> commit to 0.4.0 at any time.
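The requested behavior can be sketched with a plain argument loop. This is a hedged illustration rather than GenericCli's actual picocli-based parsing: `ConfArgDemo`, the `conf.file` key, and the commented `addResource` call are invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfArgDemo {
    /**
     * Minimal sketch of the proposed behavior: pick up '--conf <file>' from
     * the command line and record it as an extra configuration resource,
     * alongside the already supported '-D key=value' style overrides.
     */
    static Map<String, String> parse(String[] args) {
        Map<String, String> conf = new HashMap<>();
        for (int i = 0; i < args.length; i++) {
            if ("--conf".equals(args[i]) && i + 1 < args.length) {
                // A real implementation would load the XML file here, e.g.
                // configuration.addResource(new Path(args[i + 1]));
                conf.put("conf.file", args[++i]);
            } else if ("-D".equals(args[i]) && i + 1 < args.length) {
                String[] kv = args[++i].split("=", 2);
                conf.put(kv[0], kv.length > 1 ? kv[1] : "");
            }
        }
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> c =
            parse(new String[]{"--conf", "ozone-site.xml", "-D", "a=b"});
        System.out.println(c.get("conf.file"));  // ozone-site.xml
    }
}
```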






[jira] [Commented] (HDFS-14308) DFSStripedInputStream should implement unbuffer()

2019-02-26 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777837#comment-16777837
 ] 

Kitti Nanasi commented on HDFS-14308:
-

Thanks [~joemcdonnell] for reporting the issue!

DFSStripedInputStream does implement unbuffer(), since it derives from 
DFSInputStream, whose unbuffer() calls closeCurrentBlockReaders(); that method 
clears all the block readers in DFSStripedInputStream, so that part should be fine.

I think the issue might be that either CryptoInputStream or HdfsDataInputStream 
does not call the correct unbuffer method, but I have to check that to be sure.
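The delegation chain under suspicion can be illustrated with a minimal sketch. `CanUnbuffer` here is a local stand-in for Hadoop's `org.apache.hadoop.fs.CanUnbuffer`, and the two stream classes are hypothetical; the point is that a wrapping stream which fails to forward unbuffer() leaves the inner stream's buffers allocated.

```java
import java.io.FilterInputStream;
import java.io.InputStream;

public class UnbufferDemo {
    /** Local stand-in for org.apache.hadoop.fs.CanUnbuffer. */
    interface CanUnbuffer { void unbuffer(); }

    /** A buffered stream that can drop its buffers on demand. */
    static class BufferingStream extends InputStream implements CanUnbuffer {
        byte[] buffer = new byte[4096];
        @Override public int read() { return -1; }
        @Override public void unbuffer() { buffer = null; }  // release memory
    }

    /** Wrapper that must forward unbuffer() to the wrapped stream. */
    static class WrappingStream extends FilterInputStream implements CanUnbuffer {
        WrappingStream(InputStream in) { super(in); }
        @Override public void unbuffer() {
            // Forgetting this delegation would keep the inner buffers alive,
            // which is the kind of leak suspected in the wrapper streams.
            if (in instanceof CanUnbuffer) {
                ((CanUnbuffer) in).unbuffer();
            }
        }
    }

    public static void main(String[] args) {
        BufferingStream inner = new BufferingStream();
        WrappingStream outer = new WrappingStream(inner);
        outer.unbuffer();
        System.out.println(inner.buffer == null);  // prints true
    }
}
```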

> DFSStripedInputStream should implement unbuffer()
> -
>
> Key: HDFS-14308
> URL: https://issues.apache.org/jira/browse/HDFS-14308
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Joe McDonnell
>Priority: Major
> Attachments: ec_heap_dump.png
>
>
> Some users of HDFS cache opened HDFS file handles to avoid repeated 
> roundtrips to the NameNode. For example, Impala caches up to 20,000 HDFS file 
> handles by default. Recent tests on erasure coded files show that the open 
> file handles can consume a large amount of memory when not in use.
> For example, here is output from Impala's JMX endpoint when 608 file handles 
> are cached
> {noformat}
> {
> "name": "java.nio:type=BufferPool,name=direct",
> "modelerType": "sun.management.ManagementFactoryHelper$1",
> "Name": "direct",
> "TotalCapacity": 1921048960,
> "MemoryUsed": 1921048961,
> "Count": 633,
> "ObjectName": "java.nio:type=BufferPool,name=direct"
> },{noformat}
> This shows direct buffer memory usage of 3MB per DFSStripedInputStream. 
> Attached is output from Eclipse MAT showing that the direct buffers come from 
> DFSStripedInputStream objects.
> To support caching file handles on erasure coded files, DFSStripedInputStream 
> should implement the unbuffer() call. See HDFS-7694. "unbuffer()" is intended 
> to move an input stream to a lower memory state to support these caching use 
> cases. Both Impala and HBase call unbuffer() when a file handle is being 
> cached and potentially unused for significant chunks of time.






[jira] [Commented] (HDFS-14298) Improve log messages of ECTopologyVerifier

2019-02-21 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774294#comment-16774294
 ] 

Kitti Nanasi commented on HDFS-14298:
-

Thanks [~surendrasingh] for the comment!

I refactored the test in patch v004 so that it does not use hardcoded policies.

> Improve log messages of ECTopologyVerifier
> --
>
> Key: HDFS-14298
> URL: https://issues.apache.org/jira/browse/HDFS-14298
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Minor
> Attachments: HDFS-14298.001.patch, HDFS-14298.002.patch, 
> HDFS-14298.003.patch, HDFS-14298.004.patch
>
>







[jira] [Updated] (HDFS-14298) Improve log messages of ECTopologyVerifier

2019-02-21 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14298:

Attachment: HDFS-14298.004.patch

> Improve log messages of ECTopologyVerifier
> --
>
> Key: HDFS-14298
> URL: https://issues.apache.org/jira/browse/HDFS-14298
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Minor
> Attachments: HDFS-14298.001.patch, HDFS-14298.002.patch, 
> HDFS-14298.003.patch, HDFS-14298.004.patch
>
>







[jira] [Commented] (HDFS-14298) Improve log messages of ECTopologyVerifier

2019-02-20 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773156#comment-16773156
 ] 

Kitti Nanasi commented on HDFS-14298:
-

Added patch v003 to fix the TestECAdmin failures introduced by the latest trunk.

> Improve log messages of ECTopologyVerifier
> --
>
> Key: HDFS-14298
> URL: https://issues.apache.org/jira/browse/HDFS-14298
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Minor
> Attachments: HDFS-14298.001.patch, HDFS-14298.002.patch, 
> HDFS-14298.003.patch
>
>







[jira] [Updated] (HDFS-14298) Improve log messages of ECTopologyVerifier

2019-02-20 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14298:

Attachment: HDFS-14298.003.patch

> Improve log messages of ECTopologyVerifier
> --
>
> Key: HDFS-14298
> URL: https://issues.apache.org/jira/browse/HDFS-14298
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Minor
> Attachments: HDFS-14298.001.patch, HDFS-14298.002.patch, 
> HDFS-14298.003.patch
>
>







[jira] [Commented] (HDFS-14298) Improve log messages of ECTopologyVerifier

2019-02-20 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16772985#comment-16772985
 ] 

Kitti Nanasi commented on HDFS-14298:
-

Thanks for the review [~shwetayakkali]!

TestNameNodeMXBean failed because of patch v001, so I fixed that in patch 
v002. The other test failures are unrelated.

> Improve log messages of ECTopologyVerifier
> --
>
> Key: HDFS-14298
> URL: https://issues.apache.org/jira/browse/HDFS-14298
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Minor
> Attachments: HDFS-14298.001.patch, HDFS-14298.002.patch
>
>







[jira] [Updated] (HDFS-14298) Improve log messages of ECTopologyVerifier

2019-02-20 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14298:

Attachment: HDFS-14298.002.patch

> Improve log messages of ECTopologyVerifier
> --
>
> Key: HDFS-14298
> URL: https://issues.apache.org/jira/browse/HDFS-14298
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Minor
> Attachments: HDFS-14298.001.patch, HDFS-14298.002.patch
>
>







[jira] [Updated] (HDFS-14298) Improve log messages of ECTopologyVerifier

2019-02-19 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14298:

Status: Patch Available  (was: Open)

> Improve log messages of ECTopologyVerifier
> --
>
> Key: HDFS-14298
> URL: https://issues.apache.org/jira/browse/HDFS-14298
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Minor
> Attachments: HDFS-14298.001.patch
>
>







[jira] [Created] (HDFS-14298) Improve log messages of ECTopologyVerifier

2019-02-19 Thread Kitti Nanasi (JIRA)
Kitti Nanasi created HDFS-14298:
---

 Summary: Improve log messages of ECTopologyVerifier
 Key: HDFS-14298
 URL: https://issues.apache.org/jira/browse/HDFS-14298
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Kitti Nanasi
Assignee: Kitti Nanasi









[jira] [Updated] (HDFS-14298) Improve log messages of ECTopologyVerifier

2019-02-19 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14298:

Attachment: HDFS-14298.001.patch

> Improve log messages of ECTopologyVerifier
> --
>
> Key: HDFS-14298
> URL: https://issues.apache.org/jira/browse/HDFS-14298
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Minor
> Attachments: HDFS-14298.001.patch
>
>







[jira] [Commented] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter

2019-02-18 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16771214#comment-16771214
 ] 

Kitti Nanasi commented on HDFS-14188:
-

Thanks for the comment [~jojochuang]!

Yes, that was a typo; I corrected it in patch v005.

You are right, the usage with multiple policies was very confusing. I corrected 
the implementation so that quotes are no longer needed. It should work like this 
in the new patch:
{code:java}
-verifyClusterSetup -policy RS-3-2-1024k RS-10-4-1024k
{code}

> Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a 
> parameter
> ---
>
> Key: HDFS-14188
> URL: https://issues.apache.org/jira/browse/HDFS-14188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14188.001.patch, HDFS-14188.002.patch, 
> HDFS-14188.003.patch, HDFS-14188.004.patch, HDFS-14188.005.patch
>
>
> The hdfs ec -verifyClusterSetup command verifies whether there are enough data 
> nodes and racks for the enabled erasure coding policies.
> I think it would be beneficial if it could optionally accept an erasure coding 
> policy as a parameter. For example, the following command would run the 
> verification only for the RS-6-3-1024k policy.
> {code:java}
> hdfs ec -verifyClusterSetup -policy RS-6-3-1024k
> {code}
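The kind of check -verifyClusterSetup performs for a single policy can be sketched as follows. This is a simplified rule invented for illustration: it assumes an RS-d-p policy needs at least d+p datanodes and d+p racks (the real ECTopologyVerifier applies a more nuanced rack calculation), and the `EcVerifyDemo` class with its 'RS-d-p-cellsize' name parsing is hypothetical.

```java
public class EcVerifyDemo {
    /**
     * Simplified verification for one erasure coding policy: an RS(d,p)
     * policy stripes each block group across d data and p parity blocks,
     * so placing every block on a distinct node needs at least d+p
     * datanodes (and, ideally, d+p racks for rack fault tolerance).
     */
    static String verify(String policy, int dataNodes, int racks) {
        String[] parts = policy.split("-");          // e.g. RS-6-3-1024k
        int needed = Integer.parseInt(parts[1]) + Integer.parseInt(parts[2]);
        if (dataNodes < needed) {
            return policy + ": not enough datanodes (" + dataNodes
                + " < " + needed + ")";
        }
        if (racks < needed) {
            return policy + ": not enough racks (" + racks
                + " < " + needed + ")";
        }
        return policy + ": OK";
    }

    public static void main(String[] args) {
        System.out.println(verify("RS-6-3-1024k", 5, 2));
        System.out.println(verify("RS-3-2-1024k", 9, 9));  // RS-3-2-1024k: OK
    }
}
```

With an optional -policy argument, the command would run this check only for the named policies instead of every enabled one.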






[jira] [Updated] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter

2019-02-18 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14188:

Attachment: HDFS-14188.005.patch

> Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a 
> parameter
> ---
>
> Key: HDFS-14188
> URL: https://issues.apache.org/jira/browse/HDFS-14188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14188.001.patch, HDFS-14188.002.patch, 
> HDFS-14188.003.patch, HDFS-14188.004.patch, HDFS-14188.005.patch
>
>
> hdfs ec -verifyClusterSetup command verifies if there are enough data nodes 
> and racks for the enabled erasure coding policies
> I think it would be beneficial if it could accept an erasure coding policy as 
> a parameter optionally. For example the following command would run the 
> verify for only the RS-6-3-1024k policy.
> {code:java}
> hdfs ec -verifyClusterSetup -policy RS-6-3-1024k
> {code}






[jira] [Updated] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter

2019-02-18 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14188:

Attachment: (was: HDFS-14188.005.patch)

> Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a 
> parameter
> ---
>
> Key: HDFS-14188
> URL: https://issues.apache.org/jira/browse/HDFS-14188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14188.001.patch, HDFS-14188.002.patch, 
> HDFS-14188.003.patch, HDFS-14188.004.patch
>
>
> hdfs ec -verifyClusterSetup command verifies if there are enough data nodes 
> and racks for the enabled erasure coding policies
> I think it would be beneficial if it could accept an erasure coding policy as 
> a parameter optionally. For example the following command would run the 
> verify for only the RS-6-3-1024k policy.
> {code:java}
> hdfs ec -verifyClusterSetup -policy RS-6-3-1024k
> {code}






[jira] [Updated] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter

2019-02-18 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14188:

Attachment: HDFS-14188.005.patch

> Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a 
> parameter
> ---
>
> Key: HDFS-14188
> URL: https://issues.apache.org/jira/browse/HDFS-14188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14188.001.patch, HDFS-14188.002.patch, 
> HDFS-14188.003.patch, HDFS-14188.004.patch, HDFS-14188.005.patch
>
>
> hdfs ec -verifyClusterSetup command verifies if there are enough data nodes 
> and racks for the enabled erasure coding policies
> I think it would be beneficial if it could accept an erasure coding policy as 
> a parameter optionally. For example the following command would run the 
> verify for only the RS-6-3-1024k policy.
> {code:java}
> hdfs ec -verifyClusterSetup -policy RS-6-3-1024k
> {code}






[jira] [Commented] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter

2019-02-14 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16768128#comment-16768128
 ] 

Kitti Nanasi commented on HDFS-14188:
-

Thanks for the comment [~shwetayakkali]!

I fixed the checkstyle issues in patch v004 and merged the tests into one.

> Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a 
> parameter
> ---
>
> Key: HDFS-14188
> URL: https://issues.apache.org/jira/browse/HDFS-14188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14188.001.patch, HDFS-14188.002.patch, 
> HDFS-14188.003.patch, HDFS-14188.004.patch
>
>
> hdfs ec -verifyClusterSetup command verifies if there are enough data nodes 
> and racks for the enabled erasure coding policies
> I think it would be beneficial if it could accept an erasure coding policy as 
> a parameter optionally. For example the following command would run the 
> verify for only the RS-6-3-1024k policy.
> {code:java}
> hdfs ec -verifyClusterSetup -policy RS-6-3-1024k
> {code}






[jira] [Updated] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter

2019-02-14 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14188:

Attachment: HDFS-14188.004.patch

> Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a 
> parameter
> ---
>
> Key: HDFS-14188
> URL: https://issues.apache.org/jira/browse/HDFS-14188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14188.001.patch, HDFS-14188.002.patch, 
> HDFS-14188.003.patch, HDFS-14188.004.patch
>
>
> hdfs ec -verifyClusterSetup command verifies if there are enough data nodes 
> and racks for the enabled erasure coding policies
> I think it would be beneficial if it could accept an erasure coding policy as 
> a parameter optionally. For example the following command would run the 
> verify for only the RS-6-3-1024k policy.
> {code:java}
> hdfs ec -verifyClusterSetup -policy RS-6-3-1024k
> {code}






[jira] [Commented] (HDFS-14125) Use parameterized log format in ECTopologyVerifier

2019-02-13 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767354#comment-16767354
 ] 

Kitti Nanasi commented on HDFS-14125:
-

Thanks [~shwetayakkali] for the review and [~jojochuang] for reviewing and 
committing!

> Use parameterized log format in ECTopologyVerifier
> --
>
> Key: HDFS-14125
> URL: https://issues.apache.org/jira/browse/HDFS-14125
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.3.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Trivial
> Fix For: 3.3.0
>
> Attachments: HDFS-14125.001.patch
>
>
> ECTopologyVerifier introduced in 
> [HDFS-12946|https://issues.apache.org/jira/browse/HDFS-12946] should use a 
> parameterized log format.
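For context on the convention being adopted: parameterized logging passes the message pattern and its arguments separately instead of concatenating strings eagerly, so the message is only assembled when the log level is enabled. The `format` helper below is a tiny stand-in for SLF4J's '{}' placeholder substitution, written only to demonstrate the mechanism, not the real implementation.

```java
public class ParamLogDemo {
    /** Replaces each '{}' in the pattern with the next argument, in order. */
    static String format(String pattern, Object... args) {
        StringBuilder sb = new StringBuilder();
        int argIdx = 0, from = 0, at;
        while ((at = pattern.indexOf("{}", from)) >= 0 && argIdx < args.length) {
            sb.append(pattern, from, at).append(args[argIdx++]);
            from = at + 2;
        }
        return sb.append(pattern.substring(from)).toString();
    }

    public static void main(String[] args) {
        // Instead of: LOG.info("Verified " + n + " racks for policy " + p);
        // the parameterized style defers formatting to the logger:
        System.out.println(
            format("Verified {} racks for policy {}", 3, "RS-6-3-1024k"));
        // -> Verified 3 racks for policy RS-6-3-1024k
    }
}
```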






[jira] [Commented] (HDFS-14231) DataXceiver#run() should not log exceptions caused by InvalidToken exception as an error

2019-02-13 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767338#comment-16767338
 ] 

Kitti Nanasi commented on HDFS-14231:
-

Thanks [~jojochuang] for reviewing and committing!

> DataXceiver#run() should not log exceptions caused by InvalidToken exception 
> as an error
> 
>
> Key: HDFS-14231
> URL: https://issues.apache.org/jira/browse/HDFS-14231
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14231.001.patch
>
>
> HDFS-10760 changed the log level from error to trace in DataXceiver#run() if 
> the exception was an InvalidToken exception. I think it would also be beneficial 
> to log at trace level if the exception's cause was an InvalidToken exception, 
> like in the following case:
> {code:java}
> DataXceiver error processing unknown operation 
> src: xxx dst: xxx
> javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password 
> [Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: Block 
> token with block_token_identifier
> (expiryDate=1547593336220, keyId=-1735471718, userId=hbase, 
> blockPoolId=BP-xxx, blockId=1245599303, access modes=[READ]) is expired.]
> {code}
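The proposed check can be sketched by walking the exception's cause chain. This is an illustration under stated assumptions: `InvalidToken` here is a local stand-in for `org.apache.hadoop.security.token.SecretManager$InvalidToken`, and the TRACE/ERROR selection mimics, not reproduces, the DataXceiver logic.

```java
public class CauseChainDemo {
    /** Local stand-in for SecretManager$InvalidToken. */
    static class InvalidToken extends Exception {
        InvalidToken(String msg) { super(msg); }
    }

    /** True if an InvalidToken appears anywhere in the cause chain. */
    static boolean causedByInvalidToken(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof InvalidToken) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A SaslException-like wrapper whose *cause* is the expired token.
        Exception sasl = new Exception(
            "DIGEST-MD5: IO error acquiring password",
            new InvalidToken("Block token is expired."));
        String level = causedByInvalidToken(sasl) ? "TRACE" : "ERROR";
        System.out.println(level);  // prints TRACE
    }
}
```

Checking the whole chain, rather than only the top-level exception type, is what lets the wrapped SaslException in the quoted log be demoted from error to trace.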






[jira] [Commented] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter

2019-02-13 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767210#comment-16767210
 ] 

Kitti Nanasi commented on HDFS-14188:
-

Thanks for the review [~ayushtkn]!
 * It's a good idea to save effort by running the verification for multiple 
policies at the same time; I implemented it in patch v003.
 * About the test: although merging the tests together would save some execution 
time, I think it would be less readable, and I would have to clear the system out 
inside the method multiple times instead of leaving it to the tearDown method, 
which would be a bit confusing. But I am not against merging them if you want to 
save on test execution time, so let me know your opinion on that.
 * Thanks for reminding me about the documentation; I extended it.

> Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a 
> parameter
> ---
>
> Key: HDFS-14188
> URL: https://issues.apache.org/jira/browse/HDFS-14188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14188.001.patch, HDFS-14188.002.patch, 
> HDFS-14188.003.patch
>
>
> hdfs ec -verifyClusterSetup command verifies if there are enough data nodes 
> and racks for the enabled erasure coding policies
> I think it would be beneficial if it could accept an erasure coding policy as 
> a parameter optionally. For example the following command would run the 
> verify for only the RS-6-3-1024k policy.
> {code:java}
> hdfs ec -verifyClusterSetup -policy RS-6-3-1024k
> {code}






[jira] [Updated] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter

2019-02-13 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14188:

Attachment: HDFS-14188.003.patch

> Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a 
> parameter
> ---
>
> Key: HDFS-14188
> URL: https://issues.apache.org/jira/browse/HDFS-14188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14188.001.patch, HDFS-14188.002.patch, 
> HDFS-14188.003.patch
>
>
> hdfs ec -verifyClusterSetup command verifies if there are enough data nodes 
> and racks for the enabled erasure coding policies
> I think it would be beneficial if it could accept an erasure coding policy as 
> a parameter optionally. For example the following command would run the 
> verify for only the RS-6-3-1024k policy.
> {code:java}
> hdfs ec -verifyClusterSetup -policy RS-6-3-1024k
> {code}






[jira] [Commented] (HDFS-14273) Fix checkstyle issues in BlockLocation's method javadoc

2019-02-13 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767064#comment-16767064
 ] 

Kitti Nanasi commented on HDFS-14273:
-

Thanks for the patch [~shwetayakkali]!

+1 (non-binding)

> Fix checkstyle issues in BlockLocation's method javadoc
> ---
>
> Key: HDFS-14273
> URL: https://issues.apache.org/jira/browse/HDFS-14273
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Shweta
>Assignee: Shweta
>Priority: Trivial
> Attachments: HDFS-14273.001.patch
>
>
> BlockLocation.java has checkstyle issues in most of its methods' javadoc and 
> an indentation error. 






[jira] [Commented] (HDFS-14187) Make warning message more clear when there are not enough data nodes for EC write

2019-01-31 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16757827#comment-16757827
 ] 

Kitti Nanasi commented on HDFS-14187:
-

Thanks for reviewing and committing it [~jojochuang]!

> Make warning message more clear when there are not enough data nodes for EC 
> write
> -
>
> Key: HDFS-14187
> URL: https://issues.apache.org/jira/browse/HDFS-14187
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14187.001.patch
>
>
> When setting an erasure coding policy for which there are not enough racks or 
> data nodes, write will fail with the following message:
> {code:java}
> [root@oks-upgrade6727-1 ~]# sudo -u systest hdfs dfs -mkdir 
> /user/systest/testdir
> [root@oks-upgrade6727-1 ~]# sudo -u hdfs hdfs ec -setPolicy -path 
> /user/systest/testdir
> Set default erasure coding policy on /user/systest/testdir
> [root@oks-upgrade6727-1 ~]# sudo -u systest hdfs dfs -put /tmp/file1 
> /user/systest/testdir
> 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Cannot allocate parity 
> block(index=3, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[]
> 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Cannot allocate parity 
> block(index=4, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[]
> 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Block group <1> failed to write 
> 2 blocks. It's at high risk of losing data.
> {code}
> I suggest logging a more descriptive message that recommends using the hdfs ec 
> -verifyClusterSetup command to verify the cluster setup against the EC policies.






[jira] [Commented] (HDFS-14125) Use parameterized log format in ECTopologyVerifier

2019-01-29 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755384#comment-16755384
 ] 

Kitti Nanasi commented on HDFS-14125:
-

Thanks for the comment [~shwetayakkali]!

I used String.format, which accepts %s and %d, instead of the logger's 
parameterized formatter, because I pass that same String in the result as well, 
so I can't use just the logger's formatter.
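To illustrate the point above: when the formatted message is both logged and returned as part of a result, the logger's "{}" placeholders alone are not enough, so String.format is used once and the pre-built string goes to both places. This is a minimal sketch of that pattern; the class and method names are illustrative, not the actual Hadoop code.

```java
// Hypothetical sketch: the same formatted message is both logged and returned,
// so String.format is used instead of the logger's "{}" placeholders.
public class VerificationResultExample {
    static String verify(int numDataNodes, int requiredDataNodes) {
        if (numDataNodes < requiredDataNodes) {
            // String.format, because the message is also part of the result
            String msg = String.format(
                "The cluster has %d data nodes, but %d are required.",
                numDataNodes, requiredDataNodes);
            // LOG.warn(msg);  // the logger receives the pre-formatted string
            return msg;
        }
        return "OK";
    }

    public static void main(String[] args) {
        System.out.println(verify(3, 9));
    }
}
```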

> Use parameterized log format in ECTopologyVerifier
> --
>
> Key: HDFS-14125
> URL: https://issues.apache.org/jira/browse/HDFS-14125
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.3.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Trivial
> Attachments: HDFS-14125.001.patch
>
>
> ECTopologyVerifier introduced in 
> [HDFS-12946|https://issues.apache.org/jira/browse/HDFS-12946] should use a 
> parameterized log format.






[jira] [Updated] (HDFS-14231) DataXceiver#run() should not log exceptions caused by InvalidToken exception as an error

2019-01-25 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14231:

Attachment: HDFS-14231.001.patch

> DataXceiver#run() should not log exceptions caused by InvalidToken exception 
> as an error
> 
>
> Key: HDFS-14231
> URL: https://issues.apache.org/jira/browse/HDFS-14231
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14231.001.patch
>
>
> HDFS-10760 changed the log level from error to trace in DataXceiver#run() if 
> the exception was an InvalidToken exception. I think it would also be 
> beneficial to log on trace level if the exception's cause was an InvalidToken 
> exception, as in the following case:
> {code:java}
> DataXceiver error processing unknown operation 
> src: xxx dst: xxx
> javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password 
> [Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: Block 
> token with block_token_identifier
> (expiryDate=1547593336220, keyId=-1735471718, userId=hbase, 
> blockPoolId=BP-xxx, blockId=1245599303, access modes=[READ]) is expired.]
> {code}






[jira] [Updated] (HDFS-14231) DataXceiver#run() should not log exceptions caused by InvalidToken exception as an error

2019-01-25 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14231:

Status: Patch Available  (was: Open)

> DataXceiver#run() should not log exceptions caused by InvalidToken exception 
> as an error
> 
>
> Key: HDFS-14231
> URL: https://issues.apache.org/jira/browse/HDFS-14231
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14231.001.patch
>
>
> HDFS-10760 changed the log level from error to trace in DataXceiver#run() if 
> the exception was an InvalidToken exception. I think it would also be 
> beneficial to log on trace level if the exception's cause was an InvalidToken 
> exception, as in the following case:
> {code:java}
> DataXceiver error processing unknown operation 
> src: xxx dst: xxx
> javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password 
> [Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: Block 
> token with block_token_identifier
> (expiryDate=1547593336220, keyId=-1735471718, userId=hbase, 
> blockPoolId=BP-xxx, blockId=1245599303, access modes=[READ]) is expired.]
> {code}






[jira] [Created] (HDFS-14231) DataXceiver#run() should not log exceptions caused by InvalidToken exception as an error

2019-01-25 Thread Kitti Nanasi (JIRA)
Kitti Nanasi created HDFS-14231:
---

 Summary: DataXceiver#run() should not log exceptions caused by 
InvalidToken exception as an error
 Key: HDFS-14231
 URL: https://issues.apache.org/jira/browse/HDFS-14231
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 3.1.1
Reporter: Kitti Nanasi
Assignee: Kitti Nanasi


HDFS-10760 changed the log level from error to trace in DataXceiver#run() if 
the exception was an InvalidToken exception. I think it would also be beneficial 
to log on trace level if the exception's cause was an InvalidToken exception, as 
in the following case:
{code:java}
DataXceiver error processing unknown operation 
src: xxx dst: xxx
javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password 
[Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: Block 
token with block_token_identifier
(expiryDate=1547593336220, keyId=-1735471718, userId=hbase, blockPoolId=BP-xxx, 
blockId=1245599303, access modes=[READ]) is expired.]
{code}
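The proposal above boils down to inspecting the exception's cause chain, not just the top-level exception, before choosing a log level. The following is a hedged sketch of that check; InvalidTokenException here stands in for org.apache.hadoop.security.token.SecretManager$InvalidToken, and the helper name is illustrative rather than the actual DataXceiver code.

```java
// Sketch of the proposed behavior: walk the exception's cause chain and, if an
// invalid-token exception is found anywhere, log at TRACE instead of ERROR.
public class CauseChainExample {
    // Stand-in for SecretManager$InvalidToken (assumption, not the real class).
    static class InvalidTokenException extends Exception {
        InvalidTokenException(String m) { super(m); }
    }

    static boolean causedByInvalidToken(Throwable t) {
        // Follow getCause() links until the chain ends.
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof InvalidTokenException) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Exception sasl = new Exception(
            "DIGEST-MD5: IO error acquiring password",
            new InvalidTokenException("Block token is expired"));
        // true -> the caller would log at trace level, not error
        System.out.println(causedByInvalidToken(sasl));
    }
}
```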






[jira] [Updated] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter

2019-01-24 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14188:

Attachment: HDFS-14188.002.patch

> Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a 
> parameter
> ---
>
> Key: HDFS-14188
> URL: https://issues.apache.org/jira/browse/HDFS-14188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14188.001.patch, HDFS-14188.002.patch
>
>
> The hdfs ec -verifyClusterSetup command verifies whether there are enough data 
> nodes and racks for the enabled erasure coding policies.
> I think it would be beneficial if it could optionally accept an erasure coding 
> policy as a parameter. For example, the following command would run the 
> verification for only the RS-6-3-1024k policy.
> {code:java}
> hdfs ec -verifyClusterSetup -policy RS-6-3-1024k
> {code}






[jira] [Updated] (HDFS-14125) Use parameterized log format in ECTopologyVerifier

2019-01-24 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14125:

Attachment: HDFS-14125.001.patch

> Use parameterized log format in ECTopologyVerifier
> --
>
> Key: HDFS-14125
> URL: https://issues.apache.org/jira/browse/HDFS-14125
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.3.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Trivial
> Attachments: HDFS-14125.001.patch
>
>
> ECTopologyVerifier introduced in 
> [HDFS-12946|https://issues.apache.org/jira/browse/HDFS-12946] should use a 
> parameterized log format.






[jira] [Updated] (HDFS-14125) Use parameterized log format in ECTopologyVerifier

2019-01-24 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14125:

Status: Patch Available  (was: Open)

> Use parameterized log format in ECTopologyVerifier
> --
>
> Key: HDFS-14125
> URL: https://issues.apache.org/jira/browse/HDFS-14125
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.3.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Trivial
> Attachments: HDFS-14125.001.patch
>
>
> ECTopologyVerifier introduced in 
> [HDFS-12946|https://issues.apache.org/jira/browse/HDFS-12946] should use a 
> parameterized log format.






[jira] [Updated] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter

2019-01-24 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14188:

Status: Patch Available  (was: Open)

> Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a 
> parameter
> ---
>
> Key: HDFS-14188
> URL: https://issues.apache.org/jira/browse/HDFS-14188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14188.001.patch
>
>
> The hdfs ec -verifyClusterSetup command verifies whether there are enough data 
> nodes and racks for the enabled erasure coding policies.
> I think it would be beneficial if it could optionally accept an erasure coding 
> policy as a parameter. For example, the following command would run the 
> verification for only the RS-6-3-1024k policy.
> {code:java}
> hdfs ec -verifyClusterSetup -policy RS-6-3-1024k
> {code}






[jira] [Updated] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter

2019-01-24 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14188:

Attachment: HDFS-14188.001.patch

> Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a 
> parameter
> ---
>
> Key: HDFS-14188
> URL: https://issues.apache.org/jira/browse/HDFS-14188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14188.001.patch
>
>
> The hdfs ec -verifyClusterSetup command verifies whether there are enough data 
> nodes and racks for the enabled erasure coding policies.
> I think it would be beneficial if it could optionally accept an erasure coding 
> policy as a parameter. For example, the following command would run the 
> verification for only the RS-6-3-1024k policy.
> {code:java}
> hdfs ec -verifyClusterSetup -policy RS-6-3-1024k
> {code}






[jira] [Updated] (HDFS-14187) Make warning message more clear when there are not enough data nodes for EC write

2019-01-24 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14187:

Status: Patch Available  (was: Open)

> Make warning message more clear when there are not enough data nodes for EC 
> write
> -
>
> Key: HDFS-14187
> URL: https://issues.apache.org/jira/browse/HDFS-14187
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14187.001.patch
>
>
> When setting an erasure coding policy for which there are not enough racks or 
> data nodes, write will fail with the following message:
> {code:java}
> [root@oks-upgrade6727-1 ~]# sudo -u systest hdfs dfs -mkdir 
> /user/systest/testdir
> [root@oks-upgrade6727-1 ~]# sudo -u hdfs hdfs ec -setPolicy -path 
> /user/systest/testdir
> Set default erasure coding policy on /user/systest/testdir
> [root@oks-upgrade6727-1 ~]# sudo -u systest hdfs dfs -put /tmp/file1 
> /user/systest/testdir
> 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Cannot allocate parity 
> block(index=3, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[]
> 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Cannot allocate parity 
> block(index=4, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[]
> 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Block group <1> failed to write 
> 2 blocks. It's at high risk of losing data.
> {code}
> I suggest logging a more descriptive message that recommends using the hdfs ec 
> -verifyClusterSetup command to verify the cluster setup against the EC policies.






[jira] [Updated] (HDFS-14187) Make warning message more clear when there are not enough data nodes for EC write

2019-01-24 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14187:

Attachment: HDFS-14187.001.patch

> Make warning message more clear when there are not enough data nodes for EC 
> write
> -
>
> Key: HDFS-14187
> URL: https://issues.apache.org/jira/browse/HDFS-14187
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14187.001.patch
>
>
> When setting an erasure coding policy for which there are not enough racks or 
> data nodes, write will fail with the following message:
> {code:java}
> [root@oks-upgrade6727-1 ~]# sudo -u systest hdfs dfs -mkdir 
> /user/systest/testdir
> [root@oks-upgrade6727-1 ~]# sudo -u hdfs hdfs ec -setPolicy -path 
> /user/systest/testdir
> Set default erasure coding policy on /user/systest/testdir
> [root@oks-upgrade6727-1 ~]# sudo -u systest hdfs dfs -put /tmp/file1 
> /user/systest/testdir
> 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Cannot allocate parity 
> block(index=3, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[]
> 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Cannot allocate parity 
> block(index=4, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[]
> 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Block group <1> failed to write 
> 2 blocks. It's at high risk of losing data.
> {code}
> I suggest logging a more descriptive message that recommends using the hdfs ec 
> -verifyClusterSetup command to verify the cluster setup against the EC policies.






[jira] [Commented] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it

2019-01-14 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16741918#comment-16741918
 ] 

Kitti Nanasi commented on HDFS-14061:
-

Thanks for the review [~jojochuang]!

I will clean up the log messages in HDFS-14125, so for now I will leave them as 
is. I addressed your comments in the newest patch.

About using all DataNodes in the verification: in HDFS-12946 we did not 
consider using only a subset of the DataNodes. That would make the verification 
a bit stricter, and I am fine with that as well, though in my opinion it would 
not make much difference. Let me know if you have any suggestions on that.

> Check if the cluster topology supports the EC policy before setting, enabling 
> or adding it
> --
>
> Key: HDFS-14061
> URL: https://issues.apache.org/jira/browse/HDFS-14061
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, hdfs
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14061.001.patch, HDFS-14061.002.patch, 
> HDFS-14061.003.patch, HDFS-14061.004.patch, HDFS-14061.005.patch
>
>
> HDFS-12946 introduces a command for verifying if there are enough racks and 
> datanodes for the enabled erasure coding policies.
> This verification could be executed for the erasure coding policy before 
> enabling, setting, or adding it; a warning message could be written if the 
> verification fails, or the policy setting could even be failed in this case.






[jira] [Updated] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it

2019-01-14 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14061:

Attachment: HDFS-14061.005.patch

> Check if the cluster topology supports the EC policy before setting, enabling 
> or adding it
> --
>
> Key: HDFS-14061
> URL: https://issues.apache.org/jira/browse/HDFS-14061
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, hdfs
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14061.001.patch, HDFS-14061.002.patch, 
> HDFS-14061.003.patch, HDFS-14061.004.patch, HDFS-14061.005.patch
>
>
> HDFS-12946 introduces a command for verifying if there are enough racks and 
> datanodes for the enabled erasure coding policies.
> This verification could be executed for the erasure coding policy before 
> enabling, setting, or adding it; a warning message could be written if the 
> verification fails, or the policy setting could even be failed in this case.






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2019-01-11 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740232#comment-16740232
 ] 

Kitti Nanasi commented on HDFS-14134:
-

Thanks [~lukmajercak] for the work here!
{quote}Also note that previously, if a hedging request got FAILOVER_RETRY and 
some request got SocketExc on nonidempotent operation (e.g. FAIL), the client 
would still pick FAILOVER_RETRY over FAIL, so i think we are fixing an issue 
here as well.
{quote}
It is great that you found and fixed this issue as well.

+1 (non-binding) The new patch looks good to me!

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134.006.patch, HDFS-14134.007.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf, 
> HDFS-14134_retrypolicy_change_proposal_1.pdf
>
>
> Currently, some operations that throw IOException on the NameNode are 
> evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail 
> fast.
> For example, when calling getXAttr("user.some_attr", file") where the file 
> does not have the attribute, NN throws an IOException with message "could not 
> find attr". The current client retry policy determines the action for that to 
> be FAILOVER_AND_RETRY. The client then fails over and retries until it 
> reaches the maximum number of retries. Ideally, the client should be able 
> to tell that this exception is normal and fail fast. 
> Moreover, even if the action was FAIL, the RetryInvocationHandler looks at 
> all the retry actions from all requests, and FAILOVER_AND_RETRY takes 
> precedence over FAIL action.
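The precedence problem described above can be sketched as a small decision rule: when several in-flight requests produce different retry actions, FAIL should win over FAILOVER_AND_RETRY so that a "normal" server-side exception fails fast. The enum and method names below are illustrative only, not Hadoop's actual RetryPolicy or RetryInvocationHandler API.

```java
// Hedged sketch of the desired action-combining rule: FAIL takes precedence,
// so the client does not keep failing over on an exception it should surface.
public class RetryDecisionExample {
    enum Action { RETRY, FAILOVER_AND_RETRY, FAIL }

    static Action combine(java.util.List<Action> actions) {
        if (actions.contains(Action.FAIL)) {
            return Action.FAIL;          // fail fast, do not keep failing over
        }
        if (actions.contains(Action.FAILOVER_AND_RETRY)) {
            return Action.FAILOVER_AND_RETRY;
        }
        return Action.RETRY;
    }

    public static void main(String[] args) {
        System.out.println(combine(java.util.Arrays.asList(
            Action.FAILOVER_AND_RETRY, Action.FAIL)));
    }
}
```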






[jira] [Commented] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it

2019-01-09 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738031#comment-16738031
 ] 

Kitti Nanasi commented on HDFS-14061:
-

Thanks [~shwetayakkali] for the comment! I added messages to the asserts in 
patch 004.

> Check if the cluster topology supports the EC policy before setting, enabling 
> or adding it
> --
>
> Key: HDFS-14061
> URL: https://issues.apache.org/jira/browse/HDFS-14061
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, hdfs
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14061.001.patch, HDFS-14061.002.patch, 
> HDFS-14061.003.patch, HDFS-14061.004.patch
>
>
> HDFS-12946 introduces a command for verifying if there are enough racks and 
> datanodes for the enabled erasure coding policies.
> This verification could be executed for the erasure coding policy before 
> enabling, setting, or adding it; a warning message could be written if the 
> verification fails, or the policy setting could even be failed in this case.






[jira] [Updated] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it

2019-01-09 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14061:

Attachment: HDFS-14061.004.patch

> Check if the cluster topology supports the EC policy before setting, enabling 
> or adding it
> --
>
> Key: HDFS-14061
> URL: https://issues.apache.org/jira/browse/HDFS-14061
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, hdfs
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14061.001.patch, HDFS-14061.002.patch, 
> HDFS-14061.003.patch, HDFS-14061.004.patch
>
>
> HDFS-12946 introduces a command for verifying if there are enough racks and 
> datanodes for the enabled erasure coding policies.
> This verification could be executed for the erasure coding policy before 
> enabling, setting, or adding it; a warning message could be written if the 
> verification fails, or the policy setting could even be failed in this case.






[jira] [Commented] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it

2019-01-08 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737125#comment-16737125
 ] 

Kitti Nanasi commented on HDFS-14061:
-

Thanks for the comment [~adam.antal]!

I fixed the renames and added a new test.

The new message is already tested in TestECAdmin, so I don't think it would add 
more value to also test it in TestErasureCodingCLI, and the number of racks and 
data nodes is more difficult to configure there.

You are right, System.err in TestECAdmin.java was only used by patch v001, but 
I think it is worth keeping that check, because ECAdmin can write to System.err.

> Check if the cluster topology supports the EC policy before setting, enabling 
> or adding it
> --
>
> Key: HDFS-14061
> URL: https://issues.apache.org/jira/browse/HDFS-14061
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, hdfs
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14061.001.patch, HDFS-14061.002.patch, 
> HDFS-14061.003.patch
>
>
> HDFS-12946 introduces a command for verifying if there are enough racks and 
> datanodes for the enabled erasure coding policies.
> This verification could be executed for the erasure coding policy before 
> enabling, setting, or adding it; a warning message could be written if the 
> verification fails, or the policy setting could even be failed in this case.






[jira] [Updated] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it

2019-01-08 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14061:

Attachment: HDFS-14061.003.patch

> Check if the cluster topology supports the EC policy before setting, enabling 
> or adding it
> --
>
> Key: HDFS-14061
> URL: https://issues.apache.org/jira/browse/HDFS-14061
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, hdfs
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14061.001.patch, HDFS-14061.002.patch, 
> HDFS-14061.003.patch
>
>
> HDFS-12946 introduces a command for verifying if there are enough racks and 
> datanodes for the enabled erasure coding policies.
> This verification could be executed for the erasure coding policy before 
> enabling, setting, or adding it; a warning message could be written if the 
> verification fails, or the policy setting could even be failed in this case.






[jira] [Created] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter

2019-01-07 Thread Kitti Nanasi (JIRA)
Kitti Nanasi created HDFS-14188:
---

 Summary: Make hdfs ec -verifyClusterSetup command accept an 
erasure coding policy as a parameter
 Key: HDFS-14188
 URL: https://issues.apache.org/jira/browse/HDFS-14188
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.1.1
Reporter: Kitti Nanasi
Assignee: Kitti Nanasi


The hdfs ec -verifyClusterSetup command verifies whether there are enough data 
nodes and racks for the enabled erasure coding policies.

I think it would be beneficial if it could optionally accept an erasure coding 
policy as a parameter. For example, the following command would run the 
verification for only the RS-6-3-1024k policy.
{code:java}
hdfs ec -verifyClusterSetup -policy RS-6-3-1024k
{code}
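For a single policy, the kind of check -verifyClusterSetup performs can be sketched as follows: an RS(d, p) policy stripes each block group across d + p distinct DataNodes, so the cluster needs at least that many nodes, and enough racks for rack-level fault tolerance. This is an illustrative sketch, not the actual ECTopologyVerifier code; in particular the rack bound used here is a simplifying assumption.

```java
// Illustrative check of whether the cluster topology can support one EC policy.
public class EcPolicyCheckExample {
    static String verifyPolicy(String name, int dataUnits, int parityUnits,
                               int numDataNodes, int numRacks) {
        int required = dataUnits + parityUnits;  // one node per block in a group
        if (numDataNodes < required) {
            return String.format("%s is not supported: %d data nodes present, "
                + "%d required.", name, numDataNodes, required);
        }
        // Assumed rack bound for the sketch: tolerate losing any single rack.
        if (numRacks < parityUnits + 1) {
            return String.format("%s may lose rack fault tolerance: only %d "
                + "racks present.", name, numRacks);
        }
        return name + " is supported by the cluster setup.";
    }

    public static void main(String[] args) {
        System.out.println(verifyPolicy("RS-6-3-1024k", 6, 3, 9, 4));
    }
}
```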






[jira] [Commented] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it

2019-01-07 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735969#comment-16735969
 ] 

Kitti Nanasi commented on HDFS-14061:
-

Thanks for the comment [~ayushtkn]!

You are right, failing the policy setting might be too harsh. The reason I 
wanted to do that is that when setting the policy and then writing to the 
folder, the error message we get is quite misleading. But I think if that 
message is corrected, it will be fine, so I raised 
[HDFS-14187|https://issues.apache.org/jira/browse/HDFS-14187] for that.

And in patch 002 I fixed your comments.

> Check if the cluster topology supports the EC policy before setting, enabling 
> or adding it
> --
>
> Key: HDFS-14061
> URL: https://issues.apache.org/jira/browse/HDFS-14061
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, hdfs
>Affects Versions: 3.1.1
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-14061.001.patch, HDFS-14061.002.patch
>
>
> HDFS-12946 introduces a command for verifying if there are enough racks and 
> datanodes for the enabled erasure coding policies.
> This verification could be executed for the erasure coding policy before 
> enabling, setting, or adding it; a warning message could be written if the 
> verification fails, or the policy setting could even be failed in this case.






[jira] [Updated] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it

2019-01-07 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14061:

Status: Patch Available  (was: Open)







[jira] [Updated] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it

2019-01-07 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14061:

Attachment: HDFS-14061.002.patch







[jira] [Created] (HDFS-14187) Make warning message more clear when there are not enough data nodes for EC write

2019-01-07 Thread Kitti Nanasi (JIRA)
Kitti Nanasi created HDFS-14187:
---

 Summary: Make warning message more clear when there are not enough 
data nodes for EC write
 Key: HDFS-14187
 URL: https://issues.apache.org/jira/browse/HDFS-14187
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.1.1
Reporter: Kitti Nanasi
Assignee: Kitti Nanasi


When an erasure coding policy is set for which there are not enough racks or 
data nodes, writes will fail with the following message:
{code:java}
[root@oks-upgrade6727-1 ~]# sudo -u systest hdfs dfs -mkdir 
/user/systest/testdir
[root@oks-upgrade6727-1 ~]# sudo -u hdfs hdfs ec -setPolicy -path 
/user/systest/testdir
Set default erasure coding policy on /user/systest/testdir
[root@oks-upgrade6727-1 ~]# sudo -u systest hdfs dfs -put /tmp/file1 
/user/systest/testdir
18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Cannot allocate parity 
block(index=3, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[]
18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Cannot allocate parity 
block(index=4, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[]
18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Block group <1> failed to write 2 
blocks. It's at high risk of losing data.
{code}
I suggest logging a more descriptive message that recommends running the hdfs 
ec -verifyClusterSetup command to verify the cluster setup against the EC 
policies.






[jira] [Commented] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it

2019-01-04 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734313#comment-16734313
 ] 

Kitti Nanasi commented on HDFS-14061:
-

Patch 001 contains:
 * A warning is shown when enabling a policy if the cluster topology 
verification fails for all enabled policies.
 * Policy setting fails if the cluster topology verification fails for the 
policy.

There could be one concern with the second one: if we want to set the RS(6,3) 
policy, the verification will only succeed if there are at least 9 data nodes, 
which may be a bit strict. Should we allow setting the policy with fewer than 9 
data nodes? I would like to hear some opinions about that.
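The 9-node requirement mentioned above falls directly out of the Reed-Solomon layout: an RS(d, p) block group has d data blocks plus p parity blocks, and each block of a group should land on a distinct datanode. A minimal sketch of that arithmetic (EcTopologyCheck and minDataNodes are illustrative names, not the actual Hadoop ErasureCodingPolicy API):

```java
// Illustrative sketch of the minimum-datanode check discussed above;
// not the real Hadoop topology verification code.
public class EcTopologyCheck {

    // A Reed-Solomon(data, parity) block group needs every one of its
    // data + parity blocks on a distinct datanode.
    static int minDataNodes(int dataUnits, int parityUnits) {
        return dataUnits + parityUnits;
    }

    public static void main(String[] args) {
        // RS(6,3): the 9-node case raised in the comment above.
        System.out.println(minDataNodes(6, 3));
        // RS(3,2) needs 5 datanodes.
        System.out.println(minDataNodes(3, 2));
    }
}
```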







[jira] [Updated] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it

2019-01-04 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-14061:

Attachment: HDFS-14061.001.patch







[jira] [Commented] (HDFS-13965) hadoop.security.kerberos.ticket.cache.path setting is not honored when KMS encryption is enabled.

2018-12-21 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16726756#comment-16726756
 ] 

Kitti Nanasi commented on HDFS-13965:
-

[~lokeskumarp], I think it is possible to fix, but it is not a trivial change, 
so until it is fixed you can work around this problem by setting the KRB5CCNAME 
environment variable to the path of the ticket cache.

> hadoop.security.kerberos.ticket.cache.path setting is not honored when KMS 
> encryption is enabled.
> -
>
> Key: HDFS-13965
> URL: https://issues.apache.org/jira/browse/HDFS-13965
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, kms
>Affects Versions: 2.7.3, 2.7.7
>Reporter: LOKESKUMAR VIJAYAKUMAR
>Assignee: Kitti Nanasi
>Priority: Major
>
> _We use the *+hadoop.security.kerberos.ticket.cache.path+* setting to provide 
> a custom kerberos cache path for all hadoop operations to be run as specified 
> user. But this setting is not honored when KMS encryption is enabled._
> _The below program to read a file works when KMS encryption is not enabled, 
> but it fails when the KMS encryption is enabled._
> _Looks like *hadoop.security.kerberos.ticket.cache.path* setting is not 
> honored by *createConnection on KMSClientProvider.java.*_
>  
> HadoopTest.java (CLASSPATH needs to be set to compile and run)
>  
> import java.io.InputStream;
> import java.net.URI;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>  
> public class HadoopTest {
>     public static int runRead(String[] args) throws Exception{
>     if (args.length < 3) {
>     System.err.println("HadoopTest hadoop_file_path 
> hadoop_user kerberos_cache");
>     return 1;
>     }
>     Path inputPath = new Path(args[0]);
>     Configuration conf = new Configuration();
>     URI defaultURI = FileSystem.getDefaultUri(conf);
>     
> conf.set("hadoop.security.kerberos.ticket.cache.path",args[2]);
>     FileSystem fs = 
> FileSystem.newInstance(defaultURI,conf,args[1]);
>     InputStream is = fs.open(inputPath);
>     byte[] buffer = new byte[4096];
>     int nr = is.read(buffer);
>     while (nr != -1)
>     {
>     System.out.write(buffer, 0, nr);
>     nr = is.read(buffer);
>     }
>     return 0;
>     }
>     public static void main( String[] args ) throws Exception {
>     int returnCode = HadoopTest.runRead(args);
>     System.exit(returnCode);
>     }
> }
>  
>  
>  
> [root@lstrost3 testhadoop]# pwd
> /testhadoop
>  
> [root@lstrost3 testhadoop]# ls
> HadoopTest.java
>  
> [root@lstrost3 testhadoop]# export CLASSPATH=`hadoop classpath --glob`:.
>  
> [root@lstrost3 testhadoop]# javac HadoopTest.java
>  
> [root@lstrost3 testhadoop]# java HadoopTest
> HadoopTest  hadoop_file_path  hadoop_user  kerberos_cache
>  
> [root@lstrost3 testhadoop]# java HadoopTest /loki/loki.file loki 
> /tmp/krb5cc_1006
> 18/09/27 23:23:20 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 18/09/27 23:23:21 WARN shortcircuit.DomainSocketFactory: The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> Exception in thread "main" java.io.IOException: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> GSSException: *{color:#FF}No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt){color}*
>     at 
> {color:#FF}*org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:551)*{color}
>     at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:831)
>     at 
> org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388)
>     at 
> org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1393)
>     at 
> org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:1463)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:333)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:327)
>     at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:340)
>     at 

[jira] [Commented] (HDFS-13965) hadoop.security.kerberos.ticket.cache.path setting is not honored when KMS encryption is enabled.

2018-12-17 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723104#comment-16723104
 ] 

Kitti Nanasi commented on HDFS-13965:
-

[~jojochuang], you are correct, the problem is that KerberosConfiguration does 
not use the ticket cache set in the configuration.

A workaround for this is to set the "KRB5CCNAME" environment variable to the 
ticket cache path. However, the root user using another user's ticket cache to 
read that user's encryption zone does not seem like a usual scenario to me. You 
might want to consider running your script in an Oozie workflow, which can run 
it on behalf of the other user using delegation tokens. [~lokeskumarp], let me 
know if you have questions.


[jira] [Commented] (HDFS-14132) Add BlockLocation.isStriped() to determine if block is replicated or Striped

2018-12-14 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16721083#comment-16721083
 ] 

Kitti Nanasi commented on HDFS-14132:
-

Thanks [~shwetayakkali] for the patch! Looks good to me.

+1 (non-binding)

> Add BlockLocation.isStriped() to determine if block is replicated or Striped
> 
>
> Key: HDFS-14132
> URL: https://issues.apache.org/jira/browse/HDFS-14132
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Shweta
>Assignee: Shweta
>Priority: Major
> Attachments: HDFS-14132.001.patch
>
>
> Impala uses FileSystem#getBlockLocation to get block locations. We can add 
> an isStriped() method to make it easier to determine whether a block belongs 
> to a replicated file or a striped file.
> In HDFS, this isStriped information is already available in 
> HdfsBlockLocation#LocatedBlock#isStriped(), so adding this method to 
> BlockLocation does not introduce space overhead.
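A rough sketch of how a client such as Impala might consume the proposed flag. BlockLocationSketch below is a minimal stand-in class so the snippet runs without a Hadoop cluster; only the isStriped() accessor mirrors what the patch would add to org.apache.hadoop.fs.BlockLocation:

```java
// Stand-in for the proposed BlockLocation accessor (hypothetical class name;
// only isStriped() mirrors the proposed API).
class BlockLocationSketch {
    private final boolean striped;

    BlockLocationSketch(boolean striped) {
        this.striped = striped;
    }

    // Proposed accessor: true if the block belongs to an EC (striped) file.
    boolean isStriped() {
        return striped;
    }
}

public class IsStripedDemo {
    public static void main(String[] args) {
        // A caller could branch per block location without going through
        // HdfsBlockLocation#LocatedBlock.
        BlockLocationSketch replicated = new BlockLocationSketch(false);
        BlockLocationSketch ecBlock = new BlockLocationSketch(true);
        System.out.println(replicated.isStriped() ? "striped" : "replicated");
        System.out.println(ecBlock.isStriped() ? "striped" : "replicated");
    }
}
```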






[jira] [Comment Edited] (HDFS-14132) Add BlockLocation.isStriped() to determine if block is replicated or Striped

2018-12-14 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16721083#comment-16721083
 ] 

Kitti Nanasi edited comment on HDFS-14132 at 12/14/18 9:12 AM:
---

Thanks [~shwetayakkali] for the patch! Looks good to me and the test failures 
do not seem related.

+1 (non-binding)


was (Author: knanasi):
Thanks [~shwetayakkali] for the patch! Looks good to me.

+1 (non-binding)







[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-13 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720572#comment-16720572
 ] 

Kitti Nanasi commented on HDFS-14134:
-

The relevant part is the following:
{quote}in FailoverOnNetworkExceptionRetry#shouldRetry we don't fail-over and 
retry if we're making a non-idempotent call and there's an IOException or 
SocketException that's not Connect, NoRouteToHost, UnknownHost, or Standby. The 
rationale of course is that the operation may have reached the server and 
retrying elsewhere could leave us in an inconsistent state. This means if a 
client doing a create/delete which gets a SocketTimeoutException (which is an 
IOE) or an EOF SocketException the exception will be thrown all the way up to 
the caller of FileSystem/FileContext. That's reasonable because only the user 
of the API at this level has sufficient knowledge of how to handle the failure, 
eg if they get such an exception after issuing a delete they can check if the 
file still exists and if so re-issue the delete (however they may also not want 
to do this, and FileContext doesn't know which).
{quote}

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf
>
>
> Currently, some operations that throw IOException on the NameNode are 
> evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail 
> fast.
> For example, when calling getXAttr("user.some_attr", file") where the file 
> does not have the attribute, NN throws an IOException with message "could not 
> find attr". The current client retry policy determines the action for that to 
> be FAILOVER_AND_RETRY. The client then fails over and retries until it 
> reaches the maximum number of retries. Supposedly, the client should be able 
> to tell that this exception is normal and fail fast. 
> Moreover, even if the action was FAIL, the RetryInvocationHandler looks at 
> all the retry actions from all requests, and FAILOVER_AND_RETRY takes 
> precedence over FAIL action.






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-13 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720563#comment-16720563
 ] 

Kitti Nanasi commented on HDFS-14134:
-

Yes, this change covers that; I just wanted to understand why you changed it 
like that, but we're pretty much on the same page now.

I have only one concern, which is the case of non-remote IOExceptions on 
non-idempotent operations: I'm not sure whether retrying those will cause any 
problems. For reference, there is a discussion on 
[HADOOP-7380|https://issues.apache.org/jira/browse/HADOOP-7380] about why that 
behavior was introduced.

Other than that patch v5 looks good.







[jira] [Commented] (HDFS-14121) Log message about the old hosts file format is misleading

2018-12-13 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720266#comment-16720266
 ] 

Kitti Nanasi commented on HDFS-14121:
-

Thanks [~zvenczel] for the new patch! It looks good to me.

+1 (non-binding)

> Log message about the old hosts file format is misleading
> -
>
> Key: HDFS-14121
> URL: https://issues.apache.org/jira/browse/HDFS-14121
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Daniel Templeton
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-14121.01.patch, HDFS-14121.02.patch
>
>
> In {{CombinedHostsFileReader.readFile()}} we have the following:
> {code}  LOG.warn("{} has invalid JSON format." +
>   "Try the old format without top-level token defined.", 
> hostsFile);{code}
> That message is trying to say that we tried parsing the hosts file as a 
> well-formed JSON file and failed, so we're going to try again assuming that 
> it's in the old badly-formed format.  What it actually says is that the hosts 
> fie is bad, and the admin should try switching to the old format.  Those are 
> two very different things.
> While were in there, we should refactor the logging so that instead of 
> reporting that we're going to try using a different parser (who the heck 
> cares?), we report that the we had to use the old parser to successfully 
> parse the hosts file.






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-13 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720204#comment-16720204
 ] 

Kitti Nanasi commented on HDFS-14134:
-

I totally agree with you that retrying getXAttr on an "attr could not find" 
IOException is wasteful, and that we need a better approach than the current 
one.

But we also have to keep in mind that the FailoverOnNetworkExceptionRetry 
policy is used by many parts of the code, so it is a bit risky to change. I 
think the idea behind the previous design is that non-remote IOExceptions may 
be network-related exceptions, so it is worth retrying them if the operation is 
idempotent.







[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-12 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718752#comment-16718752
 ] 

Kitti Nanasi commented on HDFS-14134:
-

[~lukmajercak], you are correct on the definition of idempotency. I think the 
original approach to retrying was that idempotent operations don't change 
internal state, so it is safe to retry them. For example, if you just get a 
value, it is always safe to retry; but if you renew a delegation token, whether 
it is safe to retry is a more complex question, because the renewal may or may 
not have already taken place before the failure, and if it did, is it safe to 
renew it again?
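The distinction above can be sketched with a toy example (hypothetical names; this is not Hadoop's RetryPolicy API): a retried read returns the same result, while a retried renewal applies its side effect a second time.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of idempotent vs. non-idempotent retries.
public class IdempotencyDemo {
    static final Map<String, String> store = new HashMap<>();
    static int renewals = 0;

    // Idempotent: repeating the call leaves state and result unchanged.
    static String getValue(String key) {
        return store.get(key);
    }

    // Non-idempotent: every retry extends the token again, so retrying a
    // request that already reached the server double-applies the renewal.
    static int renewToken() {
        return ++renewals;
    }

    public static void main(String[] args) {
        store.put("k", "v");
        // Repeating a get is always safe.
        System.out.println(getValue("k").equals(getValue("k")));
        renewToken();
        renewToken(); // a "retry" that changes state a second time
        System.out.println(renewals);
    }
}
```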

 

By the way, idempotency was originally only considered in the case of 
non-remote IOExceptions; why did that change?







[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-11 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716690#comment-16716690
 ] 

Kitti Nanasi commented on HDFS-14134:
-

Thanks for the new patch [~lukmajercak]!

It looks better regarding retrying on non-remote IOExceptions, but there is one 
thing I don't understand, which I think is wrong in the pdf as well. In the 
case of a remote IOException, we should retry if the operation is idempotent, 
not the opposite. 
 So instead of this code:
{code:java}
else if (e instanceof IOException) {
  if (e instanceof RemoteException && isIdempotentOrAtMostOnce) {
    return new RetryAction(RetryAction.RetryDecision.FAIL, 0,
        "Remote exception and the invoked method is idempotent " +
        "or at most once.");
  }
  return new RetryAction(RetryAction.RetryDecision.FAILOVER_AND_RETRY,
      getFailoverOrRetrySleepTime(failovers));
}
{code}
I think it should look like this:
{code:java}
else if (e instanceof IOException) {
  if (e instanceof RemoteException && !isIdempotentOrAtMostOnce) {
    return new RetryAction(RetryAction.RetryDecision.FAIL, 0,
        "Remote exception and the invoked method is neither idempotent " +
        "nor at most once.");
  }
  return new RetryAction(RetryAction.RetryDecision.FAILOVER_AND_RETRY,
      getFailoverOrRetrySleepTime(failovers));
}
{code}
What do you think [~lukmajercak]?
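To make the intended decision table concrete, here is a minimal, self-contained sketch of the corrected branch. Decision and the boolean parameters below are stand-ins for Hadoop's RetryAction/RemoteException machinery, not the real classes, and the outer instanceof-IOException check is elided:

{code:java}
public class RetryDecisionSketch {
    enum Decision { FAIL, FAILOVER_AND_RETRY }

    // Mirrors the corrected condition: only a *remote* exception from a
    // non-idempotent method should fail fast; everything else fails over.
    static Decision decide(boolean isRemote, boolean isIdempotentOrAtMostOnce) {
        if (isRemote && !isIdempotentOrAtMostOnce) {
            return Decision.FAIL;
        }
        return Decision.FAILOVER_AND_RETRY;
    }

    public static void main(String[] args) {
        // Idempotent remote call: safe to fail over and retry.
        System.out.println(decide(true, true));   // FAILOVER_AND_RETRY
        // Non-idempotent remote call: fail fast.
        System.out.println(decide(true, false));  // FAIL
        // Non-remote IOException: fail over and retry regardless.
        System.out.println(decide(false, false)); // FAILOVER_AND_RETRY
    }
}
{code}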
  

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf
>
>
> Currently, some operations that throw IOException on the NameNode are 
> evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail 
> fast.
> For example, when calling getXAttr("user.some_attr", "file") where the file 
> does not have the attribute, the NN throws an IOException with the message 
> "could not find attr". The current client retry policy determines the action 
> for that to be FAILOVER_AND_RETRY. The client then fails over and retries 
> until it reaches the maximum number of retries. Ideally, the client should be 
> able to tell that this exception is normal and fail fast. 
> Moreover, even if the action was FAIL, the RetryInvocationHandler looks at 
> all the retry actions from all requests, and FAILOVER_AND_RETRY takes 
> precedence over FAIL action.






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-10 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714464#comment-16714464
 ] 

Kitti Nanasi commented on HDFS-14134:
-

Thanks [~lukmajercak] for the patch!

The proposed solution in the pdf seems good to me, but looking at the code, the 
retry does not happen on non-remote IOExceptions at all, which is not the same 
behaviour as described in the pdf. Also 
TestLoadBalancingKMSClientProvider#testClientRetriesIdempotentOpWithIOExceptionSucceedsSecondTime
 fails because of that.

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf
>
>
> Currently, some operations that throw IOException on the NameNode are 
> evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail 
> fast.
> For example, when calling getXAttr("user.some_attr", "file") where the file 
> does not have the attribute, the NN throws an IOException with the message 
> "could not find attr". The current client retry policy determines the action 
> for that to be FAILOVER_AND_RETRY. The client then fails over and retries 
> until it reaches the maximum number of retries. Ideally, the client should be 
> able to tell that this exception is normal and fail fast. 
> Moreover, even if the action was FAIL, the RetryInvocationHandler looks at 
> all the retry actions from all requests, and FAILOVER_AND_RETRY takes 
> precedence over FAIL action.






[jira] [Commented] (HDFS-14121) Log message about the old hosts file format is misleading

2018-12-07 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712817#comment-16712817
 ] 

Kitti Nanasi commented on HDFS-14121:
-

Thanks [~zvenczel] for the patch!

The patch overall looks good to me. I have only one minor comment: the warning 
message about the empty file content could be more descriptive.

> Log message about the old hosts file format is misleading
> -
>
> Key: HDFS-14121
> URL: https://issues.apache.org/jira/browse/HDFS-14121
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Daniel Templeton
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-14121.01.patch
>
>
> In {{CombinedHostsFileReader.readFile()}} we have the following:
> {code}  LOG.warn("{} has invalid JSON format." +
>   "Try the old format without top-level token defined.", 
> hostsFile);{code}
> That message is trying to say that we tried parsing the hosts file as a 
> well-formed JSON file and failed, so we're going to try again assuming that 
> it's in the old badly-formed format.  What it actually says is that the hosts 
> file is bad, and the admin should try switching to the old format.  Those are 
> two very different things.
> While we're in there, we should refactor the logging so that instead of 
> reporting that we're going to try using a different parser (who the heck 
> cares?), we report that we had to use the old parser to successfully 
> parse the hosts file.
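As a side note, the two string literals in that LOG.warn concatenate without a space, which a quick sketch makes visible. Plain string handling stands in for the SLF4J call, the hostsFile value is hypothetical, and the reworded message at the end is only a suggestion, not the committed fix:

{code:java}
public class LogMessageSketch {
    public static void main(String[] args) {
        String hostsFile = "dfs.hosts.json"; // hypothetical file name
        // The message as concatenated in CombinedHostsFileReader.readFile():
        String current = hostsFile + " has invalid JSON format." +
            "Try the old format without top-level token defined.";
        // Note the missing space between sentences: "...format.Try the old..."
        System.out.println(current);

        // One possible clearer wording (an assumption, not the actual patch):
        String suggested = hostsFile
            + " is not well-formed JSON; retrying with the legacy format"
            + " without a top-level token.";
        System.out.println(suggested);
    }
}
{code}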






[jira] [Assigned] (HDFS-13965) hadoop.security.kerberos.ticket.cache.path setting is not honored when KMS encryption is enabled.

2018-12-04 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi reassigned HDFS-13965:
---

Assignee: Kitti Nanasi

> hadoop.security.kerberos.ticket.cache.path setting is not honored when KMS 
> encryption is enabled.
> -
>
> Key: HDFS-13965
> URL: https://issues.apache.org/jira/browse/HDFS-13965
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, kms
>Affects Versions: 2.7.3, 2.7.7
>Reporter: LOKESKUMAR VIJAYAKUMAR
>Assignee: Kitti Nanasi
>Priority: Major
>
> _We use the *+hadoop.security.kerberos.ticket.cache.path+* setting to provide 
> a custom Kerberos cache path for all Hadoop operations to be run as the 
> specified user. But this setting is not honored when KMS encryption is enabled._
> _The below program to read a file works when KMS encryption is not enabled, 
> but it fails when the KMS encryption is enabled._
> _Looks like *hadoop.security.kerberos.ticket.cache.path* setting is not 
> honored by *createConnection on KMSClientProvider.java.*_
>  
> HadoopTest.java (CLASSPATH needs to be set to compile and run)
>  
> import java.io.InputStream;
> import java.net.URI;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>  
> public class HadoopTest {
>     public static int runRead(String[] args) throws Exception {
>         if (args.length < 3) {
>             System.err.println(
>                 "HadoopTest hadoop_file_path hadoop_user kerberos_cache");
>             return 1;
>         }
>         Path inputPath = new Path(args[0]);
>         Configuration conf = new Configuration();
>         URI defaultURI = FileSystem.getDefaultUri(conf);
>         conf.set("hadoop.security.kerberos.ticket.cache.path", args[2]);
>         FileSystem fs = FileSystem.newInstance(defaultURI, conf, args[1]);
>         InputStream is = fs.open(inputPath);
>         byte[] buffer = new byte[4096];
>         int nr = is.read(buffer);
>         while (nr != -1) {
>             System.out.write(buffer, 0, nr);
>             nr = is.read(buffer);
>         }
>         return 0;
>     }
>
>     public static void main(String[] args) throws Exception {
>         int returnCode = HadoopTest.runRead(args);
>         System.exit(returnCode);
>     }
> }
>  
>  
>  
> [root@lstrost3 testhadoop]# pwd
> /testhadoop
>  
> [root@lstrost3 testhadoop]# ls
> HadoopTest.java
>  
> [root@lstrost3 testhadoop]# export CLASSPATH=`hadoop classpath --glob`:.
>  
> [root@lstrost3 testhadoop]# javac HadoopTest.java
>  
> [root@lstrost3 testhadoop]# java HadoopTest
> HadoopTest  hadoop_file_path  hadoop_user  kerberos_cache
>  
> [root@lstrost3 testhadoop]# java HadoopTest /loki/loki.file loki 
> /tmp/krb5cc_1006
> 18/09/27 23:23:20 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 18/09/27 23:23:21 WARN shortcircuit.DomainSocketFactory: The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> Exception in thread "main" java.io.IOException: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> GSSException: *{color:#FF}No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt){color}*
>     at 
> {color:#FF}*org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:551)*{color}
>     at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:831)
>     at 
> org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388)
>     at 
> org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1393)
>     at 
> org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:1463)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:333)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:327)
>     at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:340)
>     at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:786)
>     at HadoopTest.runRead(HadoopTest.java:18)
>     at HadoopTest.main(HadoopTest.java:29)
> Caused by: 
> 

[jira] [Commented] (HDFS-12946) Add a tool to check rack configuration against EC policies

2018-12-04 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708567#comment-16708567
 ] 

Kitti Nanasi commented on HDFS-12946:
-

Thanks [~jojochuang] for reviewing and committing! I created HDFS-14125 to 
change the logs to use parameterized log format.

> Add a tool to check rack configuration against EC policies
> --
>
> Key: HDFS-12946
> URL: https://issues.apache.org/jira/browse/HDFS-12946
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Xiao Chen
>Assignee: Kitti Nanasi
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-12946.01.patch, HDFS-12946.02.patch, 
> HDFS-12946.03.patch, HDFS-12946.04.fsck.patch, HDFS-12946.05.patch, 
> HDFS-12946.06.patch, HDFS-12946.07.patch, HDFS-12946.08.patch, 
> HDFS-12946.09.patch, HDFS-12946.10.patch, HDFS-12946.11.patch, 
> HDFS-12946.12.patch
>
>
> From testing we have seen setups with problematic racks / datanodes that 
> would not suffice basic EC usages. These are usually found out only after the 
> tests failed.
> We should provide a way to check this beforehand.
> Some scenarios:
> - not enough datanodes compared to EC policy's highest data+parity number
> - not enough racks to satisfy BPPRackFaultTolerant
> - highly uneven racks to satisfy BPPRackFaultTolerant
> - highly uneven racks (so that BPP's considerLoad logic may exclude some busy 
> nodes on the rack, resulting in #2)
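The first two scenarios reduce to simple arithmetic. The sketch below is a simplified model of what such a verifier might compute; the method name, the rack threshold, and the return convention are assumptions for illustration, not ECTopologyVerifier's real logic:

{code:java}
public class EcTopologySketch {
    // Returns null when the topology can support the policy, else a reason.
    static String verify(int dataNodes, int racks, int data, int parity) {
        int width = data + parity; // total blocks in one EC group
        if (dataNodes < width) {
            return "not enough datanodes: need " + width + ", have " + dataNodes;
        }
        if (racks < parity) {
            // BPPRackFaultTolerant spreads blocks across racks; using the
            // parity count as the minimum is a simplification here.
            return "not enough racks: need at least " + parity + ", have " + racks;
        }
        return null;
    }

    public static void main(String[] args) {
        // RS-6-3 on 5 datanodes cannot place all 9 blocks.
        System.out.println(verify(5, 5, 6, 3));
        // 9 datanodes across 3 racks passes this simplified check.
        System.out.println(verify(9, 3, 6, 3));
    }
}
{code}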






[jira] [Created] (HDFS-14125) Use parameterized log format in ECTopologyVerifier

2018-12-04 Thread Kitti Nanasi (JIRA)
Kitti Nanasi created HDFS-14125:
---

 Summary: Use parameterized log format in ECTopologyVerifier
 Key: HDFS-14125
 URL: https://issues.apache.org/jira/browse/HDFS-14125
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.3.0
Reporter: Kitti Nanasi
Assignee: Kitti Nanasi


ECTopologyVerifier introduced in 
[HDFS-12946|https://issues.apache.org/jira/browse/HDFS-12946] should use a 
parameterized log format.






[jira] [Comment Edited] (HDFS-14113) EC : Add Configuration to restrict UserDefined Policies

2018-12-03 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706913#comment-16706913
 ] 

Kitti Nanasi edited comment on HDFS-14113 at 12/3/18 9:56 AM:
--

Thanks [~ayushtkn] for the patch!

The patch overall looks good to me. There are only some checkstyle issues, and 
I think the following assert in TestErasureCodingAddConfig has an empty message 
by mistake.
{code:java}
assertNull("", response[0].getErrorMsg());
{code}


was (Author: knanasi):
Thanks [~ayushtkn] for the patch!

The patch overall looks good to me. There are only some checkstyle issues, and 
I think the following assert has an empty message by mistake.
{code:java}
assertNull("", response[0].getErrorMsg());
{code}

> EC : Add Configuration to restrict UserDefined Policies
> ---
>
> Key: HDFS-14113
> URL: https://issues.apache.org/jira/browse/HDFS-14113
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14113-01.patch
>
>
> By default, the addition of erasure coding policies is enabled for users. We 
> need to add a configuration that controls whether the addition of new 
> user-defined policies is allowed, configured as a Boolean value on the server 
> side.






[jira] [Commented] (HDFS-14113) EC : Add Configuration to restrict UserDefined Policies

2018-12-03 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706913#comment-16706913
 ] 

Kitti Nanasi commented on HDFS-14113:
-

Thanks [~ayushtkn] for the patch!

The patch overall looks good to me. There are only some checkstyle issues, and 
I think the following assert has an empty message by mistake.
{code:java}
assertNull("", response[0].getErrorMsg());
{code}
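For illustration, a minimal stand-in showing why a descriptive message helps when the check fails. assertNull below is a local sketch of org.junit.Assert.assertNull, and the message wording is an assumption, not the actual test text:

{code:java}
public class AssertMessageSketch {
    static void assertNull(String message, Object actual) {
        if (actual != null) {
            // The message surfaces in the failure report instead of "".
            throw new AssertionError(message + " but was: " + actual);
        }
    }

    public static void main(String[] args) {
        String errorMsg = null; // stand-in for response[0].getErrorMsg()
        assertNull("expected no error when adding the EC policy", errorMsg);
        System.out.println("ok");
    }
}
{code}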

> EC : Add Configuration to restrict UserDefined Policies
> ---
>
> Key: HDFS-14113
> URL: https://issues.apache.org/jira/browse/HDFS-14113
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14113-01.patch
>
>
> By default, the addition of erasure coding policies is enabled for users. We 
> need to add a configuration that controls whether the addition of new 
> user-defined policies is allowed, configured as a Boolean value on the server 
> side.






[jira] [Commented] (HDFS-14081) hdfs dfsadmin -metasave metasave_test results NPE

2018-12-03 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706834#comment-16706834
 ] 

Kitti Nanasi commented on HDFS-14081:
-

Thanks [~shwetayakkali] for the new patch!

+1 (non-binding)

> hdfs dfsadmin -metasave metasave_test results NPE
> -
>
> Key: HDFS-14081
> URL: https://issues.apache.org/jira/browse/HDFS-14081
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.1
>Reporter: Shweta
>Assignee: Shweta
>Priority: Major
> Attachments: HDFS-14081.001.patch, HDFS-14081.002.patch, 
> HDFS-14081.003.patch, HDFS-14081.004.patch
>
>
> Race condition is encountered while adding Block to 
> postponedMisreplicatedBlocks which in turn tried to retrieve Block from 
> BlockManager in which it may not be present. 
> This happens in HA, metasave in first NN succeeded but failed in second NN, 
> StackTrace showing NPE is as follows:
> {code}
> 2018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 24 on 8020, call Call#1 Retry#0 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from 
> 172.26.9.163:602342018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: 
> IPC Server handler 24 on 8020, call Call#1 Retry#0 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from 
> 172.26.9.163:60234java.lang.NullPointerException at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseSourceDatanodes(BlockManager.java:2175)
>  at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.dumpBlockMeta(BlockManager.java:830)
>  at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.metaSave(BlockManager.java:762)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1782)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1766)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.metaSave(NameNodeRpcServer.java:1320)
>  at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.metaSave(ClientNamenodeProtocolServerSideTranslatorPB.java:928)
>  at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) {code}






[jira] [Commented] (HDFS-14081) hdfs dfsadmin -metasave metasave_test results NPE

2018-11-30 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704762#comment-16704762
 ] 

Kitti Nanasi commented on HDFS-14081:
-

Thanks for the new patch [~shwetayakkali]! You can use the println method 
instead of printing "\n" separately at the end, but it's just a really minor 
issue. Looks good other than that.

> hdfs dfsadmin -metasave metasave_test results NPE
> -
>
> Key: HDFS-14081
> URL: https://issues.apache.org/jira/browse/HDFS-14081
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.1
>Reporter: Shweta
>Assignee: Shweta
>Priority: Major
> Attachments: HDFS-14081.001.patch, HDFS-14081.002.patch, 
> HDFS-14081.003.patch
>
>
> Race condition is encountered while adding Block to 
> postponedMisreplicatedBlocks which in turn tried to retrieve Block from 
> BlockManager in which it may not be present. 
> This happens in HA, metasave in first NN succeeded but failed in second NN, 
> StackTrace showing NPE is as follows:
> {code}
> 2018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 24 on 8020, call Call#1 Retry#0 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from 
> 172.26.9.163:602342018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: 
> IPC Server handler 24 on 8020, call Call#1 Retry#0 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from 
> 172.26.9.163:60234java.lang.NullPointerException at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseSourceDatanodes(BlockManager.java:2175)
>  at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.dumpBlockMeta(BlockManager.java:830)
>  at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.metaSave(BlockManager.java:762)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1782)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1766)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.metaSave(NameNodeRpcServer.java:1320)
>  at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.metaSave(ClientNamenodeProtocolServerSideTranslatorPB.java:928)
>  at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) {code}






[jira] [Created] (HDFS-14115) TestNamenodeCapacityReport#testXceiverCount is flaky

2018-11-29 Thread Kitti Nanasi (JIRA)
Kitti Nanasi created HDFS-14115:
---

 Summary: TestNamenodeCapacityReport#testXceiverCount is flaky
 Key: HDFS-14115
 URL: https://issues.apache.org/jira/browse/HDFS-14115
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.1.1
Reporter: Kitti Nanasi
Assignee: Kitti Nanasi


TestNamenodeCapacityReport#testXceiverCount sometimes fails with the following 
error:

{code}
2018-11-28 17:33:45,816 INFO  DataNode - PacketResponder: 
BP-645736292-172.17.0.2-1543426416580:blk_1073741828_1004, 
type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=2:[127.0.0.1:37115, 
127.0.0.1:35107] terminating
2018-11-28 17:33:45,817 INFO  StateChange - DIR* completeFile: /f3 is closed by 
DFSClient_NONMAPREDUCE_1933849415_1
2018-11-28 17:33:45,817 INFO  ExitUtil - Exiting with status 1: Block report 
processor encountered fatal exception: java.lang.AssertionError: Negative 
replicas!
2018-11-28 17:33:45,818 ERROR ExitUtil - Terminate called
1: Block report processor encountered fatal exception: 
java.lang.AssertionError: Negative replicas!
at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:4807)
Exception in thread "Block report processor" 1: Block report processor 
encountered fatal exception: java.lang.AssertionError: Negative replicas!
at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:4807)

{code}






[jira] [Commented] (HDFS-14081) hdfs dfsadmin -metasave metasave_test results NPE

2018-11-28 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16701670#comment-16701670
 ] 

Kitti Nanasi commented on HDFS-14081:
-

Thanks [~shwetayakkali] for the patch!

The change looks good to me, printing a log is definitely better than throwing 
an NPE.

I have just one minor comment: the two print statements could be merged into 
one, like this:

{code:java}
out.println("Block " + block + " is null");
{code}
 
+1 (non-binding) pending on that
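A minimal sketch of the guard being discussed. The method shape and names below are assumptions based on the stack trace (BlockManager.dumpBlockMeta), not the actual patch:

{code:java}
public class MetasaveGuardSketch {
    // Simplified stand-in for dumpBlockMeta: log the anomaly and keep going
    // instead of letting metasave die on a NullPointerException.
    static String dumpBlockMeta(Object block, StringBuilder out) {
        if (block == null) {
            out.append("Block ").append(block).append(" is null\n");
            return out.toString();
        }
        out.append("Block ").append(block).append(" ...\n");
        return out.toString();
    }

    public static void main(String[] args) {
        // A block postponed in postponedMisreplicatedBlocks may no longer be
        // present in the BlockManager, so null must be tolerated here.
        System.out.print(dumpBlockMeta(null, new StringBuilder()));
    }
}
{code}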

> hdfs dfsadmin -metasave metasave_test results NPE
> -
>
> Key: HDFS-14081
> URL: https://issues.apache.org/jira/browse/HDFS-14081
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.1
>Reporter: Shweta
>Assignee: Shweta
>Priority: Major
> Attachments: HDFS-14081.001.patch, HDFS-14081.002.patch
>
>
> Race condition is encountered while adding Block to 
> postponedMisreplicatedBlocks which in turn tried to retrieve Block from 
> BlockManager in which it may not be present. 
> This happens in HA, metasave in first NN succeeded but failed in second NN, 
> StackTrace showing NPE is as follows:
> {code}
> 2018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 24 on 8020, call Call#1 Retry#0 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from 
> 172.26.9.163:602342018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: 
> IPC Server handler 24 on 8020, call Call#1 Retry#0 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from 
> 172.26.9.163:60234java.lang.NullPointerException at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseSourceDatanodes(BlockManager.java:2175)
>  at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.dumpBlockMeta(BlockManager.java:830)
>  at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.metaSave(BlockManager.java:762)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1782)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1766)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.metaSave(NameNodeRpcServer.java:1320)
>  at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.metaSave(ClientNamenodeProtocolServerSideTranslatorPB.java:928)
>  at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) {code}






[jira] [Updated] (HDFS-12946) Add a tool to check rack configuration against EC policies

2018-11-28 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-12946:

Attachment: HDFS-12946.11.patch

> Add a tool to check rack configuration against EC policies
> --
>
> Key: HDFS-12946
> URL: https://issues.apache.org/jira/browse/HDFS-12946
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Xiao Chen
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-12946.01.patch, HDFS-12946.02.patch, 
> HDFS-12946.03.patch, HDFS-12946.04.fsck.patch, HDFS-12946.05.patch, 
> HDFS-12946.06.patch, HDFS-12946.07.patch, HDFS-12946.08.patch, 
> HDFS-12946.09.patch, HDFS-12946.10.patch, HDFS-12946.11.patch
>
>
> From testing we have seen setups with problematic racks / datanodes that 
> would not suffice basic EC usages. These are usually found out only after the 
> tests failed.
> We should provide a way to check this beforehand.
> Some scenarios:
> - not enough datanodes compared to EC policy's highest data+parity number
> - not enough racks to satisfy BPPRackFaultTolerant
> - highly uneven racks to satisfy BPPRackFaultTolerant
> - highly uneven racks (so that BPP's considerLoad logic may exclude some busy 
> nodes on the rack, resulting in #2)






[jira] [Updated] (HDFS-12946) Add a tool to check rack configuration against EC policies

2018-11-27 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-12946:

Attachment: HDFS-12946.10.patch

> Add a tool to check rack configuration against EC policies
> --
>
> Key: HDFS-12946
> URL: https://issues.apache.org/jira/browse/HDFS-12946
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Xiao Chen
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-12946.01.patch, HDFS-12946.02.patch, 
> HDFS-12946.03.patch, HDFS-12946.04.fsck.patch, HDFS-12946.05.patch, 
> HDFS-12946.06.patch, HDFS-12946.07.patch, HDFS-12946.08.patch, 
> HDFS-12946.09.patch, HDFS-12946.10.patch
>
>
> From testing we have seen setups with problematic racks / datanodes that 
> would not suffice basic EC usages. These are usually found out only after the 
> tests failed.
> We should provide a way to check this beforehand.
> Some scenarios:
> - not enough datanodes compared to EC policy's highest data+parity number
> - not enough racks to satisfy BPPRackFaultTolerant
> - highly uneven racks to satisfy BPPRackFaultTolerant
> - highly uneven racks (so that BPP's considerLoad logic may exclude some busy 
> nodes on the rack, resulting in #2)






[jira] [Commented] (HDFS-12946) Add a tool to check rack configuration against EC policies

2018-11-27 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16700402#comment-16700402
 ] 

Kitti Nanasi commented on HDFS-12946:
-

Thanks [~xiaochen] for the comments!

In patch v009 I addressed the comments and modified 
FSNamesystem#getVerifyECWithTopologyResult's return type to String to match the 
format of the other entries in the NameNode JMX.

I created HDFS-14061 for running the topology check in 
FSN#enableErasureCodingPolicy.

> Add a tool to check rack configuration against EC policies
> --
>
> Key: HDFS-12946
> URL: https://issues.apache.org/jira/browse/HDFS-12946
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Xiao Chen
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-12946.01.patch, HDFS-12946.02.patch, 
> HDFS-12946.03.patch, HDFS-12946.04.fsck.patch, HDFS-12946.05.patch, 
> HDFS-12946.06.patch, HDFS-12946.07.patch, HDFS-12946.08.patch, 
> HDFS-12946.09.patch
>
>
> From testing we have seen setups with problematic racks / datanodes that 
> do not suffice for basic EC usage. These are usually discovered only after 
> the tests fail.
> We should provide a way to check this beforehand.
> Some scenarios:
> - not enough datanodes compared to the EC policy's highest data+parity number
> - not enough racks to satisfy BPPRackFaultTolerant
> - racks too uneven to satisfy BPPRackFaultTolerant
> - highly uneven racks (so that BPP's considerLoad logic may exclude some busy 
> nodes on a rack, resulting in scenario #2)






[jira] [Updated] (HDFS-12946) Add a tool to check rack configuration against EC policies

2018-11-27 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-12946:

Attachment: HDFS-12946.09.patch

> Add a tool to check rack configuration against EC policies
> --
>
> Key: HDFS-12946
> URL: https://issues.apache.org/jira/browse/HDFS-12946
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Xiao Chen
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-12946.01.patch, HDFS-12946.02.patch, 
> HDFS-12946.03.patch, HDFS-12946.04.fsck.patch, HDFS-12946.05.patch, 
> HDFS-12946.06.patch, HDFS-12946.07.patch, HDFS-12946.08.patch, 
> HDFS-12946.09.patch
>
>
> From testing we have seen setups with problematic racks / datanodes that 
> do not suffice for basic EC usage. These are usually discovered only after 
> the tests fail.
> We should provide a way to check this beforehand.
> Some scenarios:
> - not enough datanodes compared to the EC policy's highest data+parity number
> - not enough racks to satisfy BPPRackFaultTolerant
> - racks too uneven to satisfy BPPRackFaultTolerant
> - highly uneven racks (so that BPP's considerLoad logic may exclude some busy 
> nodes on a rack, resulting in scenario #2)






[jira] [Commented] (HDFS-14050) Use parameterized logging construct in NamenodeFsck class

2018-11-20 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16693002#comment-16693002
 ] 

Kitti Nanasi commented on HDFS-14050:
-

Thanks [~hgadre] for the new patch!

+1 (non-binding), pending the checkstyle fixes.

> Use parameterized logging construct in NamenodeFsck class
> -
>
> Key: HDFS-14050
> URL: https://issues.apache.org/jira/browse/HDFS-14050
> Project: Hadoop HDFS
>  Issue Type: Task
>Affects Versions: 3.0.0
>Reporter: Hrishikesh Gadre
>Assignee: Hrishikesh Gadre
>Priority: Trivial
> Attachments: HDFS-14050-001.patch, HDFS-14050-002.patch, 
> HDFS-14050-003.patch, HDFS-14050-004.patch
>
>
> HDFS-13695 implemented a change to use an slf4j logger (instead of commons 
> logging). But the NamenodeFsck class still does not use the parameterized 
> logging construct. This came up during the code review for HADOOP-11391. We 
> should change the logging statements in NamenodeFsck to use the slf4j-style 
> parameterized logging APIs.
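Parameterized logging defers message construction until the log level is known to be enabled, which is the point of moving NamenodeFsck off string concatenation. Since slf4j itself is an external dependency, the sketch below uses a tiny hypothetical stand-in (not the real slf4j API) just to show the mechanics of `{}` placeholders:

```java
public class ParamLogDemo {
    // Stand-in for an slf4j Logger: substitutes "{}" placeholders only
    // when the level is enabled, so disabled log calls stay cheap.
    static boolean debugEnabled = false;

    static String format(String msg, Object... args) {
        StringBuilder sb = new StringBuilder();
        int argIdx = 0, from = 0, at;
        while ((at = msg.indexOf("{}", from)) >= 0 && argIdx < args.length) {
            sb.append(msg, from, at).append(args[argIdx++]);
            from = at + 2;
        }
        return sb.append(msg.substring(from)).toString();
    }

    static void debug(String msg, Object... args) {
        if (debugEnabled) {               // no formatting cost when disabled
            System.out.println(format(msg, args));
        }
    }

    public static void main(String[] args) {
        // Old style: "Scanning block " + block + " on " + node pays the
        // concatenation cost even when DEBUG is off. Parameterized style:
        debug("Scanning block {} on datanode {}", "blk_1", "dn-127.0.0.1");
        debugEnabled = true;
        debug("Scanning block {} on datanode {}", "blk_1", "dn-127.0.0.1");
        // prints: Scanning block blk_1 on datanode dn-127.0.0.1
    }
}
```

With the real slf4j `Logger`, the call shape is the same: `LOG.debug("Scanning block {} on datanode {}", block, node)`.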






[jira] [Commented] (HDFS-14054) TestLeaseRecovery2: testHardLeaseRecoveryAfterNameNodeRestart2 and testHardLeaseRecoveryWithRenameAfterNameNodeRestart are flaky

2018-11-15 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687873#comment-16687873
 ] 

Kitti Nanasi commented on HDFS-14054:
-

Thanks [~zvenczel] for the patch and [~elgoiri] for the review! The change 
looks good to me, too.
+1 (non-binding)

> TestLeaseRecovery2: testHardLeaseRecoveryAfterNameNodeRestart2 and 
> testHardLeaseRecoveryWithRenameAfterNameNodeRestart are flaky
> 
>
> Key: HDFS-14054
> URL: https://issues.apache.org/jira/browse/HDFS-14054
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.3
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
>  Labels: flaky-test
> Attachments: HDFS-14054.01.patch
>
>
> ---
>  T E S T S
> ---
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support 
> was removed in 8.0
> Running org.apache.hadoop.hdfs.TestLeaseRecovery2
> Tests run: 7, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 68.971 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestLeaseRecovery2
> testHardLeaseRecoveryAfterNameNodeRestart2(org.apache.hadoop.hdfs.TestLeaseRecovery2)
>   Time elapsed: 4.375 sec  <<< FAILURE!
> java.lang.AssertionError: lease holder should now be the NN
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.checkLease(TestLeaseRecovery2.java:568)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.hardLeaseRecoveryRestartHelper(TestLeaseRecovery2.java:520)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart2(TestLeaseRecovery2.java:437)
> testHardLeaseRecoveryWithRenameAfterNameNodeRestart(org.apache.hadoop.hdfs.TestLeaseRecovery2)
>   Time elapsed: 4.339 sec  <<< FAILURE!
> java.lang.AssertionError: lease holder should now be the NN
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.checkLease(TestLeaseRecovery2.java:568)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.hardLeaseRecoveryRestartHelper(TestLeaseRecovery2.java:520)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryWithRenameAfterNameNodeRestart(TestLeaseRecovery2.java:443)
> Results :
> Failed tests: 
>   
> TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart2:437->hardLeaseRecoveryRestartHelper:520->checkLease:568
>  lease holder should now be the NN
>   
> TestLeaseRecovery2.testHardLeaseRecoveryWithRenameAfterNameNodeRestart:443->hardLeaseRecoveryRestartHelper:520->checkLease:568
>  lease holder should now be the NN
> Tests run: 7, Failures: 2, Errors: 0, Skipped: 0






[jira] [Commented] (HDFS-14064) WEBHDFS: Support Enable/Disable EC Policy

2018-11-13 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685000#comment-16685000
 ] 

Kitti Nanasi commented on HDFS-14064:
-

Thanks for the new patch [~ayushtkn]!

Iterating through the policies in the tests could be extracted into a helper 
function for better readability.
+1 (non-binding), pending that change.


> WEBHDFS: Support Enable/Disable EC Policy
> -
>
> Key: HDFS-14064
> URL: https://issues.apache.org/jira/browse/HDFS-14064
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14064-01.patch, HDFS-14064-02.patch, 
> HDFS-14064-03.patch
>
>







[jira] [Commented] (HDFS-14064) WEBHDFS: Support Enable/Disable EC Policy

2018-11-12 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683591#comment-16683591
 ] 

Kitti Nanasi commented on HDFS-14064:
-

Thanks [~ayushtkn] for working on this!
The code looks good to me; I just have a few minor comments about the tests:
- The IOException shouldn't be caught in the tests: it is not expected there, 
and catching it will hide actual errors.
- The test case should fail if the policy is not found, instead of silently 
succeeding.
- I would add an assertion after disablePolicy to make sure we are really 
running the test on a disabled policy. For example, if the default policy could 
not be disabled (which is not the case currently), the enable-policy test would 
succeed but would not really test anything.
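The review points above amount to a common test-structure pattern: assert the precondition, call the operation without a try/catch, assert the postcondition. A minimal sketch against a hypothetical in-memory policy map (illustrative names only, not the WebHDFS/DistributedFileSystem API under review):

```java
import java.util.HashMap;
import java.util.Map;

public class EnablePolicyTestSketch {
    // Hypothetical stand-in for the cluster's EC policy state.
    static final Map<String, Boolean> POLICIES =
        new HashMap<>(Map.of("RS-6-3-1024k", false));

    static void enablePolicy(String name) {
        if (!POLICIES.containsKey(name)) {
            // Fail loudly instead of silently succeeding when the
            // policy under test does not exist.
            throw new IllegalStateException("policy not found: " + name);
        }
        POLICIES.put(name, true);
    }

    static boolean isEnabled(String name) {
        return Boolean.TRUE.equals(POLICIES.get(name));
    }

    public static void main(String[] args) {
        String name = "RS-6-3-1024k";
        // 1. Precondition: the policy really starts disabled, so the call
        //    below exercises a real state change.
        if (isEnabled(name)) {
            throw new AssertionError("precondition: policy must start disabled");
        }
        // 2. No try/catch around the call under test: an unexpected exception
        //    should fail the test rather than be swallowed.
        enablePolicy(name);
        // 3. Postcondition: assert the state change under test.
        if (!isEnabled(name)) {
            throw new AssertionError("policy should be enabled");
        }
        System.out.println("ok");
    }
}
```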




> WEBHDFS: Support Enable/Disable EC Policy
> -
>
> Key: HDFS-14064
> URL: https://issues.apache.org/jira/browse/HDFS-14064
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14064-01.patch, HDFS-14064-02.patch
>
>







[jira] [Assigned] (HDFS-14060) HDFS fetchdt command to return error codes on success/failure

2018-11-09 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi reassigned HDFS-14060:
---

Assignee: Kitti Nanasi

> HDFS fetchdt command to return error codes on success/failure
> -
>
> Key: HDFS-14060
> URL: https://issues.apache.org/jira/browse/HDFS-14060
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Kitti Nanasi
>Priority: Major
>
> The {{hdfs fetchdt}} command always returns 0, even when there has been an 
> error (no token issued, no file to load, usage error, etc.). This means it is 
> not that useful as a command-line tool for testing or in scripts.
> Proposed: exit non-zero on errors; reuse LauncherExitCodes for these.
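A sketch of the proposal: a dispatcher that maps fetchdt outcomes to distinct exit codes instead of always 0. The constants and option names here are illustrative placeholders; an actual patch would reuse the constants from Hadoop's service launcher (LauncherExitCodes).

```java
public class FetchdtExitCodes {
    // Illustrative values; a real patch would reuse Hadoop's
    // org.apache.hadoop.service.launcher.LauncherExitCodes constants.
    static final int EXIT_SUCCESS = 0;
    static final int EXIT_USAGE = 40;
    static final int EXIT_NOT_FOUND = 44;

    /** Simplified dispatcher: compute the exit code instead of always 0. */
    static int run(String[] args) {
        if (args.length == 0) {
            System.err.println("Usage: fetchdt [--print|--cancel] <token file>");
            return EXIT_USAGE;                 // usage error is no longer silent
        }
        String tokenFile = args[args.length - 1];
        if (hasFlag(args, "--print") && !new java.io.File(tokenFile).exists()) {
            return EXIT_NOT_FOUND;             // no file to load: report it
        }
        return EXIT_SUCCESS;
    }

    static boolean hasFlag(String[] args, String flag) {
        for (String a : args) {
            if (a.equals(flag)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A real tool would end with System.exit(run(args)); printed here instead.
        System.out.println(run(new String[0]));                            // 40
        System.out.println(run(new String[]{"--print", "/no/such/file"})); // 44
    }
}
```

Scripts can then branch on `$?` reliably, which is the whole point of the change.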






  1   2   3   >