[jira] [Commented] (HDFS-13933) [JDK 11] SWebhdfsFileSystem related tests fail with hostname verification problems for "localhost"
[ https://issues.apache.org/jira/browse/HDFS-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830266#comment-16830266 ]

Kitti Nanasi commented on HDFS-13933:
-------------------------------------

I was not correct in my previous comment. Looking into it a bit more, these tests fail with a "javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated" exception thrown where sslSession.getPeerCertificates() is invoked (it is used in 3 different places in our code). I think it is caused by the following OpenJDK bug:
[https://bugs.openjdk.java.net/browse/JDK-8212885]
[https://bugs.openjdk.java.net/browse/JDK-8220723]

The issue affects OpenJDK 11.0.2, and it seems the fix was backported to OpenJDK 11.0.3 and OpenJDK 12.0.1. I verified that these tests pass with OpenJDK 12.0.1.

> [JDK 11] SWebhdfsFileSystem related tests fail with hostname verification
> problems for "localhost"
> -------------------------------------------------------------------------
>
> Key: HDFS-13933
> URL: https://issues.apache.org/jira/browse/HDFS-13933
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Reporter: Andrew Purtell
> Priority: Minor
>
> Tests with issues:
> * TestHttpFSFWithSWebhdfsFileSystem
> * TestWebHdfsTokens
> * TestSWebHdfsFileContextMainOperations
> Possibly others. Failure looks like
> {noformat}
> java.io.IOException: localhost:50260: HTTPS hostname wrong: should be
>
> {noformat}
> These tests set up a trust store and use HTTPS connections, and with Java 11
> the client validation of the server name in the generated self-signed
> certificate is failing. Exceptions originate in the JRE's HTTP client
> library. How everything hooks together uses static initializers, static
> methods, JUnit MethodRules... There's a lot to unpack, not sure how to fix.
> This is Java 11+28.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
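As a rough illustration of the version gate described in the comment above, the sketch below checks whether the running JVM falls in the affected range (OpenJDK 11.0.0 through 11.0.2, plus 12 GA, per the backport targets 11.0.3 and 12.0.1). This is a hypothetical helper, not part of the Hadoop code base:

```java
import java.util.List;

public class JdkTlsBugCheck {
    /**
     * Returns true when the given feature/interim/update triple is a release
     * affected by JDK-8212885 / JDK-8220723 (fix shipped in 11.0.3 and 12.0.1,
     * per the comment above). Range bounds are an assumption for illustration.
     */
    public static boolean isAffected(int feature, int interim, int update) {
        if (feature == 11) {
            return interim == 0 && update <= 2;
        }
        return feature == 12 && interim == 0 && update == 0;
    }

    /** Checks the currently running JVM via Runtime.version(). */
    public static boolean currentJvmAffected() {
        List<Integer> v = Runtime.version().version();
        int feature = v.size() > 0 ? v.get(0) : 0;
        int interim = v.size() > 1 ? v.get(1) : 0;
        int update  = v.size() > 2 ? v.get(2) : 0;
        return isAffected(feature, interim, update);
    }

    public static void main(String[] args) {
        System.out.println("11.0.2 affected: " + isAffected(11, 0, 2));
        System.out.println("11.0.3 affected: " + isAffected(11, 0, 3));
        System.out.println("this JVM affected: " + currentJvmAffected());
    }
}
```

A check like this could be used to skip the TLS-session-dependent tests on known-bad JDK update releases rather than letting them fail with SSLPeerUnverifiedException.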
[jira] [Commented] (HDFS-13933) [JDK 11] SWebhdfsFileSystem related tests fail with hostname verification problems for "localhost"
[ https://issues.apache.org/jira/browse/HDFS-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826978#comment-16826978 ]

Kitti Nanasi commented on HDFS-13933:
-------------------------------------

The affected tests all use the HttpsURLConnection and HttpURLConnection classes, which have a better alternative in JDK 11. We might need to use the new HttpClient instead, but let's see if we can fix the current implementation first.
Related article: [https://dzone.com/articles/java-11-standardized-http-client-api]

> [JDK 11] SWebhdfsFileSystem related tests fail with hostname verification
> problems for "localhost"
> -------------------------------------------------------------------------
>
> Key: HDFS-13933
> URL: https://issues.apache.org/jira/browse/HDFS-13933
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Reporter: Andrew Purtell
> Priority: Minor
>
> Tests with issues:
> * TestHttpFSFWithSWebhdfsFileSystem
> * TestWebHdfsTokens
> * TestSWebHdfsFileContextMainOperations
> Possibly others. Failure looks like
> {noformat}
> java.io.IOException: localhost:50260: HTTPS hostname wrong: should be
>
> {noformat}
> These tests set up a trust store and use HTTPS connections, and with Java 11
> the client validation of the server name in the generated self-signed
> certificate is failing. Exceptions originate in the JRE's HTTP client
> library. How everything hooks together uses static initializers, static
> methods, JUnit MethodRules... There's a lot to unpack, not sure how to fix.
> This is Java 11+28.
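For reference, the JDK 11 java.net.http.HttpClient mentioned in the comment above removes much of the HttpsURLConnection boilerplate. A minimal sketch (illustrative only, not the proposed Hadoop change; the WebHDFS URL is made up, and the send() call is commented out so the sketch runs without a live server):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

public class HttpClientSketch {
    // Build a GET request the way the old HttpsURLConnection code would set one
    // up; HttpClient performs TLS with the default SSLContext unless another is
    // supplied via HttpClient.Builder.sslContext(...).
    public static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder()
                .uri(URI.create(url))
                .timeout(Duration.ofSeconds(30))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request =
                buildRequest("https://localhost:50260/webhdfs/v1/?op=LISTSTATUS");
        // client.send(request, HttpResponse.BodyHandlers.ofString()) would
        // perform the call and throw on certificate/hostname problems.
        System.out.println(request.method() + " " + request.uri());
    }
}
```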
[jira] [Commented] (HDFS-13933) [JDK 11] SWebhdfsFileSystem related tests fail with hostname verification problems for "localhost"
[ https://issues.apache.org/jira/browse/HDFS-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826866#comment-16826866 ]

Kitti Nanasi commented on HDFS-13933:
-------------------------------------

Thanks [~apurtell] for reporting this issue and [~smeng] for the further details! It seems that all three tests fail with OpenJDK 11, but they succeed with Zulu JDK 11.

> [JDK 11] SWebhdfsFileSystem related tests fail with hostname verification
> problems for "localhost"
> -------------------------------------------------------------------------
>
> Key: HDFS-13933
> URL: https://issues.apache.org/jira/browse/HDFS-13933
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Reporter: Andrew Purtell
> Priority: Minor
>
> Tests with issues:
> * TestHttpFSFWithSWebhdfsFileSystem
> * TestWebHdfsTokens
> * TestSWebHdfsFileContextMainOperations
> Possibly others. Failure looks like
> {noformat}
> java.io.IOException: localhost:50260: HTTPS hostname wrong: should be
>
> {noformat}
> These tests set up a trust store and use HTTPS connections, and with Java 11
> the client validation of the server name in the generated self-signed
> certificate is failing. Exceptions originate in the JRE's HTTP client
> library. How everything hooks together uses static initializers, static
> methods, JUnit MethodRules... There's a lot to unpack, not sure how to fix.
> This is Java 11+28.
[jira] [Commented] (HDDS-1192) Support -conf command line argument in GenericCli
[ https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16817916#comment-16817916 ]

Kitti Nanasi commented on HDDS-1192:
------------------------------------

[~elek], could you review this?

> Support -conf command line argument in GenericCli
> -------------------------------------------------
>
> Key: HDDS-1192
> URL: https://issues.apache.org/jira/browse/HDDS-1192
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: Elek, Marton
> Assignee: Kitti Nanasi
> Priority: Major
> Labels: newbie, pull-request-available
> Attachments: HDDS-1192.001.patch, HDDS-1192.002.patch,
> HDDS-1192.003.patch, HDDS-1192.004.patch, HDDS-1192.005.patch
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> org.apache.hadoop.hdds.GenericCli is the base class for all the Ozone-related
> command line applications. It supports defining custom configuration
> variables (-D or --set), but doesn't support the '--conf ozone-site.xml'
> argument to load an external XML file into the configuration.
> The Configuration and OzoneConfiguration classes load ozone-site.xml from the
> classpath, which makes it very hard to start Ozone components in an IDE, as
> we can't modify the classpath easily.
> One option here is to support --conf everywhere to make it possible to start
> an Ozone cluster in the IDE.
> Note: it's a nice-to-have for 0.4.0. I marked it as 0.5.0, but it is safe to
> commit to 0.4.0 at any time.
[jira] [Updated] (HDDS-1192) Support -conf command line argument in GenericCli
[ https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kitti Nanasi updated HDDS-1192:
-------------------------------

    Attachment: HDDS-1192.005.patch

> Support -conf command line argument in GenericCli
> -------------------------------------------------
>
> Key: HDDS-1192
> URL: https://issues.apache.org/jira/browse/HDDS-1192
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: Elek, Marton
> Assignee: Kitti Nanasi
> Priority: Major
> Labels: newbie, pull-request-available
> Attachments: HDDS-1192.001.patch, HDDS-1192.002.patch,
> HDDS-1192.003.patch, HDDS-1192.004.patch, HDDS-1192.005.patch
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> org.apache.hadoop.hdds.GenericCli is the base class for all the Ozone-related
> command line applications. It supports defining custom configuration
> variables (-D or --set), but doesn't support the '--conf ozone-site.xml'
> argument to load an external XML file into the configuration.
> The Configuration and OzoneConfiguration classes load ozone-site.xml from the
> classpath, which makes it very hard to start Ozone components in an IDE, as
> we can't modify the classpath easily.
> One option here is to support --conf everywhere to make it possible to start
> an Ozone cluster in the IDE.
> Note: it's a nice-to-have for 0.4.0. I marked it as 0.5.0, but it is safe to
> commit to 0.4.0 at any time.
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16817679#comment-16817679 ]

Kitti Nanasi commented on HDFS-14353:
-------------------------------------

Thanks [~maobaolong] for reporting this issue and providing a fix! Could you add a unit test? I think TestReconstructStripedFile could be extended with this check.

> Erasure Coding: metrics xmitsInProgress become to negative.
> -----------------------------------------------------------
>
> Key: HDFS-14353
> URL: https://issues.apache.org/jira/browse/HDFS-14353
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, erasure-coding
> Affects Versions: 3.3.0
> Reporter: maobaolong
> Assignee: maobaolong
> Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14353.001.patch, screenshot-1.png
>
[jira] [Updated] (HDDS-1192) Support -conf command line argument in GenericCli
[ https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kitti Nanasi updated HDDS-1192:
-------------------------------

    Attachment: HDDS-1192.004.patch

> Support -conf command line argument in GenericCli
> -------------------------------------------------
>
> Key: HDDS-1192
> URL: https://issues.apache.org/jira/browse/HDDS-1192
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: Elek, Marton
> Assignee: Kitti Nanasi
> Priority: Major
> Labels: newbie, pull-request-available
> Attachments: HDDS-1192.001.patch, HDDS-1192.002.patch,
> HDDS-1192.003.patch, HDDS-1192.004.patch
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> org.apache.hadoop.hdds.GenericCli is the base class for all the Ozone-related
> command line applications. It supports defining custom configuration
> variables (-D or --set), but doesn't support the '--conf ozone-site.xml'
> argument to load an external XML file into the configuration.
> The Configuration and OzoneConfiguration classes load ozone-site.xml from the
> classpath, which makes it very hard to start Ozone components in an IDE, as
> we can't modify the classpath easily.
> One option here is to support --conf everywhere to make it possible to start
> an Ozone cluster in the IDE.
> Note: it's a nice-to-have for 0.4.0. I marked it as 0.5.0, but it is safe to
> commit to 0.4.0 at any time.
[jira] [Updated] (HDDS-1192) Support -conf command line argument in GenericCli
[ https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kitti Nanasi updated HDDS-1192:
-------------------------------

    Attachment: HDDS-1192.003.patch

> Support -conf command line argument in GenericCli
> -------------------------------------------------
>
> Key: HDDS-1192
> URL: https://issues.apache.org/jira/browse/HDDS-1192
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: Elek, Marton
> Assignee: Kitti Nanasi
> Priority: Major
> Labels: newbie, pull-request-available
> Attachments: HDDS-1192.001.patch, HDDS-1192.002.patch,
> HDDS-1192.003.patch
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> org.apache.hadoop.hdds.GenericCli is the base class for all the Ozone-related
> command line applications. It supports defining custom configuration
> variables (-D or --set), but doesn't support the '--conf ozone-site.xml'
> argument to load an external XML file into the configuration.
> The Configuration and OzoneConfiguration classes load ozone-site.xml from the
> classpath, which makes it very hard to start Ozone components in an IDE, as
> we can't modify the classpath easily.
> One option here is to support --conf everywhere to make it possible to start
> an Ozone cluster in the IDE.
> Note: it's a nice-to-have for 0.4.0. I marked it as 0.5.0, but it is safe to
> commit to 0.4.0 at any time.
[jira] [Commented] (HDDS-1192) Support -conf command line argument in GenericCli
[ https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815660#comment-16815660 ]

Kitti Nanasi commented on HDDS-1192:
------------------------------------

I fixed the findbugs issue in patch v002. The test failures are not related.

> Support -conf command line argument in GenericCli
> -------------------------------------------------
>
> Key: HDDS-1192
> URL: https://issues.apache.org/jira/browse/HDDS-1192
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: Elek, Marton
> Assignee: Kitti Nanasi
> Priority: Major
> Labels: newbie, pull-request-available
> Attachments: HDDS-1192.001.patch, HDDS-1192.002.patch
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> org.apache.hadoop.hdds.GenericCli is the base class for all the Ozone-related
> command line applications. It supports defining custom configuration
> variables (-D or --set), but doesn't support the '--conf ozone-site.xml'
> argument to load an external XML file into the configuration.
> The Configuration and OzoneConfiguration classes load ozone-site.xml from the
> classpath, which makes it very hard to start Ozone components in an IDE, as
> we can't modify the classpath easily.
> One option here is to support --conf everywhere to make it possible to start
> an Ozone cluster in the IDE.
> Note: it's a nice-to-have for 0.4.0. I marked it as 0.5.0, but it is safe to
> commit to 0.4.0 at any time.
[jira] [Updated] (HDDS-1192) Support -conf command line argument in GenericCli
[ https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kitti Nanasi updated HDDS-1192:
-------------------------------

    Attachment: HDDS-1192.002.patch

> Support -conf command line argument in GenericCli
> -------------------------------------------------
>
> Key: HDDS-1192
> URL: https://issues.apache.org/jira/browse/HDDS-1192
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: Elek, Marton
> Assignee: Kitti Nanasi
> Priority: Major
> Labels: newbie, pull-request-available
> Attachments: HDDS-1192.001.patch, HDDS-1192.002.patch
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> org.apache.hadoop.hdds.GenericCli is the base class for all the Ozone-related
> command line applications. It supports defining custom configuration
> variables (-D or --set), but doesn't support the '--conf ozone-site.xml'
> argument to load an external XML file into the configuration.
> The Configuration and OzoneConfiguration classes load ozone-site.xml from the
> classpath, which makes it very hard to start Ozone components in an IDE, as
> we can't modify the classpath easily.
> One option here is to support --conf everywhere to make it possible to start
> an Ozone cluster in the IDE.
> Note: it's a nice-to-have for 0.4.0. I marked it as 0.5.0, but it is safe to
> commit to 0.4.0 at any time.
[jira] [Assigned] (HDFS-14060) HDFS fetchdt command to return error codes on success/failure
[ https://issues.apache.org/jira/browse/HDFS-14060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kitti Nanasi reassigned HDFS-14060:
-----------------------------------

    Assignee: (was: Kitti Nanasi)

> HDFS fetchdt command to return error codes on success/failure
> -------------------------------------------------------------
>
> Key: HDFS-14060
> URL: https://issues.apache.org/jira/browse/HDFS-14060
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: tools
> Affects Versions: 3.3.0
> Reporter: Steve Loughran
> Priority: Major
>
> The {{hdfs fetchdt}} command always returns 0, even when there's been an
> error (no token issued, no file to load, usage, etc.). This means it's not
> that useful as a command line tool for testing or in scripts.
> Proposed: exit non-zero for errors; reuse LauncherExitCodes for these.
[jira] [Assigned] (HDFS-14115) TestNamenodeCapacityReport#testXceiverCount is flaky
[ https://issues.apache.org/jira/browse/HDFS-14115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kitti Nanasi reassigned HDFS-14115:
-----------------------------------

    Assignee: (was: Kitti Nanasi)

> TestNamenodeCapacityReport#testXceiverCount is flaky
> ----------------------------------------------------
>
> Key: HDFS-14115
> URL: https://issues.apache.org/jira/browse/HDFS-14115
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.1.1
> Reporter: Kitti Nanasi
> Priority: Major
>
> TestNamenodeCapacityReport#testXceiverCount sometimes fails with the
> following error:
> {code}
> 2018-11-28 17:33:45,816 INFO DataNode - PacketResponder: BP-645736292-172.17.0.2-1543426416580:blk_1073741828_1004, type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=2:[127.0.0.1:37115, 127.0.0.1:35107] terminating
> 2018-11-28 17:33:45,817 INFO StateChange - DIR* completeFile: /f3 is closed by DFSClient_NONMAPREDUCE_1933849415_1
> 2018-11-28 17:33:45,817 INFO ExitUtil - Exiting with status 1: Block report processor encountered fatal exception: java.lang.AssertionError: Negative replicas!
> 2018-11-28 17:33:45,818 ERROR ExitUtil - Terminate called
> 1: Block report processor encountered fatal exception: java.lang.AssertionError: Negative replicas!
> 	at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)
> 	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:4807)
> Exception in thread "Block report processor" 1: Block report processor encountered fatal exception: java.lang.AssertionError: Negative replicas!
> 	at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)
> 	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:4807)
> {code}
[jira] [Commented] (HDDS-1192) Support -conf command line argument in GenericCli
[ https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813412#comment-16813412 ]

Kitti Nanasi commented on HDDS-1192:
------------------------------------

Patch v001 contains the following modifications; I did some additional refactoring to make the unit tests pass:
* GenericCli now accepts the -conf argument.
* The usage field is removed from MissingSubcommandException, because it now derives from CommandLine.ParameterException, which means that when the commands are run with the default exception handler (the default exception handler is used everywhere except in the tests), picocli will print the usage.
* The startup message is written inside the ozone 'datanode' command and not in the constructor, so if the command is invalid, the startup message won't be printed.
* I changed it so that if the 'ozone datanode' command is run by itself, it will not throw an invalid command exception; it will only fail if it is used with an invalid argument (like 'ozone datanode -invalidArg'). Let me know if that is not ok.

Note that the subcommands, like volume and bucket, do not derive from GenericCli, so the -conf parameter has to come before those commands, like this: 'ozone sh -conf conf volume...'

> Support -conf command line argument in GenericCli
> -------------------------------------------------
>
> Key: HDDS-1192
> URL: https://issues.apache.org/jira/browse/HDDS-1192
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: Elek, Marton
> Assignee: Kitti Nanasi
> Priority: Major
> Labels: newbie, pull-request-available
> Attachments: HDDS-1192.001.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> org.apache.hadoop.hdds.GenericCli is the base class for all the Ozone-related
> command line applications. It supports defining custom configuration
> variables (-D or --set), but doesn't support the '--conf ozone-site.xml'
> argument to load an external XML file into the configuration.
> The Configuration and OzoneConfiguration classes load ozone-site.xml from the
> classpath, which makes it very hard to start Ozone components in an IDE, as
> we can't modify the classpath easily.
> One option here is to support --conf everywhere to make it possible to start
> an Ozone cluster in the IDE.
> Note: it's a nice-to-have for 0.4.0. I marked it as 0.5.0, but it is safe to
> commit to 0.4.0 at any time.
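The argument handling the patch adds can be sketched roughly as follows. The real GenericCli uses picocli option annotations; this simplified stand-in (hypothetical class and method names) just shows the intended contract of pulling a "-conf <file>" pair out of the command line before the subcommands see it:

```java
import java.util.ArrayList;
import java.util.List;

public class ConfArgSketch {
    /**
     * Scans args for "-conf <file>" (or "--conf <file>"); returns the file
     * path, or null if absent, and collects the remaining args into rest.
     * Simplified stand-in for the picocli-based handling in GenericCli.
     */
    public static String extractConf(String[] args, List<String> rest) {
        String confFile = null;
        for (int i = 0; i < args.length; i++) {
            if ("-conf".equals(args[i]) || "--conf".equals(args[i])) {
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("-conf requires a file argument");
                }
                confFile = args[++i];
            } else {
                rest.add(args[i]);
            }
        }
        return confFile;
    }

    public static void main(String[] args) {
        List<String> rest = new ArrayList<>();
        String conf = extractConf(
                new String[]{"-conf", "ozone-site.xml", "volume", "list"}, rest);
        // In the real patch the file would then be added to the
        // OzoneConfiguration, e.g. configuration.addResource(...).
        System.out.println("conf=" + conf + " rest=" + rest);
    }
}
```

This matches the note above: since subcommands don't derive from GenericCli, the -conf pair must appear before them on the command line.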
[jira] [Updated] (HDDS-1192) Support -conf command line argument in GenericCli
[ https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kitti Nanasi updated HDDS-1192:
-------------------------------

    Status: Patch Available (was: Open)

> Support -conf command line argument in GenericCli
> -------------------------------------------------
>
> Key: HDDS-1192
> URL: https://issues.apache.org/jira/browse/HDDS-1192
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: Elek, Marton
> Assignee: Kitti Nanasi
> Priority: Major
> Labels: newbie, pull-request-available
> Attachments: HDDS-1192.001.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> org.apache.hadoop.hdds.GenericCli is the base class for all the Ozone-related
> command line applications. It supports defining custom configuration
> variables (-D or --set), but doesn't support the '--conf ozone-site.xml'
> argument to load an external XML file into the configuration.
> The Configuration and OzoneConfiguration classes load ozone-site.xml from the
> classpath, which makes it very hard to start Ozone components in an IDE, as
> we can't modify the classpath easily.
> One option here is to support --conf everywhere to make it possible to start
> an Ozone cluster in the IDE.
> Note: it's a nice-to-have for 0.4.0. I marked it as 0.5.0, but it is safe to
> commit to 0.4.0 at any time.
[jira] [Updated] (HDDS-1192) Support -conf command line argument in GenericCli
[ https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kitti Nanasi updated HDDS-1192:
-------------------------------

    Attachment: HDDS-1192.001.patch

> Support -conf command line argument in GenericCli
> -------------------------------------------------
>
> Key: HDDS-1192
> URL: https://issues.apache.org/jira/browse/HDDS-1192
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: Elek, Marton
> Assignee: Kitti Nanasi
> Priority: Major
> Labels: newbie
> Attachments: HDDS-1192.001.patch
>
> org.apache.hadoop.hdds.GenericCli is the base class for all the Ozone-related
> command line applications. It supports defining custom configuration
> variables (-D or --set), but doesn't support the '--conf ozone-site.xml'
> argument to load an external XML file into the configuration.
> The Configuration and OzoneConfiguration classes load ozone-site.xml from the
> classpath, which makes it very hard to start Ozone components in an IDE, as
> we can't modify the classpath easily.
> One option here is to support --conf everywhere to make it possible to start
> an Ozone cluster in the IDE.
> Note: it's a nice-to-have for 0.4.0. I marked it as 0.5.0, but it is safe to
> commit to 0.4.0 at any time.
[jira] [Updated] (HDFS-14404) Reduce KMS error logging severity from WARN to INFO
[ https://issues.apache.org/jira/browse/HDFS-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kitti Nanasi updated HDFS-14404:
--------------------------------

    Status: Patch Available (was: Open)

> Reduce KMS error logging severity from WARN to INFO
> ---------------------------------------------------
>
> Key: HDFS-14404
> URL: https://issues.apache.org/jira/browse/HDFS-14404
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: kms
> Affects Versions: 3.2.0
> Reporter: Kitti Nanasi
> Assignee: Kitti Nanasi
> Priority: Trivial
> Attachments: HDFS-14404.001.patch
>
> When the KMS is deployed as an HA service and a failure occurs, the current
> error severity in the client code appears to be WARN. This can result in
> excessive errors despite the fact that another instance may succeed.
> Maybe this log level can be adjusted in only the load balancing provider.
> {code}
> 19/02/27 05:10:10 WARN kms.LoadBalancingKMSClientProvider: KMS provider at [https://example.com:16000/kms/v1/] threw an IOException [java.net.ConnectException: Connection refused (Connection refused)]!!
> 19/02/12 20:50:09 WARN kms.LoadBalancingKMSClientProvider: KMS provider at [https://example.com:16000/kms/v1/] threw an IOException: java.io.IOException: java.lang.reflect.UndeclaredThrowableException
> {code}
[jira] [Updated] (HDFS-14404) Reduce KMS error logging severity from WARN to INFO
[ https://issues.apache.org/jira/browse/HDFS-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kitti Nanasi updated HDFS-14404:
--------------------------------

    Attachment: HDFS-14404.001.patch

> Reduce KMS error logging severity from WARN to INFO
> ---------------------------------------------------
>
> Key: HDFS-14404
> URL: https://issues.apache.org/jira/browse/HDFS-14404
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: kms
> Affects Versions: 3.2.0
> Reporter: Kitti Nanasi
> Assignee: Kitti Nanasi
> Priority: Trivial
> Attachments: HDFS-14404.001.patch
>
> When the KMS is deployed as an HA service and a failure occurs, the current
> error severity in the client code appears to be WARN. This can result in
> excessive errors despite the fact that another instance may succeed.
> Maybe this log level can be adjusted in only the load balancing provider.
> {code}
> 19/02/27 05:10:10 WARN kms.LoadBalancingKMSClientProvider: KMS provider at [https://example.com:16000/kms/v1/] threw an IOException [java.net.ConnectException: Connection refused (Connection refused)]!!
> 19/02/12 20:50:09 WARN kms.LoadBalancingKMSClientProvider: KMS provider at [https://example.com:16000/kms/v1/] threw an IOException: java.io.IOException: java.lang.reflect.UndeclaredThrowableException
> {code}
[jira] [Created] (HDFS-14404) Reduce KMS error logging severity from WARN to INFO
Kitti Nanasi created HDFS-14404:
-----------------------------------

    Summary: Reduce KMS error logging severity from WARN to INFO
    Key: HDFS-14404
    URL: https://issues.apache.org/jira/browse/HDFS-14404
    Project: Hadoop HDFS
    Issue Type: Improvement
    Components: kms
    Affects Versions: 3.2.0
    Reporter: Kitti Nanasi
    Assignee: Kitti Nanasi

When the KMS is deployed as an HA service and a failure occurs, the current error severity in the client code appears to be WARN. This can result in excessive errors despite the fact that another instance may succeed.

Maybe this log level can be adjusted in only the load balancing provider.

{code}
19/02/27 05:10:10 WARN kms.LoadBalancingKMSClientProvider: KMS provider at [https://example.com:16000/kms/v1/] threw an IOException [java.net.ConnectException: Connection refused (Connection refused)]!!
19/02/12 20:50:09 WARN kms.LoadBalancingKMSClientProvider: KMS provider at [https://example.com:16000/kms/v1/] threw an IOException: java.io.IOException: java.lang.reflect.UndeclaredThrowableException
{code}
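The intended severity split can be sketched as follows: individual provider failures are expected during failover (another instance may still succeed), so they log at INFO, and only the case where every provider has failed warrants WARN. This is a hypothetical shape using java.util.logging, not the actual LoadBalancingKMSClientProvider code:

```java
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

public class KmsLogLevelSketch {
    private static final Logger LOG =
            Logger.getLogger("LoadBalancingKMSClientProvider");

    /**
     * Tries each provider in turn (here modeled as a list of success flags).
     * Per-provider failures log at INFO; WARN is reserved for the case where
     * all providers failed. Returns the highest severity that was logged,
     * purely so the behavior is easy to check.
     */
    public static Level tryProviders(List<Boolean> providerSucceeds) {
        for (boolean ok : providerSucceeds) {
            if (ok) {
                return Level.FINE; // a provider succeeded; nothing noisy to log
            }
            LOG.log(Level.INFO,
                    "KMS provider threw an IOException, failing over to next");
        }
        LOG.log(Level.WARNING, "All KMS providers failed");
        return Level.WARNING;
    }

    public static void main(String[] args) {
        System.out.println(tryProviders(Arrays.asList(false, true)));
        System.out.println(tryProviders(Arrays.asList(false, false)));
    }
}
```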
[jira] [Updated] (HDDS-1153) Make tracing instrumentation configurable
[ https://issues.apache.org/jira/browse/HDDS-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kitti Nanasi updated HDDS-1153:
-------------------------------

    Attachment: HDDS-1153.001.patch

> Make tracing instrumentation configurable
> -----------------------------------------
>
> Key: HDDS-1153
> URL: https://issues.apache.org/jira/browse/HDDS-1153
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Reporter: Elek, Marton
> Assignee: Kitti Nanasi
> Priority: Major
> Labels: newbie, pull-request-available
> Attachments: HDDS-1153.001.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> TracingUtil.createProxy is a helper method to create a proxy instance with
> tracing support. The proxy instance implements the same interface as the
> original class and delegates all the method calls to the original instance,
> but it also sends tracing information to the tracing server.
> By default it's not a big overhead, as the tracing libraries can be
> configured to send traces only with some low probability. But to make it
> safer, we can make it optional: with a global 'hdds.tracing.enabled'
> configuration variable (true by default), we can adjust the behavior of
> TracingUtil.createProxy.
> If tracing is disabled, TracingUtil.createProxy should return the 'delegate'
> parameter instead of a proxy.
[jira] [Updated] (HDDS-1153) Make tracing instrumentation configurable
[ https://issues.apache.org/jira/browse/HDDS-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kitti Nanasi updated HDDS-1153:
-------------------------------

    Status: Patch Available (was: Open)

> Make tracing instrumentation configurable
> -----------------------------------------
>
> Key: HDDS-1153
> URL: https://issues.apache.org/jira/browse/HDDS-1153
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Reporter: Elek, Marton
> Assignee: Kitti Nanasi
> Priority: Major
> Labels: newbie, pull-request-available
> Attachments: HDDS-1153.001.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> TracingUtil.createProxy is a helper method to create a proxy instance with
> tracing support. The proxy instance implements the same interface as the
> original class and delegates all the method calls to the original instance,
> but it also sends tracing information to the tracing server.
> By default it's not a big overhead, as the tracing libraries can be
> configured to send traces only with some low probability. But to make it
> safer, we can make it optional: with a global 'hdds.tracing.enabled'
> configuration variable (true by default), we can adjust the behavior of
> TracingUtil.createProxy.
> If tracing is disabled, TracingUtil.createProxy should return the 'delegate'
> parameter instead of a proxy.
[jira] [Commented] (HDDS-1153) Make tracing instrumentation configurable
[ https://issues.apache.org/jira/browse/HDDS-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806684#comment-16806684 ] Kitti Nanasi commented on HDDS-1153: I think that's a good idea, [~jojochuang], I created HDDS-1364 for it.
[jira] [Created] (HDDS-1364) Make OpenTracing implementation configurable
Kitti Nanasi created HDDS-1364: -- Summary: Make OpenTracing implementation configurable Key: HDDS-1364 URL: https://issues.apache.org/jira/browse/HDDS-1364 Project: Hadoop Distributed Data Store Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Kitti Nanasi This issue comes from [HDDS-1153|https://issues.apache.org/jira/browse/HDDS-1153]. Currently the Jaeger OpenTracing implementation is used. It would be nice if we could configure the OpenTracing implementation.
[jira] [Assigned] (HDDS-1192) Support -conf command line argument in GenericCli
[ https://issues.apache.org/jira/browse/HDDS-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi reassigned HDDS-1192: -- Assignee: Kitti Nanasi > Support -conf command line argument in GenericCli > - > > Key: HDDS-1192 > URL: https://issues.apache.org/jira/browse/HDDS-1192 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Elek, Marton >Assignee: Kitti Nanasi >Priority: Major > Labels: newbie > > org.apache.hadoop.hdds.GenericCli is the base class for all the Ozone > related command line applications. It supports defining custom configuration > variables (-D or --set) but doesn't support the '--conf ozone-site.xml' > argument to load an external xml file into the configuration. > The Configuration and OzoneConfiguration classes load the ozone-site.xml from the > classpath. But that makes it very hard to start Ozone components in an IDE, as we > can't easily modify the classpath. > One option here is to support --conf everywhere to make it possible to > start an ozone cluster in the IDE. > Note: It's a nice-to-have for 0.4.0. I marked it as 0.5.0 but it's safe to commit > to 0.4.0 at any time
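A minimal sketch of the '--conf' handling described above, assuming plain argument scanning. The real GenericCli is built on picocli, and the collected paths would be fed into the configuration (e.g. via addResource); ConfArgSketch and extractConfFiles are made-up names for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ConfArgSketch {

    // Collects the files named by repeated '--conf' options; in a real CLI
    // these paths would be loaded into the configuration object before the
    // command runs, instead of relying on the classpath.
    public static List<String> extractConfFiles(String[] args) {
        List<String> files = new ArrayList<>();
        for (int i = 0; i < args.length - 1; i++) {
            if ("--conf".equals(args[i])) {
                files.add(args[i + 1]);
            }
        }
        return files;
    }

    public static void main(String[] args) {
        String[] cli = {"--conf", "ozone-site.xml", "-D", "k=v"};
        System.out.println(extractConfFiles(cli));
    }
}
```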
[jira] [Commented] (HDFS-14308) DFSStripedInputStream should implement unbuffer()
[ https://issues.apache.org/jira/browse/HDFS-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777837#comment-16777837 ] Kitti Nanasi commented on HDFS-14308: - Thanks [~joemcdonnell] for reporting the issue! DFSStripedInputStream does implement unbuffer(), as it derives from DFSInputStream, whose unbuffer() calls closeCurrentBlockReaders(); that method clears all the block readers in DFSStripedInputStream, so that part should be fine. I think the issue might be that either CryptoInputStream or HdfsDataInputStream does not call the correct unbuffer method, but I have to check that to be sure. > DFSStripedInputStream should implement unbuffer() > - > > Key: HDFS-14308 > URL: https://issues.apache.org/jira/browse/HDFS-14308 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Joe McDonnell >Priority: Major > Attachments: ec_heap_dump.png > > > Some users of HDFS cache opened HDFS file handles to avoid repeated > roundtrips to the NameNode. For example, Impala caches up to 20,000 HDFS file > handles by default. Recent tests on erasure coded files show that the open > file handles can consume a large amount of memory when not in use. > For example, here is output from Impala's JMX endpoint when 608 file handles > are cached > {noformat} > { > "name": "java.nio:type=BufferPool,name=direct", > "modelerType": "sun.management.ManagementFactoryHelper$1", > "Name": "direct", > "TotalCapacity": 1921048960, > "MemoryUsed": 1921048961, > "Count": 633, > "ObjectName": "java.nio:type=BufferPool,name=direct" > },{noformat} > This shows direct buffer memory usage of 3MB per DFSStripedInputStream. > Attached is output from Eclipse MAT showing that the direct buffers come from > DFSStripedInputStream objects. > To support caching file handles on erasure coded files, DFSStripedInputStream > should implement the unbuffer() call. See HDFS-7694. "unbuffer()" is intended > to move an input stream to a lower memory state to support these caching use > cases. Both Impala and HBase call unbuffer() when a file handle is being > cached and potentially unused for significant chunks of time.
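The delegation concern in the comment above can be illustrated with a toy version of the CanUnbuffer contract: a wrapping stream must forward unbuffer() to what it wraps, or the inner stream's buffers are never released. All class names below are stand-ins, not the actual Hadoop stream classes.

```java
public class UnbufferSketch {

    // Minimal stand-in for Hadoop's CanUnbuffer interface.
    public interface CanUnbuffer { void unbuffer(); }

    // Stand-in for DFSStripedInputStream: releases its buffers on unbuffer().
    public static class StripedStream implements CanUnbuffer {
        public boolean buffersReleased = false;
        public void unbuffer() { buffersReleased = true; }
    }

    // Stand-in for a wrapping stream (e.g. CryptoInputStream); the suspected
    // bug would be a wrapper that fails to forward unbuffer() to the wrapped
    // stream. This version delegates correctly.
    public static class WrappingStream implements CanUnbuffer {
        private final CanUnbuffer wrapped;
        public WrappingStream(CanUnbuffer wrapped) { this.wrapped = wrapped; }
        public void unbuffer() { wrapped.unbuffer(); }
    }

    public static void main(String[] args) {
        StripedStream inner = new StripedStream();
        new WrappingStream(inner).unbuffer();
        // The inner stream's buffers are released only if the wrapper delegated.
        System.out.println(inner.buffersReleased);
    }
}
```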
[jira] [Commented] (HDFS-14298) Improve log messages of ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774294#comment-16774294 ] Kitti Nanasi commented on HDFS-14298: - Thanks [~surendrasingh] for the comment! I refactored the test in patch v004 not to use hardcoded policies. > Improve log messages of ECTopologyVerifier > -- > > Key: HDFS-14298 > URL: https://issues.apache.org/jira/browse/HDFS-14298 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Minor > Attachments: HDFS-14298.001.patch, HDFS-14298.002.patch, > HDFS-14298.003.patch, HDFS-14298.004.patch > >
[jira] [Updated] (HDFS-14298) Improve log messages of ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14298: Attachment: HDFS-14298.004.patch
[jira] [Commented] (HDFS-14298) Improve log messages of ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773156#comment-16773156 ] Kitti Nanasi commented on HDFS-14298: - Added patch v003 to fix the tests in TestECAdmin introduced by the latest trunk.
[jira] [Updated] (HDFS-14298) Improve log messages of ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14298: Attachment: HDFS-14298.003.patch
[jira] [Commented] (HDFS-14298) Improve log messages of ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16772985#comment-16772985 ] Kitti Nanasi commented on HDFS-14298: - Thanks for the review [~shwetayakkali]! TestNameNodeMXBean failed because of patch v001, so I fixed that in patch v002. The other test failures are not related.
[jira] [Updated] (HDFS-14298) Improve log messages of ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14298: Attachment: HDFS-14298.002.patch
[jira] [Updated] (HDFS-14298) Improve log messages of ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14298: Status: Patch Available (was: Open)
[jira] [Created] (HDFS-14298) Improve log messages of ECTopologyVerifier
Kitti Nanasi created HDFS-14298: --- Summary: Improve log messages of ECTopologyVerifier Key: HDFS-14298 URL: https://issues.apache.org/jira/browse/HDFS-14298 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kitti Nanasi Assignee: Kitti Nanasi
[jira] [Updated] (HDFS-14298) Improve log messages of ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14298: Attachment: HDFS-14298.001.patch
[jira] [Commented] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter
[ https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16771214#comment-16771214 ] Kitti Nanasi commented on HDFS-14188: - Thanks for the comment [~jojochuang]! Yes, that was a typo; I corrected it in patch v005. You are right, the usage in case of multiple policies was very confusing, so I corrected the implementation and you don't need quotes anymore. It should work like this in the new patch: {code:java} -verifyClusterSetup -policy RS-3-2-1024k RS-10-4-1024k {code} > Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a > parameter > --- > > Key: HDFS-14188 > URL: https://issues.apache.org/jira/browse/HDFS-14188 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14188.001.patch, HDFS-14188.002.patch, > HDFS-14188.003.patch, HDFS-14188.004.patch, HDFS-14188.005.patch > > > The hdfs ec -verifyClusterSetup command verifies whether there are enough data nodes > and racks for the enabled erasure coding policies. > I think it would be beneficial if it could optionally accept an erasure coding policy as > a parameter. For example, the following command would run the verification > only for the RS-6-3-1024k policy. > {code:java} > hdfs ec -verifyClusterSetup -policy RS-6-3-1024k > {code}
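Accepting several policies after -policy without quoting can be sketched as below: collect tokens until the next '-'-prefixed flag. PolicyArgsSketch and parsePolicies are illustrative names, not the actual ECAdmin parsing code.

```java
import java.util.ArrayList;
import java.util.List;

public class PolicyArgsSketch {

    // Gathers every token after '-policy' up to the next '-'-prefixed flag,
    // so multiple policy names can be listed without quotes. Policy names
    // like RS-3-2-1024k contain '-' but never start with it, so they are
    // not mistaken for flags.
    public static List<String> parsePolicies(String[] args) {
        List<String> policies = new ArrayList<>();
        boolean collecting = false;
        for (String a : args) {
            if ("-policy".equals(a)) { collecting = true; continue; }
            if (a.startsWith("-")) { collecting = false; continue; }
            if (collecting) policies.add(a);
        }
        return policies;
    }

    public static void main(String[] args) {
        String[] cli = {"-verifyClusterSetup", "-policy", "RS-3-2-1024k", "RS-10-4-1024k"};
        System.out.println(parsePolicies(cli));
    }
}
```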
[jira] [Updated] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter
[ https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14188: Attachment: HDFS-14188.005.patch
[jira] [Updated] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter
[ https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14188: Attachment: (was: HDFS-14188.005.patch)
[jira] [Updated] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter
[ https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14188: Attachment: HDFS-14188.005.patch
[jira] [Commented] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter
[ https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16768128#comment-16768128 ] Kitti Nanasi commented on HDFS-14188: - Thanks for the comment [~shwetayakkali]! I fixed the checkstyle issues in patch v004 and merged the tests into one.
[jira] [Updated] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter
[ https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14188: Attachment: HDFS-14188.004.patch
[jira] [Commented] (HDFS-14125) Use parameterized log format in ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767354#comment-16767354 ] Kitti Nanasi commented on HDFS-14125: - Thanks [~shwetayakkali] for the review and [~jojochuang] for reviewing and committing! > Use parameterized log format in ECTopologyVerifier > -- > > Key: HDFS-14125 > URL: https://issues.apache.org/jira/browse/HDFS-14125 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Affects Versions: 3.3.0 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Trivial > Fix For: 3.3.0 > > Attachments: HDFS-14125.001.patch > > > ECTopologyVerifier introduced in > [HDFS-12946|https://issues.apache.org/jira/browse/HDFS-12946] should use a > parameterized log format.
[jira] [Commented] (HDFS-14231) DataXceiver#run() should not log exceptions caused by InvalidToken exception as an error
[ https://issues.apache.org/jira/browse/HDFS-14231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767338#comment-16767338 ] Kitti Nanasi commented on HDFS-14231: - Thanks [~jojochuang] for reviewing and committing! > DataXceiver#run() should not log exceptions caused by InvalidToken exception > as an error > > > Key: HDFS-14231 > URL: https://issues.apache.org/jira/browse/HDFS-14231 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14231.001.patch > > > HDFS-10760 changed the log level from error to trace in DataXceiver#run() if > the exception was an InvalidToken exception. I think it would be beneficial > to also log on trace level if the exception's cause was InvalidToken. Like in > the following case: > {code:java} > DataXceiver error processing unknown operation > src: xxx dst: xxx > javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password > [Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: Block > token with block_token_identifier > (expiryDate=1547593336220, keyId=-1735471718, userId=hbase, > blockPoolId=BP-xxx, blockId=1245599303, access modes=[READ]) is expired.] > {code}
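The proposed check — treating an exception as token-related when InvalidToken appears anywhere in its cause chain (e.g. wrapped inside a SaslException) — can be sketched like this. The nested InvalidToken class is a stand-in for SecretManager.InvalidToken, and causedByInvalidToken is a hypothetical helper, not the actual DataXceiver code.

```java
public class InvalidTokenCauseSketch {

    // Stand-in for org.apache.hadoop.security.token.SecretManager.InvalidToken.
    public static class InvalidToken extends Exception {
        public InvalidToken(String msg) { super(msg); }
    }

    // Walks the cause chain so wrapped InvalidToken exceptions are also
    // recognized and could be logged at TRACE instead of ERROR.
    public static boolean causedByInvalidToken(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof InvalidToken) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Exception wrapped = new Exception(
            "DIGEST-MD5: IO error acquiring password",
            new InvalidToken("Block token is expired"));
        System.out.println(causedByInvalidToken(wrapped));
        System.out.println(causedByInvalidToken(new Exception("unrelated")));
    }
}
```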
[jira] [Commented] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter
[ https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767210#comment-16767210 ] Kitti Nanasi commented on HDFS-14188: - Thanks for the review [~ayushtkn]! * It's a good idea to save effort by running the verification for multiple policies at the same time; I implemented it in patch v003. * About the test: although it would save some execution time to merge the tests together, I think it would be less readable and I would have to clear the system out inside the method multiple times instead of leaving it for the tearDown method, which would be a bit confusing. But I am not against merging them if you want to save on test execution time, so let me know your opinion on that. * Thanks for reminding me about the documentation, I extended it.
[jira] [Updated] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter
[ https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14188: Attachment: HDFS-14188.003.patch
[jira] [Commented] (HDFS-14273) Fix checkstyle issues in BlockLocation's method javadoc
[ https://issues.apache.org/jira/browse/HDFS-14273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767064#comment-16767064 ] Kitti Nanasi commented on HDFS-14273: - Thanks for the patch [~shwetayakkali]! +1 (non-binding) > Fix checkstyle issues in BlockLocation's method javadoc > --- > > Key: HDFS-14273 > URL: https://issues.apache.org/jira/browse/HDFS-14273 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Shweta >Assignee: Shweta >Priority: Trivial > Attachments: HDFS-14273.001.patch > > > BlockLocation.java has checkstyle issues in most of the methods' javadoc, and > an indentation error.
[jira] [Commented] (HDFS-14187) Make warning message more clear when there are not enough data nodes for EC write
[ https://issues.apache.org/jira/browse/HDFS-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16757827#comment-16757827 ] Kitti Nanasi commented on HDFS-14187: - Thanks for reviewing and committing it [~jojochuang]! > Make warning message more clear when there are not enough data nodes for EC > write > - > > Key: HDFS-14187 > URL: https://issues.apache.org/jira/browse/HDFS-14187 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14187.001.patch > > > When setting an erasure coding policy for which there are not enough racks or > data nodes, a write will fail with the following message: > {code:java} > [root@oks-upgrade6727-1 ~]# sudo -u systest hdfs dfs -mkdir > /user/systest/testdir > [root@oks-upgrade6727-1 ~]# sudo -u hdfs hdfs ec -setPolicy -path > /user/systest/testdir > Set default erasure coding policy on /user/systest/testdir > [root@oks-upgrade6727-1 ~]# sudo -u systest hdfs dfs -put /tmp/file1 > /user/systest/testdir > 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Cannot allocate parity > block(index=3, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[] > 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Cannot allocate parity > block(index=4, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[] > 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Block group <1> failed to write > 2 blocks. It's at high risk of losing data. > {code} > I suggest logging a more descriptive message suggesting to use the hdfs ec > -verifyClusterSetup command to verify the cluster setup against the ec policies.
[jira] [Commented] (HDFS-14125) Use parameterized log format in ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755384#comment-16755384 ] Kitti Nanasi commented on HDFS-14125: - Thanks for the comment [~shwetayakkali]! I used String.format, which accepts %s and %d, instead of the logger's standard formatter. I did that because I pass the same String in the result as well, so I can't use just the logger's formatter.
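The point about String.format vs. the logger's parameterized format can be shown in miniature: building the message once with String.format lets it be both logged and returned, which a logger-side "{}" template cannot do. buildMessage and the message text below are invented for illustration, not the actual ECTopologyVerifier strings.

```java
public class FormatSketch {

    // Builds the message once with String.format so the same String can be
    // both logged and placed into the verification result; a parameterized
    // logger call like LOG.warn("cluster has {} racks", n) would format only
    // inside the logger and leave nothing to return.
    public static String buildMessage(int racks, int needed) {
        return String.format(
            "The cluster has %d racks, but the policy needs %d.", racks, needed);
    }

    public static void main(String[] args) {
        String msg = buildMessage(2, 3);
        // In ECTopologyVerifier the same msg would go both to the logger
        // and into the returned result object.
        System.out.println(msg);
    }
}
```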
[jira] [Updated] (HDFS-14231) DataXceiver#run() should not log exceptions caused by InvalidToken exception as an error
[ https://issues.apache.org/jira/browse/HDFS-14231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14231: Attachment: HDFS-14231.001.patch
[jira] [Updated] (HDFS-14231) DataXceiver#run() should not log exceptions caused by InvalidToken exception as an error
[ https://issues.apache.org/jira/browse/HDFS-14231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14231: Status: Patch Available (was: Open) > DataXceiver#run() should not log exceptions caused by InvalidToken exception > as an error > > > Key: HDFS-14231 > URL: https://issues.apache.org/jira/browse/HDFS-14231 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14231.001.patch > > > HDFS-10760 changed the log level from error to trace in DataXceiver#run() if > the exception was an InvalidToken exception. I think it would be beneficial > to log on trace level if the exception's cause was InvalidToken. Like in > the following case: > {code:java} > DataXceiver error processing unknown operation > src: xxx dst: xxx > javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password > [Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: Block > token with block_token_identifier > (expiryDate=1547593336220, keyId=-1735471718, userId=hbase, > blockPoolId=BP-xxx, blockId=1245599303, access modes=[READ]) is expired.] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-14231) DataXceiver#run() should not log exceptions caused by InvalidToken exception as an error
Kitti Nanasi created HDFS-14231: --- Summary: DataXceiver#run() should not log exceptions caused by InvalidToken exception as an error Key: HDFS-14231 URL: https://issues.apache.org/jira/browse/HDFS-14231 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs Affects Versions: 3.1.1 Reporter: Kitti Nanasi Assignee: Kitti Nanasi HDFS-10760 changed the log level from error to trace in DataXceiver#run() if the exception was an InvalidToken exception. I think it would be beneficial to log on trace level if the exception's cause was InvalidToken. Like in the following case: {code:java} DataXceiver error processing unknown operation src: xxx dst: xxx javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password [Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: Block token with block_token_identifier (expiryDate=1547593336220, keyId=-1735471718, userId=hbase, blockPoolId=BP-xxx, blockId=1245599303, access modes=[READ]) is expired.] {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
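The proposal above can be sketched as follows. This is a simplified stand-in, not the actual DataXceiver code, and the InvalidToken class here is a hypothetical placeholder for org.apache.hadoop.security.token.SecretManager$InvalidToken:

```java
public class TokenLogLevelSketch {
    // Hypothetical stand-in for SecretManager$InvalidToken (to keep the
    // sketch free of Hadoop dependencies).
    public static class InvalidToken extends Exception {
        public InvalidToken(String msg) { super(msg); }
    }

    // Walk the cause chain: HDFS-10760 already downgrades the log level when
    // the thrown exception itself is InvalidToken; HDFS-14231 proposes doing
    // the same when InvalidToken appears anywhere in the cause chain, e.g.
    // wrapped inside a SaslException.
    public static boolean causedByInvalidToken(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof InvalidToken) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Exception sasl = new Exception(
            "DIGEST-MD5: IO error acquiring password",
            new InvalidToken("Block token ... is expired."));
        // Per the proposal: TRACE instead of ERROR for expired block tokens.
        System.out.println(causedByInvalidToken(sasl) ? "TRACE" : "ERROR");
    }
}
```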
[jira] [Updated] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter
[ https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14188: Attachment: HDFS-14188.002.patch > Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a > parameter > --- > > Key: HDFS-14188 > URL: https://issues.apache.org/jira/browse/HDFS-14188 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14188.001.patch, HDFS-14188.002.patch > > > The hdfs ec -verifyClusterSetup command verifies if there are enough data nodes > and racks for the enabled erasure coding policies. > I think it would be beneficial if it could optionally accept an erasure coding policy as > a parameter. For example, the following command would run the > verification only for the RS-6-3-1024k policy. > {code:java} > hdfs ec -verifyClusterSetup -policy RS-6-3-1024k > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14125) Use parameterized log format in ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14125: Attachment: HDFS-14125.001.patch > Use parameterized log format in ECTopologyVerifier > -- > > Key: HDFS-14125 > URL: https://issues.apache.org/jira/browse/HDFS-14125 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Affects Versions: 3.3.0 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Trivial > Attachments: HDFS-14125.001.patch > > > ECTopologyVerifier introduced in > [HDFS-12946|https://issues.apache.org/jira/browse/HDFS-12946] should use a > parameterized log format. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14125) Use parameterized log format in ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14125: Status: Patch Available (was: Open) > Use parameterized log format in ECTopologyVerifier > -- > > Key: HDFS-14125 > URL: https://issues.apache.org/jira/browse/HDFS-14125 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Affects Versions: 3.3.0 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Trivial > Attachments: HDFS-14125.001.patch > > > ECTopologyVerifier introduced in > [HDFS-12946|https://issues.apache.org/jira/browse/HDFS-12946] should use a > parameterized log format. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter
[ https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14188: Status: Patch Available (was: Open) > Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a > parameter > --- > > Key: HDFS-14188 > URL: https://issues.apache.org/jira/browse/HDFS-14188 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14188.001.patch > > > The hdfs ec -verifyClusterSetup command verifies if there are enough data nodes > and racks for the enabled erasure coding policies. > I think it would be beneficial if it could optionally accept an erasure coding policy as > a parameter. For example, the following command would run the > verification only for the RS-6-3-1024k policy. > {code:java} > hdfs ec -verifyClusterSetup -policy RS-6-3-1024k > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter
[ https://issues.apache.org/jira/browse/HDFS-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14188: Attachment: HDFS-14188.001.patch > Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a > parameter > --- > > Key: HDFS-14188 > URL: https://issues.apache.org/jira/browse/HDFS-14188 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14188.001.patch > > > The hdfs ec -verifyClusterSetup command verifies if there are enough data nodes > and racks for the enabled erasure coding policies. > I think it would be beneficial if it could optionally accept an erasure coding policy as > a parameter. For example, the following command would run the > verification only for the RS-6-3-1024k policy. > {code:java} > hdfs ec -verifyClusterSetup -policy RS-6-3-1024k > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14187) Make warning message more clear when there are not enough data nodes for EC write
[ https://issues.apache.org/jira/browse/HDFS-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14187: Status: Patch Available (was: Open) > Make warning message more clear when there are not enough data nodes for EC > write > - > > Key: HDFS-14187 > URL: https://issues.apache.org/jira/browse/HDFS-14187 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14187.001.patch > > > When setting an erasure coding policy for which there are not enough racks or > data nodes, write will fail with the following message: > {code:java} > [root@oks-upgrade6727-1 ~]# sudo -u systest hdfs dfs -mkdir > /user/systest/testdir > [root@oks-upgrade6727-1 ~]# sudo -u hdfs hdfs ec -setPolicy -path > /user/systest/testdir > Set default erasure coding policy on /user/systest/testdir > [root@oks-upgrade6727-1 ~]# sudo -u systest hdfs dfs -put /tmp/file1 > /user/systest/testdir > 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Cannot allocate parity > block(index=3, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[] > 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Cannot allocate parity > block(index=4, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[] > 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Block group <1> failed to write > 2 blocks. It's at high risk of losing data. > {code} > I suggest logging a more descriptive message suggesting to use the hdfs ec > -verifyClusterSetup command to verify the cluster setup against the EC policies. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14187) Make warning message more clear when there are not enough data nodes for EC write
[ https://issues.apache.org/jira/browse/HDFS-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14187: Attachment: HDFS-14187.001.patch > Make warning message more clear when there are not enough data nodes for EC > write > - > > Key: HDFS-14187 > URL: https://issues.apache.org/jira/browse/HDFS-14187 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14187.001.patch > > > When setting an erasure coding policy for which there are not enough racks or > data nodes, write will fail with the following message: > {code:java} > [root@oks-upgrade6727-1 ~]# sudo -u systest hdfs dfs -mkdir > /user/systest/testdir > [root@oks-upgrade6727-1 ~]# sudo -u hdfs hdfs ec -setPolicy -path > /user/systest/testdir > Set default erasure coding policy on /user/systest/testdir > [root@oks-upgrade6727-1 ~]# sudo -u systest hdfs dfs -put /tmp/file1 > /user/systest/testdir > 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Cannot allocate parity > block(index=3, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[] > 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Cannot allocate parity > block(index=4, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[] > 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Block group <1> failed to write > 2 blocks. It's at high risk of losing data. > {code} > I suggest logging a more descriptive message suggesting to use the hdfs ec > -verifyClusterSetup command to verify the cluster setup against the EC policies. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it
[ https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16741918#comment-16741918 ] Kitti Nanasi commented on HDFS-14061: - Thanks for the review [~jojochuang]! I will clean up the log messages in HDFS-14125, so for now I will leave them as is. I addressed your comments in the newest patch. About using all DataNodes in the verification: in HDFS-12946 we did not consider using only a subset of all DataNodes; that would make the verification a bit stricter, and I am fine with that as well, though it would not make much difference in my opinion. Let me know if you have any suggestions on that. > Check if the cluster topology supports the EC policy before setting, enabling > or adding it > -- > > Key: HDFS-14061 > URL: https://issues.apache.org/jira/browse/HDFS-14061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, hdfs >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14061.001.patch, HDFS-14061.002.patch, > HDFS-14061.003.patch, HDFS-14061.004.patch, HDFS-14061.005.patch > > > HDFS-12946 introduces a command for verifying if there are enough racks and > datanodes for the enabled erasure coding policies. > This verification could be executed for the erasure coding policy before > enabling, setting or adding it and a warning message could be written if the > verify fails, or the policy setting could be even failed in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it
[ https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14061: Attachment: HDFS-14061.005.patch > Check if the cluster topology supports the EC policy before setting, enabling > or adding it > -- > > Key: HDFS-14061 > URL: https://issues.apache.org/jira/browse/HDFS-14061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, hdfs >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14061.001.patch, HDFS-14061.002.patch, > HDFS-14061.003.patch, HDFS-14061.004.patch, HDFS-14061.005.patch > > > HDFS-12946 introduces a command for verifying if there are enough racks and > datanodes for the enabled erasure coding policies. > This verification could be executed for the erasure coding policy before > enabling, setting or adding it and a warning message could be written if the > verify fails, or the policy setting could be even failed in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client
[ https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740232#comment-16740232 ] Kitti Nanasi commented on HDFS-14134: - Thanks [~lukmajercak] for the work here! {quote}Also note that previously, if a hedging request got FAILOVER_RETRY and some request got SocketExc on nonidempotent operation (e.g. FAIL), the client would still pick FAILOVER_RETRY over FAIL, so i think we are fixing an issue here as well.{quote} Sounds good that you found and fixed this issue as well. +1 (non-binding). New patch looks good to me! > Idempotent operations throwing RemoteException should not be retried by the > client > -- > > Key: HDFS-14134 > URL: https://issues.apache.org/jira/browse/HDFS-14134 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, hdfs-client, ipc >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Critical > Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, > HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, > HDFS-14134.006.patch, HDFS-14134.007.patch, > HDFS-14134_retrypolicy_change_proposal.pdf, > HDFS-14134_retrypolicy_change_proposal_1.pdf > > > Currently, some operations that throw IOException on the NameNode are > evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail > fast. > For example, when calling getXAttr("user.some_attr", file) where the file > does not have the attribute, NN throws an IOException with message "could not > find attr". The current client retry policy determines the action for that to > be FAILOVER_AND_RETRY. The client then fails over and retries until it > reaches the maximum number of retries. Supposedly, the client should be able > to tell that this exception is normal and fail fast. > Moreover, even if the action was FAIL, the RetryInvocationHandler looks at > all the retry actions from all requests, and FAILOVER_AND_RETRY takes > precedence over FAIL action.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
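The precedence behavior described in HDFS-14134 above can be sketched as a toy model. This is a simplified illustration, not the actual RetryInvocationHandler or RetryPolicy code; the enum and helper names are hypothetical:

```java
import java.util.Comparator;
import java.util.List;

public class RetryPrecedenceSketch {
    // Hypothetical, simplified retry actions; ordinal order encodes the
    // precedence described in the issue, so FAILOVER_AND_RETRY outranks FAIL.
    public enum Action { FAIL, RETRY, FAILOVER_AND_RETRY }

    // The handler considers the actions from all (possibly hedged) requests
    // and picks the highest-precedence one, which is why a single
    // FAILOVER_AND_RETRY keeps the client retrying even when another
    // request decided the operation should fail fast.
    public static Action combine(List<Action> perRequestActions) {
        return perRequestActions.stream()
            .max(Comparator.naturalOrder())
            .orElse(Action.FAIL);
    }

    public static void main(String[] args) {
        // One hedged request wants failover while another failed hard:
        System.out.println(
            combine(List.of(Action.FAIL, Action.FAILOVER_AND_RETRY)));
    }
}
```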
[jira] [Commented] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it
[ https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738031#comment-16738031 ] Kitti Nanasi commented on HDFS-14061: - Thanks [~shwetayakkali] for the comment! I added messages for asserts in patch 004. > Check if the cluster topology supports the EC policy before setting, enabling > or adding it > -- > > Key: HDFS-14061 > URL: https://issues.apache.org/jira/browse/HDFS-14061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, hdfs >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14061.001.patch, HDFS-14061.002.patch, > HDFS-14061.003.patch, HDFS-14061.004.patch > > > HDFS-12946 introduces a command for verifying if there are enough racks and > datanodes for the enabled erasure coding policies. > This verification could be executed for the erasure coding policy before > enabling, setting or adding it and a warning message could be written if the > verify fails, or the policy setting could be even failed in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it
[ https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14061: Attachment: HDFS-14061.004.patch > Check if the cluster topology supports the EC policy before setting, enabling > or adding it > -- > > Key: HDFS-14061 > URL: https://issues.apache.org/jira/browse/HDFS-14061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, hdfs >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14061.001.patch, HDFS-14061.002.patch, > HDFS-14061.003.patch, HDFS-14061.004.patch > > > HDFS-12946 introduces a command for verifying if there are enough racks and > datanodes for the enabled erasure coding policies. > This verification could be executed for the erasure coding policy before > enabling, setting or adding it and a warning message could be written if the > verify fails, or the policy setting could be even failed in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it
[ https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737125#comment-16737125 ] Kitti Nanasi commented on HDFS-14061: - Thanks for the comment [~adam.antal]! I fixed the renames and added a new test. The new message is already tested in TestECAdmin, so I don't think it would add more value to also test it in TestErasureCodingCLI, and the number of racks and data nodes is more difficult to configure there. You are right, System.err in TestECAdmin.java was only used by patch v001, but I think it is worth keeping that check, because ECAdmin can write to System.err. > Check if the cluster topology supports the EC policy before setting, enabling > or adding it > -- > > Key: HDFS-14061 > URL: https://issues.apache.org/jira/browse/HDFS-14061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, hdfs >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14061.001.patch, HDFS-14061.002.patch, > HDFS-14061.003.patch > > > HDFS-12946 introduces a command for verifying if there are enough racks and > datanodes for the enabled erasure coding policies. > This verification could be executed for the erasure coding policy before > enabling, setting or adding it and a warning message could be written if the > verify fails, or the policy setting could be even failed in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it
[ https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14061: Attachment: HDFS-14061.003.patch > Check if the cluster topology supports the EC policy before setting, enabling > or adding it > -- > > Key: HDFS-14061 > URL: https://issues.apache.org/jira/browse/HDFS-14061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, hdfs >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14061.001.patch, HDFS-14061.002.patch, > HDFS-14061.003.patch > > > HDFS-12946 introduces a command for verifying if there are enough racks and > datanodes for the enabled erasure coding policies. > This verification could be executed for the erasure coding policy before > enabling, setting or adding it and a warning message could be written if the > verify fails, or the policy setting could be even failed in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-14188) Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter
Kitti Nanasi created HDFS-14188: --- Summary: Make hdfs ec -verifyClusterSetup command accept an erasure coding policy as a parameter Key: HDFS-14188 URL: https://issues.apache.org/jira/browse/HDFS-14188 Project: Hadoop HDFS Issue Type: Improvement Components: erasure-coding Affects Versions: 3.1.1 Reporter: Kitti Nanasi Assignee: Kitti Nanasi The hdfs ec -verifyClusterSetup command verifies if there are enough data nodes and racks for the enabled erasure coding policies. I think it would be beneficial if it could optionally accept an erasure coding policy as a parameter. For example, the following command would run the verification only for the RS-6-3-1024k policy. {code:java} hdfs ec -verifyClusterSetup -policy RS-6-3-1024k {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it
[ https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735969#comment-16735969 ] Kitti Nanasi commented on HDFS-14061: - Thanks for the comment [~ayushtkn]! You are right, failing the policy setting might be too harsh. I wanted to do that because, when setting the policy and then writing to the folder, the error message we get is quite misleading. But I think if that message is corrected, it will be fine, so I raised [HDFS-14187|https://issues.apache.org/jira/browse/HDFS-14187] for that. I also addressed your comments in patch 002. > Check if the cluster topology supports the EC policy before setting, enabling > or adding it > -- > > Key: HDFS-14061 > URL: https://issues.apache.org/jira/browse/HDFS-14061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, hdfs >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14061.001.patch, HDFS-14061.002.patch > > > HDFS-12946 introduces a command for verifying if there are enough racks and > datanodes for the enabled erasure coding policies. > This verification could be executed for the erasure coding policy before > enabling, setting or adding it and a warning message could be written if the > verify fails, or the policy setting could be even failed in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it
[ https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14061: Status: Patch Available (was: Open) > Check if the cluster topology supports the EC policy before setting, enabling > or adding it > -- > > Key: HDFS-14061 > URL: https://issues.apache.org/jira/browse/HDFS-14061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, hdfs >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14061.001.patch, HDFS-14061.002.patch > > > HDFS-12946 introduces a command for verifying if there are enough racks and > datanodes for the enabled erasure coding policies. > This verification could be executed for the erasure coding policy before > enabling, setting or adding it and a warning message could be written if the > verify fails, or the policy setting could be even failed in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it
[ https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14061: Attachment: HDFS-14061.002.patch > Check if the cluster topology supports the EC policy before setting, enabling > or adding it > -- > > Key: HDFS-14061 > URL: https://issues.apache.org/jira/browse/HDFS-14061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, hdfs >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14061.001.patch, HDFS-14061.002.patch > > > HDFS-12946 introduces a command for verifying if there are enough racks and > datanodes for the enabled erasure coding policies. > This verification could be executed for the erasure coding policy before > enabling, setting or adding it and a warning message could be written if the > verify fails, or the policy setting could be even failed in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-14187) Make warning message more clear when there are not enough data nodes for EC write
Kitti Nanasi created HDFS-14187: --- Summary: Make warning message more clear when there are not enough data nodes for EC write Key: HDFS-14187 URL: https://issues.apache.org/jira/browse/HDFS-14187 Project: Hadoop HDFS Issue Type: Improvement Components: erasure-coding Affects Versions: 3.1.1 Reporter: Kitti Nanasi Assignee: Kitti Nanasi When setting an erasure coding policy for which there are not enough racks or data nodes, write will fail with the following message: {code:java} [root@oks-upgrade6727-1 ~]# sudo -u systest hdfs dfs -mkdir /user/systest/testdir [root@oks-upgrade6727-1 ~]# sudo -u hdfs hdfs ec -setPolicy -path /user/systest/testdir Set default erasure coding policy on /user/systest/testdir [root@oks-upgrade6727-1 ~]# sudo -u systest hdfs dfs -put /tmp/file1 /user/systest/testdir 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Cannot allocate parity block(index=3, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[] 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Cannot allocate parity block(index=4, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[] 18/11/12 05:41:26 WARN hdfs.DFSOutputStream: Block group <1> failed to write 2 blocks. It's at high risk of losing data. {code} I suggest logging a more descriptive message suggesting to use the hdfs ec -verifyClusterSetup command to verify the cluster setup against the EC policies. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it
[ https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734313#comment-16734313 ] Kitti Nanasi commented on HDFS-14061: - Patch 001 contains: * A warning is shown if the cluster topology verification fails for all enabled policies, when enabling a policy. * Policy setting fails if the cluster topology verification fails for the policy There could be one concern with the second one: if we want to set the RS(6,3) policy, the verification will only succeed if there are at least 9 data nodes, which may be a bit strict. Should we allow setting the policy with fewer than 9 data nodes? I would like to hear some opinions about that. > Check if the cluster topology supports the EC policy before setting, enabling > or adding it > -- > > Key: HDFS-14061 > URL: https://issues.apache.org/jira/browse/HDFS-14061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, hdfs >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14061.001.patch > > > HDFS-12946 introduces a command for verifying if there are enough racks and > datanodes for the enabled erasure coding policies. > This verification could be executed for the erasure coding policy before > enabling, setting or adding it and a warning message could be written if the > verify fails, or the policy setting could be even failed in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
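The 9-node figure mentioned in the comment above follows from the structure of a Reed-Solomon block group. This tiny sketch (hypothetical names, not HDFS code) just makes the arithmetic explicit: a full RS(d, p) block group writes d + p internal blocks, one per DataNode.

```java
public class EcMinNodesSketch {
    // For a Reed-Solomon policy RS(d, p), a full block group has d data
    // blocks plus p parity blocks, and each internal block should land on a
    // distinct DataNode, giving the 9-node minimum discussed for RS(6,3).
    public static int minDataNodes(int dataUnits, int parityUnits) {
        return dataUnits + parityUnits;
    }

    public static void main(String[] args) {
        System.out.println(minDataNodes(6, 3)); // RS(6,3): 9 DataNodes
        System.out.println(minDataNodes(3, 2)); // RS(3,2): 5 DataNodes
    }
}
```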
[jira] [Updated] (HDFS-14061) Check if the cluster topology supports the EC policy before setting, enabling or adding it
[ https://issues.apache.org/jira/browse/HDFS-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14061: Attachment: HDFS-14061.001.patch > Check if the cluster topology supports the EC policy before setting, enabling > or adding it > -- > > Key: HDFS-14061 > URL: https://issues.apache.org/jira/browse/HDFS-14061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, hdfs >Affects Versions: 3.1.1 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-14061.001.patch > > > HDFS-12946 introduces a command for verifying if there are enough racks and > datanodes for the enabled erasure coding policies. > This verification could be executed for the erasure coding policy before > enabling, setting or adding it and a warning message could be written if the > verify fails, or the policy setting could be even failed in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13965) hadoop.security.kerberos.ticket.cache.path setting is not honored when KMS encryption is enabled.
[ https://issues.apache.org/jira/browse/HDFS-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16726756#comment-16726756 ] Kitti Nanasi commented on HDFS-13965: - [~lokeskumarp], I think it is possible to fix, but it is not a trivial change, so until it is fixed you can work around this problem by setting the KRB5CCNAME environment variable to the path of the ticket cache. > hadoop.security.kerberos.ticket.cache.path setting is not honored when KMS > encryption is enabled. > - > > Key: HDFS-13965 > URL: https://issues.apache.org/jira/browse/HDFS-13965 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client, kms >Affects Versions: 2.7.3, 2.7.7 >Reporter: LOKESKUMAR VIJAYAKUMAR >Assignee: Kitti Nanasi >Priority: Major > > _We use the *+hadoop.security.kerberos.ticket.cache.path+* setting to provide > a custom kerberos cache path for all hadoop operations to be run as specified > user. But this setting is not honored when KMS encryption is enabled._ > _The below program to read a file works when KMS encryption is not enabled, > but it fails when the KMS encryption is enabled._ > _Looks like *hadoop.security.kerberos.ticket.cache.path* setting is not > honored by *createConnection on KMSClientProvider.java.*_ > > HadoopTest.java (CLASSPATH needs to be set to compile and run) > > import java.io.InputStream; > import java.net.URI; > import org.apache.hadoop.conf.Configuration; > import org.apache.hadoop.fs.FileSystem; > import org.apache.hadoop.fs.Path; > > public class HadoopTest { > public static int runRead(String[] args) throws Exception{ > if (args.length < 3) { > System.err.println("HadoopTest hadoop_file_path > hadoop_user kerberos_cache"); > return 1; > } > Path inputPath = new Path(args[0]); > Configuration conf = new Configuration(); > URI defaultURI = FileSystem.getDefaultUri(conf); > > conf.set("hadoop.security.kerberos.ticket.cache.path",args[2]); > FileSystem fs = > FileSystem.newInstance(defaultURI,conf,args[1]); > 
InputStream is = fs.open(inputPath); > byte[] buffer = new byte[4096]; > int nr = is.read(buffer); > while (nr != -1) > { > System.out.write(buffer, 0, nr); > nr = is.read(buffer); > } > return 0; > } > public static void main( String[] args ) throws Exception { > int returnCode = HadoopTest.runRead(args); > System.exit(returnCode); > } > } > > > > [root@lstrost3 testhadoop]# pwd > /testhadoop > > [root@lstrost3 testhadoop]# ls > HadoopTest.java > > [root@lstrost3 testhadoop]# export CLASSPATH=`hadoop classpath --glob`:. > > [root@lstrost3 testhadoop]# javac HadoopTest.java > > [root@lstrost3 testhadoop]# java HadoopTest > HadoopTest hadoop_file_path hadoop_user kerberos_cache > > [root@lstrost3 testhadoop]# java HadoopTest /loki/loki.file loki > /tmp/krb5cc_1006 > 18/09/27 23:23:20 WARN util.NativeCodeLoader: Unable to load native-hadoop > library for your platform... using builtin-java classes where applicable > 18/09/27 23:23:21 WARN shortcircuit.DomainSocketFactory: The short-circuit > local reads feature cannot be used because libhadoop cannot be loaded. 
> Exception in thread "main" java.io.IOException: > org.apache.hadoop.security.authentication.client.AuthenticationException: > GSSException: *{color:#FF}No valid credentials provided (Mechanism level: > Failed to find any Kerberos tgt){color}* > at > {color:#FF}*org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:551)*{color} > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:831) > at > org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388) > at > org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1393) > at > org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:1463) > at > org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:333) > at > org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:327) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:340) > at
[jira] [Commented] (HDFS-13965) hadoop.security.kerberos.ticket.cache.path setting is not honored when KMS encryption is enabled.
[ https://issues.apache.org/jira/browse/HDFS-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723104#comment-16723104 ] Kitti Nanasi commented on HDFS-13965: - [~jojochuang], you are correct, the problem is that KerberosConfiguration does not use the ticket cache set in the configuration. A workaround is to set the "KRB5CCNAME" environment variable to the ticket cache path. However, the root user using another user's ticket cache to read that user's encryption zone does not seem like a usual scenario to me. You might want to consider running your script in an oozie workflow, which can run it as the other user using delegation tokens. [~lokeskumarp], let me know if you have questions. > hadoop.security.kerberos.ticket.cache.path setting is not honored when KMS > encryption is enabled. > - > > Key: HDFS-13965 > URL: https://issues.apache.org/jira/browse/HDFS-13965 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client, kms >Affects Versions: 2.7.3, 2.7.7 >Reporter: LOKESKUMAR VIJAYAKUMAR >Assignee: Kitti Nanasi >Priority: Major > > _We use the *+hadoop.security.kerberos.ticket.cache.path+* setting to provide > a custom kerberos cache path for all hadoop operations to be run as specified > user. 
But this setting is not honored when KMS encryption is enabled._ > _The below program to read a file works when KMS encryption is not enabled, > but it fails when the KMS encryption is enabled._ > _Looks like *hadoop.security.kerberos.ticket.cache.path* setting is not > honored by *createConnection on KMSClientProvider.java.*_ > > HadoopTest.java (CLASSPATH needs to be set to compile and run) > > import java.io.InputStream; > import java.net.URI; > import org.apache.hadoop.conf.Configuration; > import org.apache.hadoop.fs.FileSystem; > import org.apache.hadoop.fs.Path; > > public class HadoopTest { > public static int runRead(String[] args) throws Exception{ > if (args.length < 3) { > System.err.println("HadoopTest hadoop_file_path > hadoop_user kerberos_cache"); > return 1; > } > Path inputPath = new Path(args[0]); > Configuration conf = new Configuration(); > URI defaultURI = FileSystem.getDefaultUri(conf); > > conf.set("hadoop.security.kerberos.ticket.cache.path",args[2]); > FileSystem fs = > FileSystem.newInstance(defaultURI,conf,args[1]); > InputStream is = fs.open(inputPath); > byte[] buffer = new byte[4096]; > int nr = is.read(buffer); > while (nr != -1) > { > System.out.write(buffer, 0, nr); > nr = is.read(buffer); > } > return 0; > } > public static void main( String[] args ) throws Exception { > int returnCode = HadoopTest.runRead(args); > System.exit(returnCode); > } > } > > > > [root@lstrost3 testhadoop]# pwd > /testhadoop > > [root@lstrost3 testhadoop]# ls > HadoopTest.java > > [root@lstrost3 testhadoop]# export CLASSPATH=`hadoop classpath --glob`:. > > [root@lstrost3 testhadoop]# javac HadoopTest.java > > [root@lstrost3 testhadoop]# java HadoopTest > HadoopTest hadoop_file_path hadoop_user kerberos_cache > > [root@lstrost3 testhadoop]# java HadoopTest /loki/loki.file loki > /tmp/krb5cc_1006 > 18/09/27 23:23:20 WARN util.NativeCodeLoader: Unable to load native-hadoop > library for your platform... 
using builtin-java classes where applicable > 18/09/27 23:23:21 WARN shortcircuit.DomainSocketFactory: The short-circuit > local reads feature cannot be used because libhadoop cannot be loaded. > Exception in thread "main" java.io.IOException: > org.apache.hadoop.security.authentication.client.AuthenticationException: > GSSException: *{color:#FF}No valid credentials provided (Mechanism level: > Failed to find any Kerberos tgt){color}* > at > {color:#FF}*org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:551)*{color} > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:831) > at > org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388) > at > org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1393) > at > org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:1463) > at >
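The workaround discussed in these comments can be modeled as a resolution order. This is a toy illustration of the described behavior, not the actual KerberosConfiguration code: the KMS client's Kerberos login ignores the hadoop.security.kerberos.ticket.cache.path configuration value and only picks up the cache named by the KRB5CCNAME environment variable.

```java
import java.util.Map;

// Toy model of the behavior described above, not Hadoop code: the KMS
// code path never consults the Hadoop configuration key, so only
// KRB5CCNAME influences which ticket cache is used.
public class TicketCacheResolution {
    static String kmsTicketCache(Map<String, String> env, String hadoopConfValue) {
        // hadoopConfValue is deliberately ignored, mirroring the bug.
        return env.get("KRB5CCNAME");
    }

    public static void main(String[] args) {
        // Without the env var, the custom cache from the config is lost
        // and the KMS connection fails with "No valid credentials".
        System.out.println(kmsTicketCache(Map.of(), "/tmp/krb5cc_1006"));
        // Exporting KRB5CCNAME is the effective workaround.
        System.out.println(kmsTicketCache(
            Map.of("KRB5CCNAME", "/tmp/krb5cc_1006"), "/tmp/krb5cc_1006"));
    }
}
```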
[jira] [Commented] (HDFS-14132) Add BlockLocation.isStriped() to determine if block is replicated or Striped
[ https://issues.apache.org/jira/browse/HDFS-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16721083#comment-16721083 ] Kitti Nanasi commented on HDFS-14132: - Thanks [~shwetayakkali] for the patch! Looks good to me. +1 (non-binding) > Add BlockLocation.isStriped() to determine if block is replicated or Striped > > > Key: HDFS-14132 > URL: https://issues.apache.org/jira/browse/HDFS-14132 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Shweta >Assignee: Shweta >Priority: Major > Attachments: HDFS-14132.001.patch > > > Impala uses FileSystem#getBlockLocation to get block locations. We can add > an isStriped() method to make it easier to determine whether a block belongs to a > replicated or a striped file. > In HDFS, this isStriped information is already available in > HdfsBlockLocation#LocatedBlock#isStriped(), so adding this method to > BlockLocation does not introduce space overhead. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
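The no-space-overhead argument in the description can be illustrated with a toy model (these are stand-in classes, not the actual Hadoop types): the striped flag already lives on the underlying LocatedBlock, so the new accessor is a pure delegation.

```java
// Toy stand-ins for the Hadoop classes, illustrating that exposing
// isStriped() adds no field to the location object: it delegates to
// the LocatedBlock that already carries the flag.
class LocatedBlock {
    private final boolean striped;
    LocatedBlock(boolean striped) { this.striped = striped; }
    boolean isStriped() { return striped; }
}

class HdfsBlockLocation {
    private final LocatedBlock block;
    HdfsBlockLocation(LocatedBlock block) { this.block = block; }
    // The proposed accessor: a pass-through, no extra state.
    boolean isStriped() { return block.isStriped(); }
}

public class IsStripedDemo {
    public static void main(String[] args) {
        HdfsBlockLocation ecLoc = new HdfsBlockLocation(new LocatedBlock(true));
        HdfsBlockLocation repLoc = new HdfsBlockLocation(new LocatedBlock(false));
        System.out.println(ecLoc.isStriped());   // striped (EC) file
        System.out.println(repLoc.isStriped());  // replicated file
    }
}
```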
[jira] [Comment Edited] (HDFS-14132) Add BlockLocation.isStriped() to determine if block is replicated or Striped
[ https://issues.apache.org/jira/browse/HDFS-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16721083#comment-16721083 ] Kitti Nanasi edited comment on HDFS-14132 at 12/14/18 9:12 AM: --- Thanks [~shwetayakkali] for the patch! Looks good to me and the test failures do not seem related. +1 (non-binding) was (Author: knanasi): Thanks [~shwetayakkali] for the patch! Looks good to me. +1 (non-binding) > Add BlockLocation.isStriped() to determine if block is replicated or Striped > > > Key: HDFS-14132 > URL: https://issues.apache.org/jira/browse/HDFS-14132 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Shweta >Assignee: Shweta >Priority: Major > Attachments: HDFS-14132.001.patch > > > Impala uses FileSystem#getBlockLocation to get block locations. We can add > an isStriped() method to make it easier to determine whether a block belongs to a > replicated or a striped file. > In HDFS, this isStriped information is already available in > HdfsBlockLocation#LocatedBlock#isStriped(), so adding this method to > BlockLocation does not introduce space overhead. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client
[ https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720572#comment-16720572 ] Kitti Nanasi commented on HDFS-14134: - The relevant part is the following: {quote}in FailoverOnNetworkExceptionRetry#shouldRetry we don't fail-over and retry if we're making a non-idempotent call and there's an IOException or SocketException that's not Connect, NoRouteToHost, UnknownHost, or Standby. The rationale of course is that the operation may have reached the server and retrying elsewhere could leave us in an inconsistent state. This means if a client doing a create/delete which gets a SocketTimeoutException (which is an IOE) or an EOF SocketException the exception will be thrown all the way up to the caller of FileSystem/FileContext. That's reasonable because only the user of the API at this level has sufficient knowledge of how to handle the failure, eg if they get such an exception after issuing a delete they can check if the file still exists and if so re-issue the delete (however they may also not want to do this, and FileContext doesn't know which). {quote} > Idempotent operations throwing RemoteException should not be retried by the > client > -- > > Key: HDFS-14134 > URL: https://issues.apache.org/jira/browse/HDFS-14134 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, hdfs-client, ipc >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Critical > Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, > HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, > HDFS-14134_retrypolicy_change_proposal.pdf > > > Currently, some operations that throw IOException on the NameNode are > evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail > fast. > For example, when calling getXAttr("user.some_attr", file") where the file > does not have the attribute, NN throws an IOException with message "could not > find attr". 
The current client retry policy determines the action for that to > be FAILOVER_AND_RETRY. The client then fails over and retries until it > reaches the maximum number of retries. Supposedly, the client should be able > to tell that this exception is normal and fail fast. > Moreover, even if the action was FAIL, the RetryInvocationHandler looks at > all the retry actions from all requests, and FAILOVER_AND_RETRY takes > precedence over FAIL action. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
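The rule quoted above can be condensed into a small decision sketch. This is simplified and uses hypothetical names; the real FailoverOnNetworkExceptionRetry also handles StandbyException, retry budgets, and sleep times.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.NoRouteToHostException;
import java.net.UnknownHostException;

// Simplified sketch of the decision rule quoted above; not the actual
// FailoverOnNetworkExceptionRetry implementation.
public class RetryDecisionSketch {
    enum Decision { FAIL, FAILOVER_AND_RETRY }

    static Decision shouldRetry(IOException e, boolean idempotentOrAtMostOnce) {
        // Connection-level failures cannot have reached the server, so
        // failing over is safe regardless of idempotency.
        if (e instanceof ConnectException
            || e instanceof NoRouteToHostException
            || e instanceof UnknownHostException) {
            return Decision.FAILOVER_AND_RETRY;
        }
        // Any other IOException (e.g. a socket timeout) may have been
        // applied server-side; only idempotent calls can be replayed
        // without risking inconsistent state.
        return idempotentOrAtMostOnce ? Decision.FAILOVER_AND_RETRY : Decision.FAIL;
    }

    public static void main(String[] args) {
        // A timed-out create (non-idempotent) surfaces to the caller...
        System.out.println(shouldRetry(new IOException("socket timeout"), false));
        // ...while a timed-out read (idempotent) fails over and retries.
        System.out.println(shouldRetry(new IOException("socket timeout"), true));
    }
}
```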
[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client
[ https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720563#comment-16720563 ] Kitti Nanasi commented on HDFS-14134: - Yes, this change covers that; I just wanted to understand why you changed it like that, but we're pretty much on the same page now. I have only one concern, which is the case of non-remote IOExceptions on non-idempotent operations: I'm not sure whether retrying those will cause any problems. For reference, there is a discussion on [HADOOP-7380|https://issues.apache.org/jira/browse/HADOOP-7380] on why it was introduced. Other than that, patch v5 looks good. > Idempotent operations throwing RemoteException should not be retried by the > client > -- > > Key: HDFS-14134 > URL: https://issues.apache.org/jira/browse/HDFS-14134 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, hdfs-client, ipc >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Critical > Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, > HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, > HDFS-14134_retrypolicy_change_proposal.pdf > > > Currently, some operations that throw IOException on the NameNode are > evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail > fast. > For example, when calling getXAttr("user.some_attr", file") where the file > does not have the attribute, NN throws an IOException with message "could not > find attr". The current client retry policy determines the action for that to > be FAILOVER_AND_RETRY. The client then fails over and retries until it > reaches the maximum number of retries. Supposedly, the client should be able > to tell that this exception is normal and fail fast. > Moreover, even if the action was FAIL, the RetryInvocationHandler looks at > all the retry actions from all requests, and FAILOVER_AND_RETRY takes > precedence over FAIL action. 
[jira] [Commented] (HDFS-14121) Log message about the old hosts file format is misleading
[ https://issues.apache.org/jira/browse/HDFS-14121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720266#comment-16720266 ] Kitti Nanasi commented on HDFS-14121: - Thanks [~zvenczel] for the new patch! It looks good to me. +1 (non-binding) > Log message about the old hosts file format is misleading > - > > Key: HDFS-14121 > URL: https://issues.apache.org/jira/browse/HDFS-14121 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Daniel Templeton >Assignee: Zsolt Venczel >Priority: Major > Attachments: HDFS-14121.01.patch, HDFS-14121.02.patch > > > In {{CombinedHostsFileReader.readFile()}} we have the following: > {code} LOG.warn("{} has invalid JSON format." + > "Try the old format without top-level token defined.", > hostsFile);{code} > That message is trying to say that we tried parsing the hosts file as a > well-formed JSON file and failed, so we're going to try again assuming that > it's in the old badly-formed format. What it actually says is that the hosts > file is bad, and the admin should try switching to the old format. Those are > two very different things. > While we're in there, we should refactor the logging so that instead of > reporting that we're going to try using a different parser (who the heck > cares?), we report that we had to use the old parser to successfully > parse the hosts file. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
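A possible rewording along the lines the description asks for. The exact text is only a suggestion, and plain String.format stands in for the SLF4J logger's {} placeholders so the sketch is self-contained.

```java
// Suggested shape of the clarified warning; the wording is illustrative,
// and String.format stands in for LOG.warn's {} placeholders.
public class HostsFileWarning {
    static String fallbackWarning(String hostsFile) {
        return String.format(
            "%s could not be parsed as well-formed JSON; "
            + "retrying with the legacy (pre-JSON) hosts file format.",
            hostsFile);
    }

    public static void main(String[] args) {
        System.out.println(fallbackWarning("dfs.hosts.json"));
    }
}
```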
[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client
[ https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720204#comment-16720204 ] Kitti Nanasi commented on HDFS-14134: - I totally agree with you that retrying getXAttr on an "attr could not find" IOException is wasteful, and that we need a better concept than the current one. But we also have to keep in mind that the FailoverOnNetworkExceptionRetry policy is used by many parts of the code, so it is a bit risky to change. I think the idea behind the previous design is that non-remote IOExceptions may be network-related exceptions, so it is worth retrying them if the operation is idempotent. > Idempotent operations throwing RemoteException should not be retried by the > client > -- > > Key: HDFS-14134 > URL: https://issues.apache.org/jira/browse/HDFS-14134 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, hdfs-client, ipc >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Critical > Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, > HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, > HDFS-14134_retrypolicy_change_proposal.pdf > > > Currently, some operations that throw IOException on the NameNode are > evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail > fast. > For example, when calling getXAttr("user.some_attr", file") where the file > does not have the attribute, NN throws an IOException with message "could not > find attr". The current client retry policy determines the action for that to > be FAILOVER_AND_RETRY. The client then fails over and retries until it > reaches the maximum number of retries. Supposedly, the client should be able > to tell that this exception is normal and fail fast. > Moreover, even if the action was FAIL, the RetryInvocationHandler looks at > all the retry actions from all requests, and FAILOVER_AND_RETRY takes > precedence over FAIL action. 
[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client
[ https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718752#comment-16718752 ] Kitti Nanasi commented on HDFS-14134: - [~lukmajercak], you are correct on the definition of idempotency. I think the original approach to retrying was that idempotent operations don't change internal state, so it is safe to retry them. For example, if you just get a value, it is always safe to retry; but if you renew a delegation token, whether it is safe to retry is a more complex question, because the renewal may or may not have already taken place before the failure, and if it did, is it safe to renew again? By the way, idempotency was originally only considered in the case of non-remote IOExceptions; why did that change? > Idempotent operations throwing RemoteException should not be retried by the > client > -- > > Key: HDFS-14134 > URL: https://issues.apache.org/jira/browse/HDFS-14134 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, hdfs-client, ipc >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Critical > Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, > HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, > HDFS-14134_retrypolicy_change_proposal.pdf > > > Currently, some operations that throw IOException on the NameNode are > evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail > fast. > For example, when calling getXAttr("user.some_attr", file") where the file > does not have the attribute, NN throws an IOException with message "could not > find attr". The current client retry policy determines the action for that to > be FAILOVER_AND_RETRY. The client then fails over and retries until it > reaches the maximum number of retries. Supposedly, the client should be able > to tell that this exception is normal and fail fast. 
> Moreover, even if the action was FAIL, the RetryInvocationHandler looks at > all the retry actions from all requests, and FAILOVER_AND_RETRY takes > precedence over FAIL action. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client
[ https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716690#comment-16716690 ] Kitti Nanasi commented on HDFS-14134:
-
Thanks for the new patch [~lukmajercak]! It looks better regarding retrying on non-remote IOExceptions, but there is one thing I don't understand, which I think is wrong in the pdf as well. In case of remote IOException, we should retry if the operation is idempotent, and not the opposite. So instead of this code:

{code:java}
else if (e instanceof IOException) {
  if (e instanceof RemoteException && isIdempotentOrAtMostOnce) {
    return new RetryAction(RetryAction.RetryDecision.FAIL, 0,
        "Remote exception and the invoked method is idempotent "
        + "or at most once.");
  }
  return new RetryAction(RetryAction.RetryDecision.FAILOVER_AND_RETRY,
      getFailoverOrRetrySleepTime(failovers));
}
{code}

I think it should look like this:

{code:java}
else if (e instanceof IOException) {
  if (e instanceof RemoteException && !isIdempotentOrAtMostOnce) {
    return new RetryAction(RetryAction.RetryDecision.FAIL, 0,
        "Remote exception and the invoked method is idempotent "
        + "or at most once.");
  }
  return new RetryAction(RetryAction.RetryDecision.FAILOVER_AND_RETRY,
      getFailoverOrRetrySleepTime(failovers));
}
{code}

What do you think [~lukmajercak]?

> Idempotent operations throwing RemoteException should not be retried by the > client > -- > > Key: HDFS-14134 > URL: https://issues.apache.org/jira/browse/HDFS-14134 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, hdfs-client, ipc >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Critical > Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, > HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, > HDFS-14134_retrypolicy_change_proposal.pdf > > > Currently, some operations that throw IOException on the NameNode are > evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail > fast. 
> For example, when calling getXAttr("user.some_attr", file") where the file > does not have the attribute, NN throws an IOException with message "could not > find attr". The current client retry policy determines the action for that to > be FAILOVER_AND_RETRY. The client then fails over and retries until it > reaches the maximum number of retries. Supposedly, the client should be able > to tell that this exception is normal and fail fast. > Moreover, even if the action was FAIL, the RetryInvocationHandler looks at > all the retry actions from all requests, and FAILOVER_AND_RETRY takes > precedence over FAIL action. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client
[ https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714464#comment-16714464 ] Kitti Nanasi commented on HDFS-14134: - Thanks [~lukmajercak] for the patch! The proposed solution in the pdf seems good to me, but looking at the code, the retry does not happen on non-remote IOExceptions at all, which is not the same behaviour as described in the pdf. Also TestLoadBalancingKMSClientProvider#testClientRetriesIdempotentOpWithIOExceptionSucceedsSecondTime fails because of that. > Idempotent operations throwing RemoteException should not be retried by the > client > -- > > Key: HDFS-14134 > URL: https://issues.apache.org/jira/browse/HDFS-14134 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, hdfs-client, ipc >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Critical > Attachments: HDFS-14134.001.patch, > HDFS-14134_retrypolicy_change_proposal.pdf > > > Currently, some operations that throw IOException on the NameNode are > evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail > fast. > For example, when calling getXAttr("user.some_attr", file") where the file > does not have the attribute, NN throws an IOException with message "could not > find attr". The current client retry policy determines the action for that to > be FAILOVER_AND_RETRY. The client then fails over and retries until it > reaches the maximum number of retries. Supposedly, the client should be able > to tell that this exception is normal and fail fast. > Moreover, even if the action was FAIL, the RetryInvocationHandler looks at > all the retry actions from all requests, and FAILOVER_AND_RETRY takes > precedence over FAIL action. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14121) Log message about the old hosts file format is misleading
[ https://issues.apache.org/jira/browse/HDFS-14121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712817#comment-16712817 ] Kitti Nanasi commented on HDFS-14121: - Thanks [~zvenczel] for the patch! The patch overall looks good to me. I only have one minor comment: the warning message about the empty file content could be more descriptive. > Log message about the old hosts file format is misleading > - > > Key: HDFS-14121 > URL: https://issues.apache.org/jira/browse/HDFS-14121 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Daniel Templeton >Assignee: Zsolt Venczel >Priority: Major > Attachments: HDFS-14121.01.patch > > > In {{CombinedHostsFileReader.readFile()}} we have the following: > {code} LOG.warn("{} has invalid JSON format." + > "Try the old format without top-level token defined.", > hostsFile);{code} > That message is trying to say that we tried parsing the hosts file as a > well-formed JSON file and failed, so we're going to try again assuming that > it's in the old badly-formed format. What it actually says is that the hosts > file is bad, and the admin should try switching to the old format. Those are > two very different things. > While we're in there, we should refactor the logging so that instead of > reporting that we're going to try using a different parser (who the heck > cares?), we report that we had to use the old parser to successfully > parse the hosts file. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-13965) hadoop.security.kerberos.ticket.cache.path setting is not honored when KMS encryption is enabled.
[ https://issues.apache.org/jira/browse/HDFS-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi reassigned HDFS-13965: --- Assignee: Kitti Nanasi > hadoop.security.kerberos.ticket.cache.path setting is not honored when KMS > encryption is enabled. > - > > Key: HDFS-13965 > URL: https://issues.apache.org/jira/browse/HDFS-13965 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client, kms >Affects Versions: 2.7.3, 2.7.7 >Reporter: LOKESKUMAR VIJAYAKUMAR >Assignee: Kitti Nanasi >Priority: Major > > _We use the *+hadoop.security.kerberos.ticket.cache.path+* setting to provide > a custom kerberos cache path for all hadoop operations to be run as specified > user. But this setting is not honored when KMS encryption is enabled._ > _The below program to read a file works when KMS encryption is not enabled, > but it fails when the KMS encryption is enabled._ > _Looks like *hadoop.security.kerberos.ticket.cache.path* setting is not > honored by *createConnection on KMSClientProvider.java.*_ > > HadoopTest.java (CLASSPATH needs to be set to compile and run) > > import java.io.InputStream; > import java.net.URI; > import org.apache.hadoop.conf.Configuration; > import org.apache.hadoop.fs.FileSystem; > import org.apache.hadoop.fs.Path; > > public class HadoopTest { > public static int runRead(String[] args) throws Exception{ > if (args.length < 3) { > System.err.println("HadoopTest hadoop_file_path > hadoop_user kerberos_cache"); > return 1; > } > Path inputPath = new Path(args[0]); > Configuration conf = new Configuration(); > URI defaultURI = FileSystem.getDefaultUri(conf); > > conf.set("hadoop.security.kerberos.ticket.cache.path",args[2]); > FileSystem fs = > FileSystem.newInstance(defaultURI,conf,args[1]); > InputStream is = fs.open(inputPath); > byte[] buffer = new byte[4096]; > int nr = is.read(buffer); > while (nr != -1) > { > System.out.write(buffer, 0, nr); > nr = is.read(buffer); > } > return 0; > } > public static void 
main( String[] args ) throws Exception { > int returnCode = HadoopTest.runRead(args); > System.exit(returnCode); > } > } > > > > [root@lstrost3 testhadoop]# pwd > /testhadoop > > [root@lstrost3 testhadoop]# ls > HadoopTest.java > > [root@lstrost3 testhadoop]# export CLASSPATH=`hadoop classpath --glob`:. > > [root@lstrost3 testhadoop]# javac HadoopTest.java > > [root@lstrost3 testhadoop]# java HadoopTest > HadoopTest hadoop_file_path hadoop_user kerberos_cache > > [root@lstrost3 testhadoop]# java HadoopTest /loki/loki.file loki > /tmp/krb5cc_1006 > 18/09/27 23:23:20 WARN util.NativeCodeLoader: Unable to load native-hadoop > library for your platform... using builtin-java classes where applicable > 18/09/27 23:23:21 WARN shortcircuit.DomainSocketFactory: The short-circuit > local reads feature cannot be used because libhadoop cannot be loaded. > Exception in thread "main" java.io.IOException: > org.apache.hadoop.security.authentication.client.AuthenticationException: > GSSException: *{color:#FF}No valid credentials provided (Mechanism level: > Failed to find any Kerberos tgt){color}* > at > {color:#FF}*org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:551)*{color} > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:831) > at > org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388) > at > org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1393) > at > org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:1463) > at > org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:333) > at > org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:327) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:340) > at 
org.apache.hadoop.fs.FileSystem.open(FileSystem.java:786) > at HadoopTest.runRead(HadoopTest.java:18) > at HadoopTest.main(HadoopTest.java:29) > Caused by: >
[jira] [Commented] (HDFS-12946) Add a tool to check rack configuration against EC policies
[ https://issues.apache.org/jira/browse/HDFS-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708567#comment-16708567 ] Kitti Nanasi commented on HDFS-12946: - Thanks [~jojochuang] for reviewing and committing! I created HDFS-14125 to change the logs to use parameterized log format. > Add a tool to check rack configuration against EC policies > -- > > Key: HDFS-12946 > URL: https://issues.apache.org/jira/browse/HDFS-12946 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Reporter: Xiao Chen >Assignee: Kitti Nanasi >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-12946.01.patch, HDFS-12946.02.patch, > HDFS-12946.03.patch, HDFS-12946.04.fsck.patch, HDFS-12946.05.patch, > HDFS-12946.06.patch, HDFS-12946.07.patch, HDFS-12946.08.patch, > HDFS-12946.09.patch, HDFS-12946.10.patch, HDFS-12946.11.patch, > HDFS-12946.12.patch > > > From testing we have seen setups with problematic racks / datanodes that > would not suffice basic EC usages. These are usually found out only after the > tests failed. > We should provide a way to check this beforehand. > Some scenarios: > - not enough datanodes compared to EC policy's highest data+parity number > - not enough racks to satisfy BPPRackFaultTolerant > - highly uneven racks to satisfy BPPRackFaultTolerant > - highly uneven racks (so that BPP's considerLoad logic may exclude some busy > nodes on the rack, resulting in #2) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
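The scenarios listed above can be sketched as a small verification routine. This is a deliberately simplified stand-in, not the real ECTopologyVerifier logic from the patch: the method name, the messages, and the two-rack rule are illustrative assumptions only.

```java
public class Main {
    // Simplified stand-in for the checks HDFS-12946 adds: an EC policy of
    // d data + p parity units needs at least d+p datanodes, and surviving a
    // rack failure needs more than one rack. The real verifier is stricter.
    static String verify(int dataNodes, int racks, int d, int p) {
        if (dataNodes < d + p) {
            return "not enough datanodes: " + dataNodes + " < " + (d + p);
        }
        if (racks < 2) {
            return "not enough racks to tolerate a rack failure";
        }
        return "OK";
    }

    public static void main(String[] args) {
        System.out.println(verify(5, 3, 6, 3));   // RS-6-3 on only 5 nodes
        System.out.println(verify(9, 1, 6, 3));   // single rack
        System.out.println(verify(12, 4, 6, 3));  // enough of both
    }
}
```

Running such a check up front, as the issue proposes, surfaces these misconfigurations before a write fails.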
[jira] [Created] (HDFS-14125) Use parameterized log format in ECTopologyVerifier
Kitti Nanasi created HDFS-14125: --- Summary: Use parameterized log format in ECTopologyVerifier Key: HDFS-14125 URL: https://issues.apache.org/jira/browse/HDFS-14125 Project: Hadoop HDFS Issue Type: Improvement Components: erasure-coding Affects Versions: 3.3.0 Reporter: Kitti Nanasi Assignee: Kitti Nanasi ECTopologyVerifier introduced in [HDFS-12946|https://issues.apache.org/jira/browse/HDFS-12946] should use a parameterized log format.
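For context, the parameterized style this issue asks for defers message assembly to the logging framework, so the string is only built when the level is enabled. The helper below emulates slf4j's `{}` substitution so the example runs without the slf4j dependency; with slf4j itself the call would be `LOG.debug("Not enough racks: {} < {}", racks, needed)` (the message text is illustrative).

```java
public class Main {
    // Minimal emulation of slf4j-style "{}" placeholder substitution.
    static String format(String template, Object... args) {
        StringBuilder sb = new StringBuilder();
        int argIdx = 0, from = 0, at;
        while ((at = template.indexOf("{}", from)) >= 0 && argIdx < args.length) {
            sb.append(template, from, at).append(args[argIdx++]);
            from = at + 2;
        }
        return sb.append(template.substring(from)).toString();
    }

    public static void main(String[] args) {
        // Before: eager concatenation always pays the string-building cost.
        String eager = "Not enough racks: " + 2 + " < " + 3;
        // After: a template plus arguments, assembled only on demand.
        String lazy = format("Not enough racks: {} < {}", 2, 3);
        System.out.println(eager.equals(lazy)); // both render the same text
    }
}
```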
[jira] [Comment Edited] (HDFS-14113) EC : Add Configuration to restrict UserDefined Policies
[ https://issues.apache.org/jira/browse/HDFS-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706913#comment-16706913 ] Kitti Nanasi edited comment on HDFS-14113 at 12/3/18 9:56 AM: -- Thanks [~ayushtkn] for the patch! The patch overall looks good to me. There are only some checkstyle issues, and I think the following assert in TestErasureCodingAddConfig has an empty message by mistake. {code:java} assertNull("", response[0].getErrorMsg()); {code} was (Author: knanasi): Thanks [~ayushtkn] for the patch! The patch overall looks good to me. There are only some checkstyle issues, and I think the following assert has an empty message by mistake. {code:java} assertNull("", response[0].getErrorMsg()); {code} > EC : Add Configuration to restrict UserDefined Policies > --- > > Key: HDFS-14113 > URL: https://issues.apache.org/jira/browse/HDFS-14113 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14113-01.patch > > > By default, addition of erasure coding policies is enabled for users. We need > to add a configuration whether to allow addition of new User Defined policies > or not, which can be configured in form of a Boolean value at the server side.
[jira] [Commented] (HDFS-14113) EC : Add Configuration to restrict UserDefined Policies
[ https://issues.apache.org/jira/browse/HDFS-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706913#comment-16706913 ] Kitti Nanasi commented on HDFS-14113: - Thanks [~ayushtkn] for the patch! The patch overall looks good to me. There are only some checkstyle issues, and I think the following assert has an empty message by mistake. {code:java} assertNull("", response[0].getErrorMsg()); {code} > EC : Add Configuration to restrict UserDefined Policies > --- > > Key: HDFS-14113 > URL: https://issues.apache.org/jira/browse/HDFS-14113 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14113-01.patch > > > By default, addition of erasure coding policies is enabled for users. We need > to add a configuration whether to allow addition of new User Defined policies > or not, which can be configured in form of a Boolean value at the server side.
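A sketch of why the empty message hurts: when the assertion fails, its text carries no explanation. The `assertNull` helper below is a runnable stand-in for JUnit's `assertNull(String, Object)`, and both the descriptive message wording and the sample error string are assumptions for illustration.

```java
public class Main {
    // Stand-in for JUnit's assertNull(String message, Object actual).
    static void assertNull(String message, Object actual) {
        if (actual != null) throw new AssertionError(message + ": " + actual);
    }

    public static void main(String[] args) {
        String errorMsg = "policy limit reached"; // pretend response[0].getErrorMsg()
        try {
            // An empty message, as in the patch, tells the reader nothing:
            assertNull("", errorMsg);
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
        try {
            // A descriptive message explains what the test expected:
            assertNull("expected no error when adding a user-defined policy", errorMsg);
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```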
[jira] [Commented] (HDFS-14081) hdfs dfsadmin -metasave metasave_test results NPE
[ https://issues.apache.org/jira/browse/HDFS-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706834#comment-16706834 ] Kitti Nanasi commented on HDFS-14081: - Thanks [~shwetayakkali] for the new patch! +1 (non-binding) > hdfs dfsadmin -metasave metasave_test results NPE > - > > Key: HDFS-14081 > URL: https://issues.apache.org/jira/browse/HDFS-14081 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.2.1 >Reporter: Shweta >Assignee: Shweta >Priority: Major > Attachments: HDFS-14081.001.patch, HDFS-14081.002.patch, > HDFS-14081.003.patch, HDFS-14081.004.patch > > > Race condition is encountered while adding Block to > postponedMisreplicatedBlocks which in turn tried to retrieve Block from > BlockManager in which it may not be present. > This happens in HA, metasave in first NN succeeded but failed in second NN, > StackTrace showing NPE is as follows: > {code} > 2018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: IPC Server handler > 24 on 8020, call Call#1 Retry#0 > org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from > 172.26.9.163:602342018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: > IPC Server handler 24 on 8020, call Call#1 Retry#0 > org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from > 172.26.9.163:60234java.lang.NullPointerException at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseSourceDatanodes(BlockManager.java:2175) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.dumpBlockMeta(BlockManager.java:830) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.metaSave(BlockManager.java:762) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1782) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1766) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.metaSave(NameNodeRpcServer.java:1320) > at > 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.metaSave(ClientNamenodeProtocolServerSideTranslatorPB.java:928) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at > org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) at > org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) at > java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:422) at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14081) hdfs dfsadmin -metasave metasave_test results NPE
[ https://issues.apache.org/jira/browse/HDFS-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704762#comment-16704762 ] Kitti Nanasi commented on HDFS-14081: - Thanks for the new patch [~shwetayakkali]! You can use the println method instead of printing "\n" separately at the end, but it's just a really minor issue. Looks good other than that. > hdfs dfsadmin -metasave metasave_test results NPE > - > > Key: HDFS-14081 > URL: https://issues.apache.org/jira/browse/HDFS-14081 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.2.1 >Reporter: Shweta >Assignee: Shweta >Priority: Major > Attachments: HDFS-14081.001.patch, HDFS-14081.002.patch, > HDFS-14081.003.patch > > > Race condition is encountered while adding Block to > postponedMisreplicatedBlocks which in turn tried to retrieve Block from > BlockManager in which it may not be present. > This happens in HA, metasave in first NN succeeded but failed in second NN, > StackTrace showing NPE is as follows: > {code} > 2018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: IPC Server handler > 24 on 8020, call Call#1 Retry#0 > org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from > 172.26.9.163:602342018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: > IPC Server handler 24 on 8020, call Call#1 Retry#0 > org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from > 172.26.9.163:60234java.lang.NullPointerException at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseSourceDatanodes(BlockManager.java:2175) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.dumpBlockMeta(BlockManager.java:830) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.metaSave(BlockManager.java:762) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1782) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1766) > at > 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.metaSave(NameNodeRpcServer.java:1320) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.metaSave(ClientNamenodeProtocolServerSideTranslatorPB.java:928) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at > org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) at > org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) at > java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:422) at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
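The review suggestion above reads, in code, as follows; the message text is illustrative, not the patch's actual output.

```java
public class Main {
    public static void main(String[] args) {
        // Before (illustrative of the reviewed code): the newline is emitted
        // in a separate call with a hard-coded "\n".
        System.out.print("Block blk_1 is Null");
        System.out.print("\n");
        // After: println emits the line terminator itself, in one call.
        System.out.println("Block blk_1 is Null");
    }
}
```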
[jira] [Created] (HDFS-14115) TestNamenodeCapacityReport#testXceiverCount is flaky
Kitti Nanasi created HDFS-14115: --- Summary: TestNamenodeCapacityReport#testXceiverCount is flaky Key: HDFS-14115 URL: https://issues.apache.org/jira/browse/HDFS-14115 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.1.1 Reporter: Kitti Nanasi Assignee: Kitti Nanasi TestNamenodeCapacityReport#testXceiverCount sometimes fails with the following error: {code} 2018-11-28 17:33:45,816 INFO DataNode - PacketResponder: BP-645736292-172.17.0.2-1543426416580:blk_1073741828_1004, type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=2:[127.0.0.1:37115, 127.0.0.1:35107] terminating 2018-11-28 17:33:45,817 INFO StateChange - DIR* completeFile: /f3 is closed by DFSClient_NONMAPREDUCE_1933849415_1 2018-11-28 17:33:45,817 INFO ExitUtil - Exiting with status 1: Block report processor encountered fatal exception: java.lang.AssertionError: Negative replicas! 2018-11-28 17:33:45,818 ERROR ExitUtil - Terminate called 1: Block report processor encountered fatal exception: java.lang.AssertionError: Negative replicas! at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:4807) Exception in thread "Block report processor" 1: Block report processor encountered fatal exception: java.lang.AssertionError: Negative replicas! at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:4807) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14081) hdfs dfsadmin -metasave metasave_test results NPE
[ https://issues.apache.org/jira/browse/HDFS-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16701670#comment-16701670 ] Kitti Nanasi commented on HDFS-14081: - Thanks [~shwetayakkali] for the patch! The change looks good to me, printing a log is definitely better than throwing an NPE. I just have one minor comment, the two print statements could be merged into one, like this: {code:java} out.println("Block " + block + " is Null"); {code} +1 (non-binding) pending on that > hdfs dfsadmin -metasave metasave_test results NPE > - > > Key: HDFS-14081 > URL: https://issues.apache.org/jira/browse/HDFS-14081 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.2.1 >Reporter: Shweta >Assignee: Shweta >Priority: Major > Attachments: HDFS-14081.001.patch, HDFS-14081.002.patch > > > Race condition is encountered while adding Block to > postponedMisreplicatedBlocks which in turn tried to retrieve Block from > BlockManager in which it may not be present. 
> This happens in HA, metasave in first NN succeeded but failed in second NN, > StackTrace showing NPE is as follows: > {code} > 2018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: IPC Server handler > 24 on 8020, call Call#1 Retry#0 > org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from > 172.26.9.163:602342018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: > IPC Server handler 24 on 8020, call Call#1 Retry#0 > org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from > 172.26.9.163:60234java.lang.NullPointerException at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseSourceDatanodes(BlockManager.java:2175) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.dumpBlockMeta(BlockManager.java:830) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.metaSave(BlockManager.java:762) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1782) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1766) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.metaSave(NameNodeRpcServer.java:1320) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.metaSave(ClientNamenodeProtocolServerSideTranslatorPB.java:928) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at > org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) at > org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) at > java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:422) at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685) > at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) {code}
[jira] [Updated] (HDFS-12946) Add a tool to check rack configuration against EC policies
[ https://issues.apache.org/jira/browse/HDFS-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-12946: Attachment: HDFS-12946.11.patch > Add a tool to check rack configuration against EC policies > -- > > Key: HDFS-12946 > URL: https://issues.apache.org/jira/browse/HDFS-12946 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Reporter: Xiao Chen >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-12946.01.patch, HDFS-12946.02.patch, > HDFS-12946.03.patch, HDFS-12946.04.fsck.patch, HDFS-12946.05.patch, > HDFS-12946.06.patch, HDFS-12946.07.patch, HDFS-12946.08.patch, > HDFS-12946.09.patch, HDFS-12946.10.patch, HDFS-12946.11.patch > > > From testing we have seen setups with problematic racks / datanodes that > would not suffice basic EC usages. These are usually found out only after the > tests failed. > We should provide a way to check this beforehand. > Some scenarios: > - not enough datanodes compared to EC policy's highest data+parity number > - not enough racks to satisfy BPPRackFaultTolerant > - highly uneven racks to satisfy BPPRackFaultTolerant > - highly uneven racks (so that BPP's considerLoad logic may exclude some busy > nodes on the rack, resulting in #2) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12946) Add a tool to check rack configuration against EC policies
[ https://issues.apache.org/jira/browse/HDFS-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-12946: Attachment: HDFS-12946.10.patch > Add a tool to check rack configuration against EC policies > -- > > Key: HDFS-12946 > URL: https://issues.apache.org/jira/browse/HDFS-12946 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Reporter: Xiao Chen >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-12946.01.patch, HDFS-12946.02.patch, > HDFS-12946.03.patch, HDFS-12946.04.fsck.patch, HDFS-12946.05.patch, > HDFS-12946.06.patch, HDFS-12946.07.patch, HDFS-12946.08.patch, > HDFS-12946.09.patch, HDFS-12946.10.patch > > > From testing we have seen setups with problematic racks / datanodes that > would not suffice basic EC usages. These are usually found out only after the > tests failed. > We should provide a way to check this beforehand. > Some scenarios: > - not enough datanodes compared to EC policy's highest data+parity number > - not enough racks to satisfy BPPRackFaultTolerant > - highly uneven racks to satisfy BPPRackFaultTolerant > - highly uneven racks (so that BPP's considerLoad logic may exclude some busy > nodes on the rack, resulting in #2) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12946) Add a tool to check rack configuration against EC policies
[ https://issues.apache.org/jira/browse/HDFS-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16700402#comment-16700402 ] Kitti Nanasi commented on HDFS-12946: - Thanks [~xiaochen] for the comments! In patch v009 I fixed the comments and modified FSNamesystem#getVerifyECWithTopologyResult's return type to String to match the format of the other entries in the name node jmx. I created HDFS-14061 for running the topology check in FSN#enableErasureCodingPolicy. > Add a tool to check rack configuration against EC policies > -- > > Key: HDFS-12946 > URL: https://issues.apache.org/jira/browse/HDFS-12946 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Reporter: Xiao Chen >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-12946.01.patch, HDFS-12946.02.patch, > HDFS-12946.03.patch, HDFS-12946.04.fsck.patch, HDFS-12946.05.patch, > HDFS-12946.06.patch, HDFS-12946.07.patch, HDFS-12946.08.patch, > HDFS-12946.09.patch > > > From testing we have seen setups with problematic racks / datanodes that > would not suffice basic EC usages. These are usually found out only after the > tests failed. > We should provide a way to check this beforehand. > Some scenarios: > - not enough datanodes compared to EC policy's highest data+parity number > - not enough racks to satisfy BPPRackFaultTolerant > - highly uneven racks to satisfy BPPRackFaultTolerant > - highly uneven racks (so that BPP's considerLoad logic may exclude some busy > nodes on the rack, resulting in #2) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12946) Add a tool to check rack configuration against EC policies
[ https://issues.apache.org/jira/browse/HDFS-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-12946: Attachment: HDFS-12946.09.patch > Add a tool to check rack configuration against EC policies > -- > > Key: HDFS-12946 > URL: https://issues.apache.org/jira/browse/HDFS-12946 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Reporter: Xiao Chen >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-12946.01.patch, HDFS-12946.02.patch, > HDFS-12946.03.patch, HDFS-12946.04.fsck.patch, HDFS-12946.05.patch, > HDFS-12946.06.patch, HDFS-12946.07.patch, HDFS-12946.08.patch, > HDFS-12946.09.patch > > > From testing we have seen setups with problematic racks / datanodes that > would not suffice basic EC usages. These are usually found out only after the > tests failed. > We should provide a way to check this beforehand. > Some scenarios: > - not enough datanodes compared to EC policy's highest data+parity number > - not enough racks to satisfy BPPRackFaultTolerant > - highly uneven racks to satisfy BPPRackFaultTolerant > - highly uneven racks (so that BPP's considerLoad logic may exclude some busy > nodes on the rack, resulting in #2) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14050) Use parameterized logging construct in NamenodeFsck class
[ https://issues.apache.org/jira/browse/HDFS-14050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16693002#comment-16693002 ] Kitti Nanasi commented on HDFS-14050: - Thanks [~hgadre] for the new patch! +1 (non binding) pending on the checkstyle issues. > Use parameterized logging construct in NamenodeFsck class > - > > Key: HDFS-14050 > URL: https://issues.apache.org/jira/browse/HDFS-14050 > Project: Hadoop HDFS > Issue Type: Task >Affects Versions: 3.0.0 >Reporter: Hrishikesh Gadre >Assignee: Hrishikesh Gadre >Priority: Trivial > Attachments: HDFS-14050-001.patch, HDFS-14050-002.patch, > HDFS-14050-003.patch, HDFS-14050-004.patch > > > HDFS-13695 implemented a change to use slf4j logger (instead of commons > logging). But the NamenodeFsck class is still not using parameterized logging > construct. This came up during the code review for HADOOP-11391. We should > change logging statements in NamenodeFsck to use slf4j style parameterized > logging apis. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14054) TestLeaseRecovery2: testHardLeaseRecoveryAfterNameNodeRestart2 and testHardLeaseRecoveryWithRenameAfterNameNodeRestart are flaky
[ https://issues.apache.org/jira/browse/HDFS-14054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687873#comment-16687873 ] Kitti Nanasi commented on HDFS-14054: - Thanks [~zvenczel] for the patch and [~elgoiri] for the review! The change looks good to me, too. +1 (non binding) > TestLeaseRecovery2: testHardLeaseRecoveryAfterNameNodeRestart2 and > testHardLeaseRecoveryWithRenameAfterNameNodeRestart are flaky > > > Key: HDFS-14054 > URL: https://issues.apache.org/jira/browse/HDFS-14054 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0, 3.0.3 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Labels: flaky-test > Attachments: HDFS-14054.01.patch > > > --- > T E S T S > --- > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support > was removed in 8.0 > Running org.apache.hadoop.hdfs.TestLeaseRecovery2 > Tests run: 7, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 68.971 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.TestLeaseRecovery2 > testHardLeaseRecoveryAfterNameNodeRestart2(org.apache.hadoop.hdfs.TestLeaseRecovery2) > Time elapsed: 4.375 sec <<< FAILURE! > java.lang.AssertionError: lease holder should now be the NN > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.hadoop.hdfs.TestLeaseRecovery2.checkLease(TestLeaseRecovery2.java:568) > at > org.apache.hadoop.hdfs.TestLeaseRecovery2.hardLeaseRecoveryRestartHelper(TestLeaseRecovery2.java:520) > at > org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart2(TestLeaseRecovery2.java:437) > testHardLeaseRecoveryWithRenameAfterNameNodeRestart(org.apache.hadoop.hdfs.TestLeaseRecovery2) > Time elapsed: 4.339 sec <<< FAILURE! 
> java.lang.AssertionError: lease holder should now be the NN > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.hadoop.hdfs.TestLeaseRecovery2.checkLease(TestLeaseRecovery2.java:568) > at > org.apache.hadoop.hdfs.TestLeaseRecovery2.hardLeaseRecoveryRestartHelper(TestLeaseRecovery2.java:520) > at > org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryWithRenameAfterNameNodeRestart(TestLeaseRecovery2.java:443) > Results : > Failed tests: > > TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart2:437->hardLeaseRecoveryRestartHelper:520->checkLease:568 > lease holder should now be the NN > > TestLeaseRecovery2.testHardLeaseRecoveryWithRenameAfterNameNodeRestart:443->hardLeaseRecoveryRestartHelper:520->checkLease:568 > lease holder should now be the NN > Tests run: 7, Failures: 2, Errors: 0, Skipped: 0 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14064) WEBHDFS: Support Enable/Disable EC Policy
[ https://issues.apache.org/jira/browse/HDFS-14064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685000#comment-16685000 ] Kitti Nanasi commented on HDFS-14064: - Thanks for the new patch [~ayushtkn]! Iterating through the policies in the tests could be organised into a function for better readability. +1 (non binding) pending on that > WEBHDFS: Support Enable/Disable EC Policy > - > > Key: HDFS-14064 > URL: https://issues.apache.org/jira/browse/HDFS-14064 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14064-01.patch, HDFS-14064-02.patch, > HDFS-14064-03.patch
[jira] [Commented] (HDFS-14064) WEBHDFS: Support Enable/Disable EC Policy
[ https://issues.apache.org/jira/browse/HDFS-14064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683591#comment-16683591 ] Kitti Nanasi commented on HDFS-14064: - Thanks [~ayushtkn] for working on this! The code looks good to me, I just have minor comments about the tests: - I think the IOException shouldn't be caught in the tests, because it is not expected and it will hide actual errors. - The test case should fail if the policy is not found instead of silently succeeding - I would do an assertion after the disablePolicy to make sure that we are really running the test on a disabled policy. For example if the default policy couldn't be disabled (which is not true currently), the enable policy test would succeed, but wouldn't really test anything. > WEBHDFS: Support Enable/Disable EC Policy > - > > Key: HDFS-14064 > URL: https://issues.apache.org/jira/browse/HDFS-14064 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-14064-01.patch, HDFS-14064-02.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-14060) HDFS fetchdt command to return error codes on success/failure
[ https://issues.apache.org/jira/browse/HDFS-14060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi reassigned HDFS-14060: --- Assignee: Kitti Nanasi > HDFS fetchdt command to return error codes on success/failure > - > > Key: HDFS-14060 > URL: https://issues.apache.org/jira/browse/HDFS-14060 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 3.3.0 >Reporter: Steve Loughran >Assignee: Kitti Nanasi >Priority: Major > > The {{hdfs fetchdt}} command always returns 0, even when there's been an > error (no token issued, no file to load, usage, etc). This means it's not that > useful as a command line tool for testing or in scripts. > Proposed: exit non-zero for errors; reuse LauncherExitCodes for these
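A hedged sketch of the proposal: map fetchdt outcomes to distinct exit codes instead of always returning 0. The constants here are assumptions modeled on Hadoop's LauncherExitCodes (the real class and its values may differ), and the option handling is purely illustrative.

```java
public class Main {
    // Assumed constants, modeled on Hadoop's LauncherExitCodes; not the
    // real org.apache.hadoop.service.launcher.LauncherExitCodes values.
    static final int EXIT_SUCCESS = 0;
    static final int EXIT_FAIL = -1;
    static final int EXIT_USAGE = 40;

    // Illustrative outcome mapping: usage error, missing token file, success.
    static int run(String... args) {
        if (args.length == 0) {
            System.err.println("Usage: fetchdt <opts> <token file>");
            return EXIT_USAGE;
        }
        if ("--print-missing".equals(args[0])) {
            System.err.println("No token file: " + args[args.length - 1]);
            return EXIT_FAIL;
        }
        return EXIT_SUCCESS;
    }

    public static void main(String[] args) {
        System.out.println(run());                              // usage error
        System.out.println(run("--print-missing", "tok.bin"));  // failure
        System.out.println(run("tokens.bin"));                  // success
    }
}
```

A script could then branch on `$?` instead of parsing fetchdt's output.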