[jira] [Created] (HBASE-21211) Can't Read Partitions File - Partitions File deleted
KSHITIJ GAUTAM created HBASE-21211:
--
Summary: Can't Read Partitions File - Partitions File deleted
Key: HBASE-21211
URL: https://issues.apache.org/jira/browse/HBASE-21211
Project: HBase
Issue Type: Bug
Affects Versions: 1.5.0, 1.6.0
Environment:
* HBase Version: 1.2.0-cdh5.11.1 (the line that deletes the file still exists)
* hadoop version
* Hadoop 2.6.0-cdh5.11.1
* Subversion http://github.com/cloudera/hadoop -r b581c269ca3610c603b6d7d1da0d14dfb6684aa3
* From source with checksum c6cbc4f20a8a571dd7c9f743984da1
* This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.11.1.jar
Reporter: KSHITIJ GAUTAM
Fix For: 1.5.0, 1.6.0
Attachments: 0001-do-not-delete-the-partitions-file-if-the-session-is-.patch

Hi team, we have a MapReduce job that uses the bulkload option instead of direct puts to import data, e.g.
{code:java} HFileOutputFormat2.configureIncrementalLoad(job, table, locator);{code}
However, we have been running into a situation where the partitions file is deleted when the JVM process terminates. This JVM process kicks off the MapReduce job, but it is also waiting to run `configureIncrementalLoad`, which executes configurePartitioner.
_Error: java.lang.IllegalArgumentException: Can't read partitions file at org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:116)_
We think line 827 of [HFileOutputFormat2|https://github.com/apache/hbase/blob/master/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java#L827] could be the root cause:
{code:java} fs.deleteOnExit(partitionsPath);{code}
We have created a custom HFileOutputFormat that doesn't delete the partitions file, and this has fixed the problem for our cluster. We propose creating a cleanup method that deletes the partitions file once all the mappers have finished.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
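The cleanup proposed in this report could look roughly like the sketch below. It is neither the attached patch nor HBase code. To stay self-contained it uses java.nio.file as a stand-in for Hadoop's FileSystem, and the names PartitionsFileScope, JobBody and runWithPartitions are hypothetical. The idea is to tie the lifetime of the partitions file to the job rather than to the JVM that submitted it. In a real driver this might correspond to calling FileSystem.delete(partitionsPath, false) after Job.waitForCompletion(true) returns, in place of fs.deleteOnExit(partitionsPath).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Sketch only: delete the partitions file when the job is done, not when the
 * submitting JVM exits. java.nio.file stands in for Hadoop's FileSystem
 * (an assumption made to keep the example self-contained).
 */
public final class PartitionsFileScope implements AutoCloseable {

    /** Hypothetical stand-in for "submit the job and wait for it to finish". */
    @FunctionalInterface
    public interface JobBody {
        boolean run(Path partitionsFile) throws Exception;
    }

    private final Path partitionsFile;

    public PartitionsFileScope(Path partitionsFile) {
        this.partitionsFile = partitionsFile;
    }

    public Path path() {
        return partitionsFile;
    }

    /** Explicit cleanup; a no-op if the file is already gone. */
    @Override
    public void close() throws IOException {
        Files.deleteIfExists(partitionsFile);
    }

    /**
     * Runs the job body while the partitions file is guaranteed to exist,
     * then removes the file whether the job succeeded or threw.
     */
    public static boolean runWithPartitions(Path partitionsFile, JobBody job) throws Exception {
        try (PartitionsFileScope scope = new PartitionsFileScope(partitionsFile)) {
            return job.run(scope.path());
        }
    }
}
```

With this shape the file cannot disappear while tasks may still need it, because deletion happens only after the job body returns or throws.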
[jira] [Commented] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746028#comment-13746028 ]
gautam commented on HBASE-9108:
---
By adding the option to ignore exceptions while writing, we ensure that whatever keys were written successfully during CM actions can be read. We are ensuring a 100% read guarantee. Currently LoadTestTool is pretty rigid and looks for a 100% write as well as a 100% read guarantee. There are use cases where client applications, instead of attempting the write again or going into an infinite loop, might want to handle the failure differently (for example, storing the failed keys in a list to retry once after the batch write is done). This will handle that use case. Again, we are providing this as a configuration. Tomorrow, when you want to run the same set of tests to get a 100% write guarantee as well, over, say, a stronger, better hbase version, you just need to remove the configuration.

LoadTestTool need to have a way to ignore keys which were failed during write.
---
Key: HBASE-9108
URL: https://issues.apache.org/jira/browse/HBASE-9108
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.95.0, 0.95.1, 0.94.9, 0.94.10
Reporter: gautam
Assignee: gautam
Priority: Critical
Attachments: 9108.patch._trunk.5, 9108.patch._trunk.6, HBASE-9108.patch._trunk.2, HBASE-9108.patch._trunk.3, HBASE-9108.patch._trunk.4, HBASE-9108.patch._trunk.7, HBASE-9108.patch._trunk.8
Original Estimate: 48h
Remaining Estimate: 48h

While running the chaosmonkey integration tests, it is found that writes sometimes fail when the cluster components are restarted/stopped/killed etc. The data key which was being put, using the LoadTestTool, is added to the failed key set, and at the end of the test this failed key set is checked for any entries to assert failures. While doing fail-over testing, it is expected that some of the keys may go unwritten. The point here is to validate that whatever gets into hbase on an unstable cluster really goes in, and hence reads should be 100% for whatever keys went in successfully.
Currently LoadTestTool has strict checks to validate whether every key is written. If any key is not written, it fails. I wanted to loosen this constraint by allowing users to pass in a set of exceptions they expect when doing put/write operations over hbase. If one of these expected exceptions is thrown while writing a key to hbase, the failed key is ignored, and hence won't be considered again for subsequent writes or reads.
This can be passed to the load test tool as a CSV list parameter, -allowed_write_exceptions, or through hbase-site.xml by setting a value for test.ignore.exceptions.during.write.
Here is the usage:
-allowed_write_exceptions java.io.EOFException,org.apache.hadoop.hbase.NotServingRegionException,org.apache.hadoop.hbase.client.NoServerForRegionException,org.apache.hadoop.hbase.ipc.ServerNotRunningYetException
Hence, the existing integration tests can also make use of this change by passing it as a property in hbase-site.xml.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
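The matching described in this issue (a CSV list of exception class names, and a failed key that is ignored when the write throws one of them) can be sketched as follows. This is an illustration only, not the code in the attached patches. The class name AllowedWriteExceptions and its methods are hypothetical. Comparing by exact class name and walking the cause chain are assumptions made here so that the example needs no HBase classes on the classpath.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

/**
 * Sketch only: decides whether a write failure should be ignored, based on a
 * CSV list of fully qualified exception class names such as the value passed
 * to -allowed_write_exceptions.
 */
public final class AllowedWriteExceptions {

    private final Set<String> allowedClassNames;

    /** Parses the CSV value; null, blank entries and stray spaces are tolerated. */
    public AllowedWriteExceptions(String csv) {
        this.allowedClassNames = Arrays.stream((csv == null ? "" : csv).split(","))
            .map(String::trim)
            .filter(name -> !name.isEmpty())
            .collect(Collectors.toSet());
    }

    /**
     * Returns true if the throwable, or any throwable in its cause chain, has
     * exactly one of the configured class names. Subclasses are not matched
     * (an assumption of this sketch).
     */
    public boolean isAllowed(Throwable failure) {
        for (Throwable cur = failure; cur != null; cur = cur.getCause()) {
            if (allowedClassNames.contains(cur.getClass().getName())) {
                return true;
            }
        }
        return false;
    }
}
```

A writer thread could consult isAllowed in its catch block and skip adding the key to the failed key set when it returns true. With an empty list nothing is ignored, which keeps the strict default behaviour.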
[jira] [Commented] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746084#comment-13746084 ]
gautam commented on HBASE-9108:
---
The retry logic inside HBase already does what you mention (storing failed keys and retrying). But sometimes the retry logic fails with org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException, which fails the test because the key becomes a failed key. You then need to tune your environment, which is sometimes a small cluster and the only one available. Or, as you said, you need to fine-tune CM, which you would then need to vary across cluster setups to get a better MTTR.
At other times you observe that the key has failed to write because of: java.io.EOFException, org.apache.hadoop.hbase.NotServingRegionException, org.apache.hadoop.hbase.client.NoServerForRegionException, org.apache.hadoop.hbase.ipc.ServerNotRunningYetException, org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException
A tester might want to skip the retry attempts here, skip the key, and proceed; he can configure the exceptions he wants to skip on write by passing them as configuration. Since this won't be available by default in the hbase configuration XMLs, this is a known risk he takes.
And sorry, I didn't mean to say that; I agree we already have a stronger, better hbase version. My intent was that, for future version upgrades, a tester might want to go for 100% read+write, as he might have moved to a better, bigger cluster setup with a better MTTR.
[jira] [Commented] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746556#comment-13746556 ]
gautam commented on HBASE-9108:
---
Enis, do you think this can be committed now?
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
gautam updated HBASE-9108:
--
Status: In Progress (was: Patch Available)
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
gautam updated HBASE-9108:
--
Attachment: HBASE-9108.patch._trunk.8
I was running git apply on the patch, which printed nothing, and the compilation was also fine. When I ran patch -p1 on the patch, it really applied, and I could then fix the compilation issue. Learnt patching...
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
gautam updated HBASE-9108:
--
Status: Patch Available (was: In Progress)
[jira] [Commented] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738271#comment-13738271 ]
gautam commented on HBASE-9108:
---
Can we merge this patch, before it gets overwritten with some other patch?
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
gautam updated HBASE-9108:
--
Status: Patch Available (was: In Progress)
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
gautam updated HBASE-9108:
--
Status: In Progress (was: Patch Available)
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
gautam updated HBASE-9108:
--
Attachment: HBASE-9108.patch._trunk.7
Looks like things have changed a bit. This is with the latest changes.
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Attachment: 9108.patch._trunk.6 Here is with the isEmpty() added. LoadTestTool need to have a way to ignore keys which were failed during write. --- Key: HBASE-9108 URL: https://issues.apache.org/jira/browse/HBASE-9108 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.95.0, 0.95.1, 0.94.9, 0.94.10 Reporter: gautam Assignee: gautam Priority: Critical Attachments: 9108.patch._trunk.5, 9108.patch._trunk.6, HBASE-9108.patch._trunk.2, HBASE-9108.patch._trunk.3, HBASE-9108.patch._trunk.4 Original Estimate: 48h Remaining Estimate: 48h While running the chaosmonkey integration tests, it is found that write sometimes fails when the cluster components are restarted/stopped/killed etc.. The data key which was being put, using the LoadTestTool, is added to the failed key set, and at the end of the test, this failed key set is checked for any entries to assert failures. While doing fail-over testing, it is expected that some of the keys may go un-written. The point here is to validate that whatever gets into hbase for an unstable cluster really goes in, and hence read should be 100% for whatever keys went in successfully. Currently LoadTestTool has strict checks to validate every key being written or not. In case any keys is not written, it fails. I wanted to loosen this constraint by allowing users to pass in a set of exceptions they expect when doing put/write operations over hbase. If one of these expected exception set is thrown while writing key to hbase, the failed key would be ignored, and hence wont even be considered again for subsequent write as well as read. 
This set can be passed to the load test tool as a CSV list via the -allowed_write_exceptions parameter, or through hbase-site.xml by setting a value for test.ignore.exceptions.during.write. Here is the usage: -allowed_write_exceptions java.io.EOFException,org.apache.hadoop.hbase.NotServingRegionException,org.apache.hadoop.hbase.client.NoServerForRegionException,org.apache.hadoop.hbase.ipc.ServerNotRunningYetException This way the existing integration tests can also make use of this change by setting the property in hbase-site.xml.
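The behaviour described in this issue can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the code from the attached patches: the class name AllowedWriteExceptions and its methods are hypothetical, and matching against the cause chain of the thrown exception is an assumption about how a wrapped exception might be handled. It parses the CSV list of exception class names and records a key as failed only when the thrown exception is not in the allowed set.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper for illustration; not part of LoadTestTool or the patch. */
class AllowedWriteExceptions {
    private final Set<String> allowedClassNames;

    /** Parses a comma-separated list of fully qualified exception class names. */
    AllowedWriteExceptions(String csv) {
        Set<String> names = new HashSet<>();
        if (csv != null) {
            for (String name : csv.split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty()) {   // skip blank entries such as "a,,b"
                    names.add(trimmed);
                }
            }
        }
        this.allowedClassNames = Collections.unmodifiableSet(names);
    }

    /** True when no exceptions are tolerated, i.e. the strict behaviour applies. */
    boolean isEmpty() {
        return allowedClassNames.isEmpty();
    }

    /** Assumption: an exception is tolerated if it, or any of its causes, is listed. */
    boolean isAllowed(Throwable thrown) {
        for (Throwable t = thrown; t != null; t = t.getCause()) {
            if (allowedClassNames.contains(t.getClass().getName())) {
                return true;
            }
        }
        return false;
    }

    /**
     * Adds the key to failedKeys unless the exception is tolerated.
     * Returns true if the key was recorded as a failure.
     */
    boolean recordFailure(long key, Throwable thrown, Set<Long> failedKeys) {
        if (isAllowed(thrown)) {
            return false;   // ignored: not counted as failed, not verified on read
        }
        failedKeys.add(key);
        return true;
    }
}
```

With an empty list the helper tolerates nothing, so every failed write is still recorded, which corresponds to the strict check the issue describes as the current behaviour.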
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Status: In Progress (was: Patch Available) Attachments: 9108.patch._trunk.5, 9108.patch._trunk.6, HBASE-9108.patch._trunk.2, HBASE-9108.patch._trunk.3, HBASE-9108.patch._trunk.4
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Status: Patch Available (was: In Progress) Attachments: 9108.patch._trunk.5, 9108.patch._trunk.6, HBASE-9108.patch._trunk.2, HBASE-9108.patch._trunk.3, HBASE-9108.patch._trunk.4
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Attachment: HBASE-9108.patch._trunk.2 HBASE-9108.patch._0.94.2 Except for changing to package-private, the rest of the changes are done. I don't think there is a need to mark it package-private. Attachments: HBASE-9108.patch._0.94, HBASE-9108.patch._0.94.2, HBASE-9108.patch._trunk, HBASE-9108.patch._trunk.2
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Status: Patch Available (was: In Progress) Attachments: HBASE-9108.patch._trunk.2
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Attachment: (was: HBASE-9108.patch._0.94) Attachments: HBASE-9108.patch._trunk.2
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Attachment: (was: HBASE-9108.patch._0.94.2) Attachments: HBASE-9108.patch._trunk.2
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Status: In Progress (was: Patch Available) Attachments: HBASE-9108.patch._trunk.2
[jira] [Commented] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731866#comment-13731866 ] gautam commented on HBASE-9108: --- Yes, I removed the others. Attachments: HBASE-9108.patch._trunk.2
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Attachment: (was: HBASE-9108.patch._trunk) Attachments: HBASE-9108.patch._trunk.2
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Status: In Progress (was: Patch Available) Attachments: HBASE-9108.patch._trunk.2, HBASE-9108.patch._trunk.3
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Attachment: HBASE-9108.patch._trunk.3 Here I go. It's the same as trunk.2. Attachments: HBASE-9108.patch._trunk.2, HBASE-9108.patch._trunk.3
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Status: Patch Available (was: In Progress) Attachments: HBASE-9108.patch._trunk.2, HBASE-9108.patch._trunk.3
[jira] [Commented] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731966#comment-13731966 ] gautam commented on HBASE-9108: --- I don't understand why the patch is not being applied. Attachments: HBASE-9108.patch._trunk.2, HBASE-9108.patch._trunk.3
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Status: In Progress (was: Patch Available) LoadTestTool need to have a way to ignore keys which were failed during write. --- Key: HBASE-9108 URL: https://issues.apache.org/jira/browse/HBASE-9108 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.10, 0.94.9, 0.95.1, 0.95.0 Reporter: gautam Assignee: gautam Priority: Critical Attachments: HBASE-9108.patch._trunk.2, HBASE-9108.patch._trunk.3, HBASE-9108.patch._trunk.4 Original Estimate: 48h Remaining Estimate: 48h
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Attachment: HBASE-9108.patch._trunk.4 How about this? LoadTestTool need to have a way to ignore keys which were failed during write. --- Key: HBASE-9108 URL: https://issues.apache.org/jira/browse/HBASE-9108 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.95.0, 0.95.1, 0.94.9, 0.94.10 Reporter: gautam Assignee: gautam Priority: Critical Attachments: HBASE-9108.patch._trunk.2, HBASE-9108.patch._trunk.3, HBASE-9108.patch._trunk.4 Original Estimate: 48h Remaining Estimate: 48h
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Status: Patch Available (was: In Progress) LoadTestTool need to have a way to ignore keys which were failed during write. --- Key: HBASE-9108 URL: https://issues.apache.org/jira/browse/HBASE-9108 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.10, 0.94.9, 0.95.1, 0.95.0 Reporter: gautam Assignee: gautam Priority: Critical Attachments: HBASE-9108.patch._trunk.2, HBASE-9108.patch._trunk.3, HBASE-9108.patch._trunk.4 Original Estimate: 48h Remaining Estimate: 48h
[jira] [Commented] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13733071#comment-13733071 ] gautam commented on HBASE-9108: --- Thanks [~ted_yu], for fixing the patch. LoadTestTool need to have a way to ignore keys which were failed during write. --- Key: HBASE-9108 URL: https://issues.apache.org/jira/browse/HBASE-9108 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.95.0, 0.95.1, 0.94.9, 0.94.10 Reporter: gautam Assignee: gautam Priority: Critical Attachments: 9108.patch._trunk.5, HBASE-9108.patch._trunk.2, HBASE-9108.patch._trunk.3, HBASE-9108.patch._trunk.4 Original Estimate: 48h Remaining Estimate: 48h
[jira] [Commented] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729300#comment-13729300 ] gautam commented on HBASE-9108: --- The sanity tests have gone fine here. Can we expect this to be checked in into 94.x mainline soon? LoadTestTool need to have a way to ignore keys which were failed during write. --- Key: HBASE-9108 URL: https://issues.apache.org/jira/browse/HBASE-9108 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.95.0, 0.95.1, 0.94.9, 0.94.10 Reporter: gautam Assignee: gautam Priority: Critical Attachments: HBASE-9108.patch._0.94, HBASE-9108.patch._trunk Original Estimate: 48h Remaining Estimate: 48h
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Attachment: HBASE-9108.patch._trunk Here is the patch file for trunk/0.95. LoadTestTool need to have a way to ignore keys which were failed during write. --- Key: HBASE-9108 URL: https://issues.apache.org/jira/browse/HBASE-9108 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.95.0, 0.95.1, 0.94.9, 0.94.10 Reporter: gautam Assignee: gautam Priority: Critical Attachments: HBASE-9108.patch._0.94, HBASE-9108.patch._trunk Original Estimate: 48h Remaining Estimate: 48h
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Status: Patch Available (was: Open) LoadTestTool need to have a way to ignore keys which were failed during write. --- Key: HBASE-9108 URL: https://issues.apache.org/jira/browse/HBASE-9108 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.10, 0.94.9, 0.95.1, 0.95.0 Reporter: gautam Assignee: gautam Priority: Critical Attachments: HBASE-9108.patch._0.94, HBASE-9108.patch._trunk Original Estimate: 48h Remaining Estimate: 48h
[jira] [Commented] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726297#comment-13726297 ] gautam commented on HBASE-9108: --- I will shortly update the change I plan to do. LoadTestTool need to have a way to ignore keys which were failed during write. --- Key: HBASE-9108 URL: https://issues.apache.org/jira/browse/HBASE-9108 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.95.0, 0.95.1, 0.94.9, 0.94.10 Reporter: gautam Assignee: gautam Priority: Critical Original Estimate: 48h Remaining Estimate: 48h
[jira] [Created] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
gautam created HBASE-9108: - Summary: LoadTestTool need to have a way to ignore keys which were failed during write. Key: HBASE-9108 URL: https://issues.apache.org/jira/browse/HBASE-9108 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.10, 0.94.9, 0.95.1, 0.95.0 Reporter: gautam Assignee: gautam Priority: Critical While running the chaosmonkey integration tests, we found that writes sometimes fail when cluster components are restarted, stopped, or killed. The data key that was being put using the LoadTestTool is added to the failed key set, and at the end of the test this set is checked for entries in order to assert failures. During fail-over testing, some keys are expected to go unwritten. The point here is to validate that whatever is written to HBase on an unstable cluster really is stored, so reads should succeed 100% for the keys that were written successfully. Currently LoadTestTool strictly checks that every key was written; if any key is not written, the test fails. I want to loosen this constraint by allowing users to pass in a set of exceptions they expect during put/write operations on HBase. If one of these expected exceptions is thrown while writing a key, the failed key is ignored and is not considered again for subsequent writes or reads. The set can be passed to the load test tool as a CSV list parameter, -allowed_write_exceptions, or through hbase-site.xml by setting a value for test.ignore.exceptions.during.write. Here is the usage: -allowed_write_exceptions java.io.EOFException,org.apache.hadoop.hbase.NotServingRegionException,org.apache.hadoop.hbase.client.NoServerForRegionException,org.apache.hadoop.hbase.ipc.ServerNotRunningYetException This way, the existing integration tests can also make use of this change by setting the property in hbase-site.xml.
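As a rough illustration of the proposal above, the comma-separated value could be parsed into a set of class names and consulted whenever a write fails. This is only a sketch under assumptions: AllowedWriteExceptions is a hypothetical helper, not a class from the attached patches, and both the exact-class-name matching and the walk over the cause chain are choices made for this example rather than behaviour confirmed by the issue.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper for illustration; not taken from the HBASE-9108 patches. */
final class AllowedWriteExceptions {
    private final Set<String> allowedClassNames;

    private AllowedWriteExceptions(Set<String> allowedClassNames) {
        this.allowedClassNames = allowedClassNames;
    }

    /** Parses a comma-separated list of fully qualified exception class names. */
    static AllowedWriteExceptions parse(String csv) {
        Set<String> names = new LinkedHashSet<>();
        if (csv != null) {
            for (String part : csv.split(",")) {
                String name = part.trim();
                if (!name.isEmpty()) {
                    names.add(name);
                }
            }
        }
        return new AllowedWriteExceptions(names);
    }

    /**
     * Returns true if the throwable, or any exception in its cause chain,
     * has a class name that appears in the allowed list (exact match only;
     * subclasses are not considered in this sketch).
     */
    boolean isAllowed(Throwable thrown) {
        for (Throwable cur = thrown; cur != null; cur = cur.getCause()) {
            if (allowedClassNames.contains(cur.getClass().getName())) {
                return true;
            }
        }
        return false;
    }

    /** Number of distinct class names that were parsed. */
    int size() {
        return allowedClassNames.size();
    }
}
```

A writer thread could then call isAllowed(e) in its catch block and skip adding the key to the failed key set when it returns true, so that the key is excluded from later writes and reads.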
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Attachment: HBASE-9108.patch._0.94 Here is the change for 0.94 branch. LoadTestTool need to have a way to ignore keys which were failed during write. --- Key: HBASE-9108 URL: https://issues.apache.org/jira/browse/HBASE-9108 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.95.0, 0.95.1, 0.94.9, 0.94.10 Reporter: gautam Assignee: gautam Priority: Critical Attachments: HBASE-9108.patch._0.94 Original Estimate: 48h Remaining Estimate: 48h
[jira] [Updated] (HBASE-9085) Integration Tests fails because of bug in teardown phase where the cluster state is not being restored properly.
[ https://issues.apache.org/jira/browse/HBASE-9085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9085: -- Issue Type: Bug (was: Test) Integration Tests fails because of bug in teardown phase where the cluster state is not being restored properly. Key: HBASE-9085 URL: https://issues.apache.org/jira/browse/HBASE-9085 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.95.0, 0.94.9, 0.94.10 Reporter: gautam Assignee: gautam Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-9085.patch._0.94, HBASE-9085.patch._0.95_or_trunk I was running the following test over a distributed cluster: bin/hbase org.apache.hadoop.hbase.IntegrationTestsDriver IntegrationTestDataIngestSlowDeterministic IntegrationTestingUtility.restoreCluster() is called in the teardown phase of the test. For a distributed cluster, it ends up calling DistributedHBaseCluster.restoreClusterStatus, which restores the cluster to its original state. The restore steps performed there do not handle one specific case: the initial HBase Master is currently down, and the current HBase Master is different from the initial one. You get into this flow: //check whether current master has changed if (!ServerName.isSameHostnameAndPort(initial.getMaster(), current.getMaster())) { . } In this code path, the current backup masters are stopped, and the current active master is also stopped. At this point, in the case described above, no HBase Master is available, so any subsequent operation on the cluster fails, resulting in a test failure.
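The failure mode described above comes down to ordering: if every running master is stopped before the initial one is back up, the cluster is left with no master. A minimal sketch of one possible ordering follows. It is not the attached patch; MasterRestoreSketch, ClusterOps and the string server names are all hypothetical stand-ins, and no HBase classes are used.

```java
import java.util.List;

/** Hypothetical model of the restore ordering; none of these types are HBase classes. */
final class MasterRestoreSketch {

    /** Minimal stand-in for the cluster operations a teardown would need. */
    interface ClusterOps {
        boolean isRunning(String master);
        void start(String master);
        void stop(String master);
    }

    /**
     * Restores the initial master without leaving a window in which no
     * master process exists: the initial master is started first (if it
     * is down), and only then are the other masters stopped.
     */
    static void restoreMaster(String initial, String current,
                              List<String> backups, ClusterOps ops) {
        if (initial.equals(current)) {
            return; // active master is unchanged, nothing to restore
        }
        if (!ops.isRunning(initial)) {
            ops.start(initial); // bring the initial master up before stopping anything
        }
        for (String backup : backups) {
            if (!backup.equals(initial)) {
                ops.stop(backup);
            }
        }
        ops.stop(current); // the initial master is now the only candidate left
    }
}
```

With this ordering, the case called out in the description (initial master down, a different master active) ends with the initial master running rather than with no master at all.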
[jira] [Commented] (HBASE-9085) Integration Tests fails because of bug in teardown phase where the cluster state is not being restored properly.
[ https://issues.apache.org/jira/browse/HBASE-9085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725938#comment-13725938 ] gautam commented on HBASE-9085: --- Thanks Enis. Yes, I tested this on my setup. The restore worked properly in all cases. You can commit this. Integration Tests fails because of bug in teardown phase where the cluster state is not being restored properly. Key: HBASE-9085 URL: https://issues.apache.org/jira/browse/HBASE-9085 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.95.0, 0.94.9, 0.94.10 Reporter: gautam Assignee: gautam Fix For: 0.98.0, 0.95.2, 0.94.11 Attachments: HBASE-9085.patch._0.94, HBASE-9085.patch._0.95_or_trunk
[jira] [Created] (HBASE-9085) Integration Tests fails because of bug in teardown phase where the cluster state is not being restored properly.
gautam created HBASE-9085: - Summary: Integration Tests fails because of bug in teardown phase where the cluster state is not being restored properly. Key: HBASE-9085 URL: https://issues.apache.org/jira/browse/HBASE-9085 Project: HBase Issue Type: Test Components: test Affects Versions: 0.95.2 Reporter: gautam Assignee: gautam Fix For: 0.98.0, 0.95.2, 0.94.10 Let me split this requirement into 2 parts: i) ChaosMonkey I was trying to add more tests around new actions and policies by leveraging the existing classes nested inside ChaosMonkey. But it turned out that some of the classes cannot be used from outside unless we make them visible. Here is an example: I cannot extend ChaosMonkey.Action, as the init(ActionContext context) method has package-private visibility. There are other places as well that make it impossible for anyone to build on top of this hierarchy. ii) LoadTestTool I wanted to extend this tool to define pass/fail criteria based on the percentage of reads/writes that failed, rather than comparing against an absolute 0. For that, this beautiful class should make some of its properties usable by its subclasses by marking them protected. I wanted to get unblocked here first. Once this gets fixed, I think I can take up a JIRA item to refactor these tools, if required.
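The percentage-based criterion mentioned in part ii) could look roughly like the following. FailureThreshold and its parameter names are invented for this sketch; nothing here is an existing LoadTestTool API, and the handling of an empty run is an assumption.

```java
/** Hypothetical pass/fail check for illustration; not part of LoadTestTool. */
final class FailureThreshold {

    /**
     * Returns true when the share of failed operations does not exceed
     * maxFailedPercent. A threshold of 0.0 reproduces the strict check
     * that any failure fails the run.
     */
    static boolean passes(long failed, long total, double maxFailedPercent) {
        if (total <= 0) {
            // Assumption: with no operations attempted, only zero failures can pass.
            return failed == 0;
        }
        double failedPercent = 100.0 * failed / total;
        return failedPercent <= maxFailedPercent;
    }
}
```

For example, passes(5, 100, 5.0) accepts a run in which 5 of 100 writes failed, while passes(6, 100, 5.0) rejects it.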
[jira] [Updated] (HBASE-9085) Integration Tests fails because of bug in teardown phase where the cluster state is not being restored properly.
[ https://issues.apache.org/jira/browse/HBASE-9085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gautam updated HBASE-9085:
--
Description:
I was running the following test over a distributed cluster:

bin/hbase org.apache.hadoop.hbase.IntegrationTestsDriver IntegrationTestDataIngestSlowDeterministic

IntegrationTestingUtility.restoreCluster() is called in the teardown phase of the test. For a distributed cluster, it ends up calling DistributedHBaseCluster.restoreClusterStatus, which restores the cluster to its original state.

The restore steps do not handle one specific case: the initial HBase Master is currently down, and the current HBase Master is different from the initial one. You then get into this flow:

//check whether current master has changed
if (!ServerName.isSameHostnameAndPort(initial.getMaster(), current.getMaster())) { . }

In this code path, the current backup masters are stopped, and the current active master is also stopped. At this point, in the case described above, no HBase Master is available, so any subsequent operation on the cluster fails, resulting in a test failure.

was:
Let me split this requirement into 2 parts:

i) ChaosMonkey
I was trying to add more tests around new actions and policies by leveraging the existing classes nested inside ChaosMonkey. But it turned out that some of the classes cannot be used outside unless we make them visible to the world. Here is an example: I cannot extend ChaosMonkey.Action, as the init(ActionContext context) method has package-wide visibility. There are other places as well that make it impossible for anyone to extend this hierarchy.

ii) LoadTestTool
I wanted to extend this tool to define failure/pass criteria based on the % of reads/writes that failed, rather than comparing against absolute 0. For that, this beautiful class should make some of its properties usable by its children by marking them protected.

I wanted to get unblocked here first. Once this gets fixed, I think I can take up a JIRA item to refactor these tools, if required.
[jira] [Updated] (HBASE-9085) Integration Tests fails because of bug in teardown phase where the cluster state is not being restored properly.
[ https://issues.apache.org/jira/browse/HBASE-9085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9085: -- Affects Version/s: (was: 0.95.2) 0.95.0 0.94.9 0.94.10
[jira] [Commented] (HBASE-9085) Integration Tests fails because of bug in teardown phase where the cluster state is not being restored properly.
[ https://issues.apache.org/jira/browse/HBASE-9085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723644#comment-13723644 ] gautam commented on HBASE-9085: --- I will put a git patch with the fix soon.
[jira] [Updated] (HBASE-9085) Integration Tests fails because of bug in teardown phase where the cluster state is not being restored properly.
[ https://issues.apache.org/jira/browse/HBASE-9085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9085: -- Attachment: HBASE-9085.patch._0.94 HBASE-9085.patch._0.95_or_trunk Find the fix attached here. This will ensure we never get into a complete cluster shutdown if the HBase Master that was initially active is down before the restore.
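A minimal sketch of a restore ordering that avoids the full shutdown described in HBASE-9085. It does not reproduce the attached patches; it only illustrates one way to get the stated guarantee, namely starting the initial master before stopping any other master. The class `MasterRestoreSketch`, the method `planRestore`, and the plain-string host names are hypothetical and are not part of the HBase API. Backup masters that were part of the initial state are ignored for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of the master-restore ordering; not the HBase API. */
public class MasterRestoreSketch {

    /**
     * Returns the ordered actions that bring the running masters back to
     * "only the initial master is running" without ever leaving the
     * cluster with zero masters.
     */
    public static List<String> planRestore(String initialMaster, Set<String> runningMasters) {
        List<String> actions = new ArrayList<>();
        Set<String> running = new LinkedHashSet<>(runningMasters);

        // Start the initial master first if it is down, so that stopping
        // the others below can never leave the cluster without a master.
        if (!running.contains(initialMaster)) {
            actions.add("start " + initialMaster);
            running.add(initialMaster);
        }

        // Only then stop every other master, whether active or backup.
        for (String master : new ArrayList<>(running)) {
            if (!master.equals(initialMaster)) {
                actions.add("stop " + master);
                running.remove(master);
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        // The initial master (host1) is down; host2 is active, host3 is a backup.
        Set<String> running = new LinkedHashSet<>(Arrays.asList("host2:60000", "host3:60000"));
        System.out.println(planRestore("host1:60000", running));
    }
}
```

In the failing scenario from the issue, the stop actions ran while the initial master was still down. Putting the start action first is what keeps at least one master available throughout.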
[jira] [Updated] (HBASE-8928) Make ChaosMonkey LoadTest tools extensible, to allow addition of more actions and policies.
[ https://issues.apache.org/jira/browse/HBASE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-8928: -- Attachment: (was: HBASE-8928-trunk.patch)
[jira] [Updated] (HBASE-8928) Make ChaosMonkey LoadTest tools extensible, to allow addition of more actions and policies.
[ https://issues.apache.org/jira/browse/HBASE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-8928: -- Attachment: HBASE-8928-trunk.patch Here is the patch for trunk. Yes, it will work both for 0.95 and trunk.
[jira] [Updated] (HBASE-8928) Make ChaosMonkey LoadTest tools extensible, to allow addition of more actions and policies.
[ https://issues.apache.org/jira/browse/HBASE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-8928: -- Attachment: HBASE-8928-0.94.patch Here's the new patch for 0.94. It seems some of the changes this request required went in as part of HBASE-8908, on July 15. The remaining changes are part of this patch.
[jira] [Updated] (HBASE-8928) Make ChaosMonkey LoadTest tools extensible, to allow addition of more actions and policies.
[ https://issues.apache.org/jira/browse/HBASE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-8928: -- Attachment: (was: HBASE-8928-0.94.patch)
[jira] [Commented] (HBASE-8928) Make ChaosMonkey LoadTest tools extensible, to allow addition of more actions and policies.
[ https://issues.apache.org/jira/browse/HBASE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708314#comment-13708314 ] gautam commented on HBASE-8928: --- Here is the pull request. Again, to avoid any confusion: https://github.com/apache/hbase/pull/5 is over the trunk branch, and https://github.com/apache/hbase/pull/4 is over the 0.94 branch. Elliott, I think you merged the 0.94 branch changes over trunk, hence it didn't work.
[jira] [Updated] (HBASE-8928) Make ChaosMonkey LoadTest tools extensible, to allow addition of more actions and policies.
[ https://issues.apache.org/jira/browse/HBASE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-8928: -- Attachment: HBASE-8928-trunk.patch HBASE-8928-0.94.patch Git patch files: HBASE-8928-0.94.patch and HBASE-8928-trunk.patch, to apply to the 0.94 and trunk branches respectively.
[jira] [Created] (HBASE-8928) Make ChaosMonkey LoadTest tools extensible, to allow addition of more actions and policies.
gautam created HBASE-8928:
-
Summary: Make ChaosMonkey LoadTest tools extensible, to allow addition of more actions and policies.
Key: HBASE-8928
URL: https://issues.apache.org/jira/browse/HBASE-8928
Project: HBase
Issue Type: Bug
Components: test
Affects Versions: 0.95.2
Reporter: gautam
Assignee: Enis Soztutar

Integration and general system tests have been discussed previously, and the conclusion is that we need to unify how we do release candidate testing (HBASE-6091). In this issue, I would like to discuss and agree on a general plan, and open subtickets for execution so that we can carry out most of the tests in HBASE-6091 automatically.

Initially, here is what I have in mind:

1. Create hbase-it (or hbase-tests) containing a forward port of HBASE-4454 (without any tests). This will allow integration tests to be run with
{code}
mvn verify
{code}

2. Add the ability to run all integration/system tests on a given cluster. Something like:
{code}
mvn verify -Dconf=/etc/hbase/conf/
{code}
should run the test suite on the given cluster. (Right now we can launch some of the tests (TestAcidGuarantees) from the command line.) Most of the system tests will be client side, and will interface with the cluster through public APIs. We need a tool on top of MiniHBaseCluster, or to improve HBaseTestingUtility, so that tests can interface with the mini cluster or the actual cluster uniformly.

3. Port candidate unit tests to the integration tests module. Some of the candidates are:
- TestAcidGuarantees / TestAtomicOperation
- TestRegionBalancing (HBASE-6053)
- TestFullLogReconstruction
- TestMasterFailover
- TestImportExport
- TestMultiVersions / TestKeepDeletes
- TestFromClientSide
- TestShell and src/test/ruby
- TestRollingRestart
- Test**OnCluster
- Balancer tests
These tests should continue to be run as unit tests w/o any change in semantics. However, given an actual cluster, they should use that instead of spinning up a mini cluster.

4. Add more tests, especially long-running ingestion tests (goraci, BigTop's TestLoadAndVerify, LoadTestTool), and chaos monkey style fault tests.

All suggestions welcome.
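The requirement in point 2, that tests talk to either the mini cluster or an actual cluster through one interface, can be sketched as follows. This is an illustration under stated assumptions, not the design that HBase adopted: `ClusterSelectionSketch`, `TestCluster`, `MiniCluster`, `RealCluster` and `select` are hypothetical names. The only detail taken from the message above is that a `conf` property pointing at a configuration directory selects the real cluster.

```java
/** Hypothetical sketch: one test-facing interface, two cluster back ends. */
public class ClusterSelectionSketch {

    /** What a test would program against, regardless of the back end. */
    public interface TestCluster {
        String name();
    }

    /** Stand-in for an in-process mini cluster. */
    public static final class MiniCluster implements TestCluster {
        @Override
        public String name() {
            return "mini";
        }
    }

    /** Stand-in for an already deployed cluster described by a conf directory. */
    public static final class RealCluster implements TestCluster {
        private final String confDir;

        public RealCluster(String confDir) {
            this.confDir = confDir;
        }

        @Override
        public String name() {
            return "distributed:" + confDir;
        }
    }

    /** Uses the real cluster when a conf directory is given, otherwise the mini cluster. */
    public static TestCluster select(String confDir) {
        if (confDir == null || confDir.isEmpty()) {
            return new MiniCluster();
        }
        return new RealCluster(confDir);
    }

    public static void main(String[] args) {
        // Mirrors "mvn verify -Dconf=/etc/hbase/conf/": the property picks the back end.
        System.out.println(select(System.getProperty("conf")).name());
    }
}
```

A test written against `TestCluster` stays unchanged when the back end is switched, which is the property point 3 asks for when it says the ported tests should keep their semantics.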
[jira] [Updated] (HBASE-8928) Make ChaosMonkey LoadTest tools extensible, to allow addition of more actions and policies.
[ https://issues.apache.org/jira/browse/HBASE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gautam updated HBASE-8928:
--
Description:
Let me split this requirement into 2 parts:

i) ChaosMonkey
I was trying to add more tests around new actions and policies by leveraging the existing classes nested inside ChaosMonkey. But it turned out that some of the classes cannot be used outside unless we make them visible to the world. Here is an example: I cannot extend ChaosMonkey.Action, as the init(ActionContext context) method has package-wide visibility. There are other places as well that make it impossible for anyone to extend this hierarchy.

ii) LoadTestTool
I wanted to extend this tool to define failure/pass criteria based on the % of reads/writes that failed, rather than comparing against absolute 0. For that, this beautiful class should make some of its properties usable by its children by marking them protected.

I wanted to get unblocked here first. Once this gets fixed, I think I can take up a JIRA item to refactor these tools.

was:
Integration and general system tests have been discussed previously, and the conclusion is that we need to unify how we do release candidate testing (HBASE-6091). In this issue, I would like to discuss and agree on a general plan, and open subtickets for execution so that we can carry out most of the tests in HBASE-6091 automatically.

Initially, here is what I have in mind:

1. Create hbase-it (or hbase-tests) containing a forward port of HBASE-4454 (without any tests). This will allow integration tests to be run with
{code}
mvn verify
{code}

2. Add the ability to run all integration/system tests on a given cluster. Something like:
{code}
mvn verify -Dconf=/etc/hbase/conf/
{code}
should run the test suite on the given cluster. (Right now we can launch some of the tests (TestAcidGuarantees) from the command line.) Most of the system tests will be client side, and will interface with the cluster through public APIs. We need a tool on top of MiniHBaseCluster, or to improve HBaseTestingUtility, so that tests can interface with the mini cluster or the actual cluster uniformly.

3. Port candidate unit tests to the integration tests module. Some of the candidates are:
- TestAcidGuarantees / TestAtomicOperation
- TestRegionBalancing (HBASE-6053)
- TestFullLogReconstruction
- TestMasterFailover
- TestImportExport
- TestMultiVersions / TestKeepDeletes
- TestFromClientSide
- TestShell and src/test/ruby
- TestRollingRestart
- Test**OnCluster
- Balancer tests
These tests should continue to be run as unit tests w/o any change in semantics. However, given an actual cluster, they should use that instead of spinning up a mini cluster.

4. Add more tests, especially long-running ingestion tests (goraci, BigTop's TestLoadAndVerify, LoadTestTool), and chaos monkey style fault tests.

All suggestions welcome.
[jira] [Updated] (HBASE-8928) Make ChaosMonkey LoadTest tools extensible, to allow addition of more actions and policies.
[ https://issues.apache.org/jira/browse/HBASE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gautam updated HBASE-8928:
--
Description:
Let me split this requirement into 2 parts:

i) ChaosMonkey
I was trying to add more tests around new actions and policies by leveraging the existing classes nested inside ChaosMonkey. But it turned out that some of the classes cannot be used outside unless we make them visible to the world. Here is an example: I cannot extend ChaosMonkey.Action, as the init(ActionContext context) method has package-wide visibility. There are other places as well that make it impossible for anyone to extend this hierarchy.

ii) LoadTestTool
I wanted to extend this tool to define failure/pass criteria based on the % of reads/writes that failed, rather than comparing against absolute 0. For that, this beautiful class should make some of its properties usable by its children by marking them protected.

I wanted to get unblocked here first. Once this gets fixed, I think I can take up a JIRA item to refactor these tools, if required.

was:
Let me split this requirement into 2 parts:

i) ChaosMonkey
I was trying to add more tests around new actions and policies by leveraging the existing classes nested inside ChaosMonkey. But it turned out that some of the classes cannot be used outside unless we make them visible to the world. Here is an example: I cannot extend ChaosMonkey.Action, as the init(ActionContext context) method has package-wide visibility. There are other places as well that make it impossible for anyone to extend this hierarchy.

ii) LoadTestTool
I wanted to extend this tool to define failure/pass criteria based on the % of reads/writes that failed, rather than comparing against absolute 0. For that, this beautiful class should make some of its properties usable by its children by marking them protected.

I wanted to get unblocked here first. Once this gets fixed, I think I can take up a JIRA item to refactor these tools.
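The visibility problem from part i) can be illustrated with a small Java sketch. The types below are hypothetical stand-ins, not the real ChaosMonkey classes, and because everything lives in one file the cross-package restriction is only described in comments. The point shown is that declaring `init` as `protected` instead of package-private lets a subclass override it and call it.

```java
/** Hypothetical stand-ins for the nested ChaosMonkey types; not the HBase classes. */
public class ActionVisibilitySketch {

    /** Stand-in for the context handed to every action. */
    public static class ActionContext {
        private final String clusterName;

        public ActionContext(String clusterName) {
            this.clusterName = clusterName;
        }

        public String getClusterName() {
            return clusterName;
        }
    }

    /** Base action whose init method is protected rather than package-private. */
    public abstract static class Action {
        private ActionContext context;

        // With no access modifier (package-private), a subclass declared in a
        // different package could neither override nor call this method.
        protected void init(ActionContext context) {
            this.context = context;
        }

        protected ActionContext getContext() {
            return context;
        }

        public abstract String perform();
    }

    /** A user-defined action of the kind the issue wants to make possible. */
    public static class DescribeCluster extends Action {
        @Override
        protected void init(ActionContext context) {
            super.init(context); // allowed because init is protected
        }

        @Override
        public String perform() {
            return "acting on " + getContext().getClusterName();
        }
    }

    /** Wires a context into the custom action and runs it. */
    public static String run(String clusterName) {
        DescribeCluster action = new DescribeCluster();
        action.init(new ActionContext(clusterName));
        return action.perform();
    }

    public static void main(String[] args) {
        System.out.println(run("test-cluster"));
    }
}
```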
[jira] [Commented] (HBASE-8928) Make ChaosMonkey LoadTest tools extensible, to allow addition of more actions and policies.
[ https://issues.apache.org/jira/browse/HBASE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705623#comment-13705623 ] gautam commented on HBASE-8928: --- Yes, for CM I will post the changes for review. For policies, I was trying to add a periodic sequential policy, which I can't see at the moment. For LoadTestTool, I wanted to relax the verification criteria, as I have observed that the test fails even if only 1 write has failed out of, say, 1 million write operations. I wanted to move it to a percentage-based criterion for the new actions where I will be using this framework.
[jira] [Commented] (HBASE-8928) Make ChaosMonkey LoadTest tools extensible, to allow addition of more actions and policies.
[ https://issues.apache.org/jira/browse/HBASE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705626#comment-13705626 ] gautam commented on HBASE-8928: --- Again, I won't change the LoadTest tool per se; I will add another tool, say LoadTestWithTolerance, that extends it. The only change I will make is to mark the properties protected so that my tool can use them.
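The tolerance idea described in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: SimpleLoadTest is a hypothetical stand-in, not the real LoadTestTool, and the counter names, the record() method, and the threshold parameter are all invented for the example. It shows why the counters would need to be protected: the subclass reads them to replace the strict "zero failures" check with a percentage check.

```java
// Hypothetical stand-in for a load-test tool; this is NOT the HBase LoadTestTool API.
class SimpleLoadTest {
    // Protected rather than private, so that a subclass can read the counters.
    protected long totalOps;
    protected long failedOps;

    // Record the outcome of one read or write operation.
    void record(boolean succeeded) {
        totalOps++;
        if (!succeeded) {
            failedOps++;
        }
    }

    // Strict criterion: a single failed operation fails the whole run.
    boolean passed() {
        return failedOps == 0;
    }
}

// Hypothetical subclass that tolerates a bounded percentage of failed operations.
class LoadTestWithTolerance extends SimpleLoadTest {
    private final double maxFailurePercent;

    LoadTestWithTolerance(double maxFailurePercent) {
        if (maxFailurePercent < 0.0 || maxFailurePercent > 100.0) {
            throw new IllegalArgumentException("maxFailurePercent must be within [0, 100]");
        }
        this.maxFailurePercent = maxFailurePercent;
    }

    // Relaxed criterion: pass while the failure percentage stays within the threshold.
    @Override
    boolean passed() {
        if (totalOps == 0) {
            return true; // nothing ran, so nothing failed
        }
        double failurePercent = 100.0 * failedOps / totalOps;
        return failurePercent <= maxFailurePercent;
    }
}

class ToleranceDemo {
    public static void main(String[] args) {
        SimpleLoadTest strict = new SimpleLoadTest();
        SimpleLoadTest tolerant = new LoadTestWithTolerance(0.01);

        // Simulate 1 million operations in which exactly one fails.
        for (int i = 0; i < 1_000_000; i++) {
            boolean ok = (i != 0);
            strict.record(ok);
            tolerant.record(ok);
        }

        System.out.println("strict passed:   " + strict.passed());
        System.out.println("tolerant passed: " + tolerant.passed());
    }
}
```

With one failure in a million operations, the strict check fails the run while the tolerant check accepts it, which is the behavior the comment asks for. Whether the real tool should expose fields or protected accessor methods is a design choice left open here.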
[jira] [Commented] (HBASE-8928) Make ChaosMonkey LoadTest tools extensible, to allow addition of more actions and policies.
[ https://issues.apache.org/jira/browse/HBASE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705654#comment-13705654 ] gautam commented on HBASE-8928: --- I have added a pull request for this change here: https://github.com/apache/hbase/pull/4
[jira] [Commented] (HBASE-8928) Make ChaosMonkey LoadTest tools extensible, to allow addition of more actions and policies.
[ https://issues.apache.org/jira/browse/HBASE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706543#comment-13706543 ] gautam commented on HBASE-8928: --- I picked up the latest 94.0 branch. May I know which one I should pick up for the latest changes?