[jira] [Created] (HBASE-5813) Retry immediately after a NotServingRegionException in a multiput
Retry immediately after a NotServingRegionException in a multiput - Key: HBASE-5813 URL: https://issues.apache.org/jira/browse/HBASE-5813 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin After we get some errors in a multiput we invalidate the region location cache and wait for the configured time interval according to the backoff policy. However, if all "errors" in multiput processing were NotServingRegionExceptions, we don't really need to wait. We can retry immediately. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
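The decision described above can be sketched in a few lines. This is only an illustration under stated assumptions: NotServingRegionSketchException is a hypothetical stand-in for HBase's NotServingRegionException, and pauseBeforeRetryMs and backoffMs are made-up names, not the client's actual retry code.

```java
import java.util.List;

/** Hypothetical stand-in for HBase's NotServingRegionException. */
class NotServingRegionSketchException extends Exception {}

class MultiputRetrySketch {
  /**
   * Returns how long to sleep before the next retry. If every failure was a
   * "region not served here" error, the invalidated location cache is assumed
   * to be the only thing that needed fixing, so no wait is applied.
   */
  static long pauseBeforeRetryMs(List<? extends Throwable> errors, long backoffMs) {
    if (errors.isEmpty()) {
      return 0L; // nothing failed, nothing to wait for
    }
    for (Throwable t : errors) {
      if (!(t instanceof NotServingRegionSketchException)) {
        return backoffMs; // some other failure: honor the backoff policy
      }
    }
    return 0L; // only NotServingRegion errors: retry immediately
  }
}
```

A caller would still invalidate the cached region locations before retrying; only the sleep is skipped.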
[jira] [Created] (HBASE-5803) [89-fb] Upgrade hbase 0.89-fb to Thrift 0.8.0 and bring Thrift server enhancements from trunk
[89-fb] Upgrade hbase 0.89-fb to Thrift 0.8.0 and bring Thrift server enhancements from trunk - Key: HBASE-5803 URL: https://issues.apache.org/jira/browse/HBASE-5803 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin TBoundedThreadPoolServer has been a problem for us when there is a large number of clients. We need to migrate to Thrift 0.8.0 in 89-fb and bring the relevant improvements from trunk, including support for TThreadedSelectorServer.
[jira] [Created] (HBASE-5763) Fix random failures in TestFSErrorsExposed
Fix random failures in TestFSErrorsExposed -- Key: HBASE-5763 URL: https://issues.apache.org/jira/browse/HBASE-5763 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor
[jira] [Created] (HBASE-5744) Thrift server metrics should be long instead of int
Thrift server metrics should be long instead of int --- Key: HBASE-5744 URL: https://issues.apache.org/jira/browse/HBASE-5744 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Priority: Minor As we measure our Thrift call latencies in nanoseconds, we need to make latencies long instead of int everywhere.
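To make the motivation concrete: Integer.MAX_VALUE nanoseconds is only about 2.147 seconds, so any call slower than that wraps around when stored in an int. The sketch below is plain Java arithmetic; the class and method names are made up for illustration and are not part of the Thrift server metrics code.

```java
class LatencyOverflowSketch {
  /** Narrows a nanosecond latency to int, as an int-typed metric would. */
  static int asIntNanos(long nanos) {
    return (int) nanos; // silently drops the high 32 bits
  }

  /** True if the latency survives a round trip through int unchanged. */
  static boolean fitsInInt(long nanos) {
    return nanos == (int) nanos;
  }
}
```

For example, a 3-second call is 3,000,000,000 ns, which asIntNanos turns into -1,294,967,296, while a 2-second call still fits.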
[jira] [Created] (HBASE-5731) Make max line length 100 in linter
Make max line length 100 in linter -- Key: HBASE-5731 URL: https://issues.apache.org/jira/browse/HBASE-5731 Project: HBase Issue Type: New Feature Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor We have switched to 100 characters per line in our Java files. Making the change in the linter.
[jira] [Created] (HBASE-5730) [89-fb] Make HRegionThriftServer's thread pool bounded
[89-fb] Make HRegionThriftServer's thread pool bounded -- Key: HBASE-5730 URL: https://issues.apache.org/jira/browse/HBASE-5730 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin
[jira] [Created] (HBASE-5708) [89-fb] Make MiniMapRedCluster directory a subdirectory of target/test
[89-fb] Make MiniMapRedCluster directory a subdirectory of target/test -- Key: HBASE-5708 URL: https://issues.apache.org/jira/browse/HBASE-5708 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Priority: Minor Some map-reduce-based tests are failing when executed concurrently in 89-fb because mini-map-reduce cluster uses /tmp/hadoop- for temporary data.
[jira] [Created] (HBASE-5703) Bound the number of threads in HRegionThriftServer
Bound the number of threads in HRegionThriftServer -- Key: HBASE-5703 URL: https://issues.apache.org/jira/browse/HBASE-5703 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin We need to bound the number of threads spawned in HRegionThriftServer, similarly to what was done in HBASE-4863 to the standalone Thrift gateway.
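For readers unfamiliar with the idea, here is a minimal sketch of a bounded worker pool using only the JDK's ThreadPoolExecutor. It is not the HBASE-4863 implementation; the class name, the parameters, and the choice of CallerRunsPolicy as the saturation behavior are all assumptions made for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedPoolSketch {
  /** Builds an executor that never runs more than maxThreads workers. */
  static ThreadPoolExecutor newBoundedPool(int minThreads, int maxThreads, int queueSize) {
    return new ThreadPoolExecutor(
        minThreads,
        maxThreads,
        60L, TimeUnit.SECONDS,                        // idle threads above the minimum time out
        new ArrayBlockingQueue<Runnable>(queueSize),  // bounded backlog of pending work
        new ThreadPoolExecutor.CallerRunsPolicy());   // when saturated, the submitter runs the task
  }
}
```

With both the thread count and the queue bounded, a burst of clients slows down the submitting thread instead of spawning an unbounded number of worker threads.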
[jira] [Created] (HBASE-5700) [89-fb] Fix TestMiniClusterLoad* test failures
[89-fb] Fix TestMiniClusterLoad* test failures -- Key: HBASE-5700 URL: https://issues.apache.org/jira/browse/HBASE-5700 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor Porting TestMiniClusterLoad* tests to 89-fb in HBASE-5679 uncovered certain problems with mini-cluster setup in 89-fb that need to be fixed.
[jira] [Created] (HBASE-5684) Make ProcessBasedLocalHBaseCluster run HDFS and make it more robust
Make ProcessBasedLocalHBaseCluster run HDFS and make it more robust --- Key: HBASE-5684 URL: https://issues.apache.org/jira/browse/HBASE-5684 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Currently ProcessBasedLocalHBaseCluster runs on top of raw local filesystem. We need it to start a process-based HDFS cluster as well. We also need to make the whole thing more stable so we can use it in unit tests.
[jira] [Created] (HBASE-5679) [89-fb] Port load test tool and related unit tests from trunk to 89-fb
[89-fb] Port load test tool and related unit tests from trunk to 89-fb -- Key: HBASE-5679 URL: https://issues.apache.org/jira/browse/HBASE-5679 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Assignee: Mikhail Bautin When open-sourcing LoadTestTool that originated in 89-fb, numerous improvements to the tool were made, and unit tests based on the tool were created as part of HBASE-4908. These improvements need to be ported back to 89-fb.
[jira] [Created] (HBASE-5612) Data types for HBase values
Data types for HBase values --- Key: HBASE-5612 URL: https://issues.apache.org/jira/browse/HBASE-5612 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin In many real-life applications all values in a certain column family are of a certain data type, e.g. 64-bit integer. We could specify that in the column descriptor and enable data type-specific compression such as variable-length integer encoding.
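As an illustration of what variable-length integer encoding buys, here is a generic sketch that stores 7 bits per byte with a continuation bit. It is not necessarily the format HBase would adopt, and the class and method names are made up; it only handles non-negative values.

```java
import java.io.ByteArrayOutputStream;

class VarLongSketch {
  /** Encodes a non-negative long using 7 bits per byte, low-order group first. */
  static byte[] encode(long value) {
    if (value < 0) {
      throw new IllegalArgumentException("negative values are not handled in this sketch");
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    while ((value & ~0x7FL) != 0) {
      out.write((int) ((value & 0x7F) | 0x80)); // high bit set: more bytes follow
      value >>>= 7;
    }
    out.write((int) value); // last group, high bit clear
    return out.toByteArray();
  }

  /** Decodes a value produced by encode. */
  static long decode(byte[] bytes) {
    long result = 0;
    int shift = 0;
    for (byte b : bytes) {
      result |= (long) (b & 0x7F) << shift;
      shift += 7;
    }
    return result;
  }
}
```

With this scheme a small counter such as 5 takes one byte and 300 takes two bytes, instead of the fixed eight bytes of a raw 64-bit value.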
[jira] [Created] (HBASE-5602) Add cache access pattern statistics and report hot blocks/keys
Add cache access pattern statistics and report hot blocks/keys -- Key: HBASE-5602 URL: https://issues.apache.org/jira/browse/HBASE-5602 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin In many practical applications it would be very useful to know how well utilized the block cache is, i.e. how many times we actually access a block once it gets into cache. This would also allow us to evaluate cache-on-write on flush. In addition, we need to keep track of and report some set of the hottest blocks in cache, and possibly even the hottest keys. This would allow us to diagnose "hot-row" problems in real time.
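One possible shape for such statistics is sketched below: a hit counter per cached block, from which both a reuse fraction and a hottest-blocks report can be derived. All names are hypothetical and block keys are simplified to strings; a real implementation would need to bound memory and handle evictions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class BlockAccessStatsSketch {
  private final Map<String, AtomicLong> hitsByBlock = new ConcurrentHashMap<>();

  /** Call when a block enters the cache. */
  void onCached(String blockKey) {
    hitsByBlock.putIfAbsent(blockKey, new AtomicLong());
  }

  /** Call on every cache hit for the block. */
  void onHit(String blockKey) {
    AtomicLong counter = hitsByBlock.get(blockKey);
    if (counter != null) {
      counter.incrementAndGet();
    }
  }

  /** Fraction of cached blocks that were read at least once after caching. */
  double reusedFraction() {
    if (hitsByBlock.isEmpty()) {
      return 0.0;
    }
    long reused = 0;
    for (AtomicLong counter : hitsByBlock.values()) {
      if (counter.get() > 0) {
        reused++;
      }
    }
    return (double) reused / hitsByBlock.size();
  }

  /** The n most frequently hit block keys, hottest first. */
  List<String> hottest(int n) {
    List<Map.Entry<String, AtomicLong>> entries = new ArrayList<>(hitsByBlock.entrySet());
    entries.sort((a, b) -> Long.compare(b.getValue().get(), a.getValue().get()));
    List<String> result = new ArrayList<>();
    for (int i = 0; i < Math.min(n, entries.size()); i++) {
      result.add(entries.get(i).getKey());
    }
    return result;
  }
}
```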
[jira] [Created] (HBASE-5601) Add per-column-family data block cache hit ratios
Add per-column-family data block cache hit ratios - Key: HBASE-5601 URL: https://issues.apache.org/jira/browse/HBASE-5601 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin In addition to the overall block cache hit ratio it would be extremely useful to have per-column-family data block cache hit ratio metrics.
[jira] [Created] (HBASE-5576) Configure Arcanist lint engine for HBase
Configure Arcanist lint engine for HBase Key: HBASE-5576 URL: https://issues.apache.org/jira/browse/HBASE-5576 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin We need to be able to use "arc lint" to check a patch for code style errors before submission.
[jira] [Created] (HBASE-5575) Configure Arcanist lint engine for HBase
Configure Arcanist lint engine for HBase Key: HBASE-5575 URL: https://issues.apache.org/jira/browse/HBASE-5575 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin We need to enable Arcanist lint engine in HBase, so that a commit could be checked by running "arc lint".
[jira] [Created] (HBASE-5566) [89-fb] Region server can get stuck getMaster on master failover
[89-fb] Region server can get stuck getMaster on master failover Key: HBASE-5566 URL: https://issues.apache.org/jira/browse/HBASE-5566 Project: HBase Issue Type: Bug Affects Versions: 0.89-fb Reporter: Mikhail Bautin Assignee: Mikhail Bautin Reported by Prakash. We have a retry loop in HRegionServer.getMaster where we do not read the location of the master from ZK, so a region server can get stuck there on master failover. We need to add a unit test to reliably catch this, and fix the bug.
[jira] [Created] (HBASE-5557) [89-fb] Fix incorrect writer / thread interaction in HBaseTest
[89-fb] Fix incorrect writer / thread interaction in HBaseTest -- Key: HBASE-5557 URL: https://issues.apache.org/jira/browse/HBASE-5557 Project: HBase Issue Type: Bug Affects Versions: 0.89-fb Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor In the HBaseTest load test we have a condition when the writer has not written any keys but the reader might attempt to read key 0, resulting in a failure. This bug is specific to 89-fb because it has been fixed while open-sourcing HBaseTest as LoadTestTool, and those improvements still have not been back-ported to 89-fb. Doing a temporary fix now and we will get to the back-port later.
12/03/09 14:12:52 INFO utils.MultiThreadedReader: Key = cfcd208495d565ef66e7dff9f98764da:0
12/03/09 14:12:52 ERROR utils.MultiThreadedReader: No data returned, tried to get actions for key = cfcd208495d565ef66e7dff9f98764da:0
12/03/09 14:12:52 INFO utils.MultiThreadedReader: Key = cfcd208495d565ef66e7dff9f98764da:0
12/03/09 14:12:52 INFO utils.MultiThreadedReader: Key = cfcd208495d565ef66e7dff9f98764da:0
12/03/09 14:12:52 ERROR utils.MultiThreadedReader: No data returned, tried to get actions for key = cfcd208495d565ef66e7dff9f98764da:0
12/03/09 14:12:52 ERROR utils.MultiThreadedReader: No data returned, tried to get actions for key = cfcd208495d565ef66e7dff9f98764da:0
12/03/09 14:12:52 INFO utils.MultiThreadedReader: Key = cfcd208495d565ef66e7dff9f98764da:0
12/03/09 14:12:52 ERROR utils.MultiThreadedReader: No data returned, tried to get actions for key = cfcd208495d565ef66e7dff9f98764da:0
12/03/09 14:12:52 ERROR utils.MultiThreadedReader: Aborting run -- found more than three errors
[jira] [Created] (HBASE-5470) Make DataBlockEncodingTool work correctly with no native compression codecs loaded
Make DataBlockEncodingTool work correctly with no native compression codecs loaded -- Key: HBASE-5470 URL: https://issues.apache.org/jira/browse/HBASE-5470 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor DataBlockEncodingTool was fixed as part of porting data block encoding (HBASE-4218) to 89-fb (https://reviews.facebook.net/rHBASEEIGHTNINEFBBRANCH1245291, https://reviews.facebook.net/D1659). The bug appeared when using GZ as baseline compression codec but not loading native Hadoop libraries, in which case the compressor instance would be null. The purpose of this JIRA is to bring the trunk version of DataBlockEncodingTool to parity with the 89-fb version, and further improvements to the tool will be made separately.
[jira] [Created] (HBASE-5469) Add baseline compression efficiency to DataBlockEncodingTool
Add baseline compression efficiency to DataBlockEncodingTool Key: HBASE-5469 URL: https://issues.apache.org/jira/browse/HBASE-5469 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor DataBlockEncodingTool currently does not provide baseline compression efficiency, e.g. Hadoop compression codec applied to unencoded data. E.g. if we are using LZO to compress blocks, we would like to have the following columns in the report (possibly as percentages of raw data size).
Baseline K+V in blockcache | Baseline K + V on disk (LZO compressed) | K + V DataBlockEncoded in block cache | K + V DataBlockEncoded + LZOCompressed (on disk)
Background: we never store compressed blocks in cache, but we always store encoded data blocks in cache if data block encoding is enabled for the column family.
[jira] [Created] (HBASE-5442) Use builder pattern in StoreFile and HFile
Use builder pattern in StoreFile and HFile -- Key: HBASE-5442 URL: https://issues.apache.org/jira/browse/HBASE-5442 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a "builder pattern" solution, e.g.
{code:java}
HFileWriter w = HFile.getWriterBuilder(conf, )
    .setParameter1(value1)
    .setParameter2(value2)
    ...
    .build();
{code}
Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses StoreFile and HFile refactoring. For HColumnDescriptor refactoring see HBASE-5357.
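For reference, the receiving side of such a builder could look roughly like the sketch below. WriterSketch, its two parameters, and their default values are invented for illustration; they are not the proposed HFile or StoreFile API.

```java
class WriterSketch {
  final int blockSize;
  final String compression;

  private WriterSketch(Builder b) {
    this.blockSize = b.blockSize;
    this.compression = b.compression;
  }

  static Builder builder() {
    return new Builder();
  }

  static class Builder {
    private int blockSize = 65536;       // assumed default, for illustration only
    private String compression = "NONE"; // assumed default, for illustration only

    Builder setBlockSize(int blockSize) {
      this.blockSize = blockSize;
      return this; // returning this is what allows one setter per line
    }

    Builder setCompression(String compression) {
      this.compression = compression;
      return this;
    }

    WriterSketch build() {
      return new WriterSketch(this);
    }
  }
}
```

A caller that only cares about compression writes WriterSketch.builder().setCompression("GZ").build() and never mentions the block size, so adding a new parameter later does not touch existing call sites.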
[jira] [Created] (HBASE-5387) Reuse compression streams in HFileBlock.Writer
Reuse compression streams in HFileBlock.Writer -- Key: HBASE-5387 URL: https://issues.apache.org/jira/browse/HBASE-5387 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin We need to reuse compression streams in HFileBlock.Writer instead of allocating them every time. The motivation is that when using Java's built-in implementation of Gzip, we allocate a new GZIPOutputStream object and an associated native data structure every time. This is one suspected cause of recent TestHFileBlock failures on Hadoop QA: https://builds.apache.org/job/HBase-TRUNK/2658/testReport/org.apache.hadoop.hbase.io.hfile/TestHFileBlock/testPreviousOffset_1_/.
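The general idea of reusing the compressor can be sketched with the JDK's java.util.zip.Deflater, which holds the native state and can be reset between blocks. This is an assumption-laden illustration, not the HFileBlock.Writer change: the class name is made up, and it emits the zlib format rather than gzip framing.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

class ReusableCompressorSketch {
  private final Deflater deflater = new Deflater(); // allocated once, holds native state
  private final byte[] buffer = new byte[4096];

  /** Compresses one block, reusing the same Deflater for every call. */
  byte[] compress(byte[] block) {
    deflater.reset(); // discard state from the previous block instead of reallocating
    deflater.setInput(block);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    while (!deflater.finished()) {
      int n = deflater.deflate(buffer);
      out.write(buffer, 0, n);
    }
    return out.toByteArray();
  }

  /** Releases the native resources when the writer is closed. */
  void close() {
    deflater.end();
  }
}
```

The native structure is then allocated once per writer and released in close(), rather than once per block.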
[jira] [Created] (HBASE-5382) Test that we always cache index and bloom blocks
Test that we always cache index and bloom blocks Key: HBASE-5382 URL: https://issues.apache.org/jira/browse/HBASE-5382 Project: HBase Issue Type: Test Reporter: Mikhail Bautin This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed at https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. The new review is at https://reviews.facebook.net/D1695.
[jira] [Created] (HBASE-5375) Ensure that compactions use already cached blocks but do not cache new data blocks
Ensure that compactions use already cached blocks but do not cache new data blocks -- Key: HBASE-5375 URL: https://issues.apache.org/jira/browse/HBASE-5375 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Create a unit test to verify that compactions reuse existing cached blocks but do not thrash the cache with newly read blocks. Also need to verify that we only read every data block once, e.g. that we don't re-read the block on every next() operation. HBASE-1597 did not seem to include a unit test, so we need to add a test now. This and HBASE-4683 (the unit test that was not checked in) are the remaining missing pieces before we can close HBASE-3976.
[jira] [Created] (HBASE-5357) Use builder pattern in StoreFile, HFile, and HColumnDescriptor instantiation
Use builder pattern in StoreFile, HFile, and HColumnDescriptor instantiation Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a "builder pattern" solution, e.g.
{code:java}
HFileWriter w = HFile.getWriterBuilder(conf, )
    .setParameter1(value1)
    .setParameter2(value2)
    ...
    .instantiate();
{code}
Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors.
[jira] [Created] (HBASE-5344) [89-fb] Scan unassigned region directory on master failover
[89-fb] Scan unassigned region directory on master failover --- Key: HBASE-5344 URL: https://issues.apache.org/jira/browse/HBASE-5344 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin In case the master dies after a regionserver writes region state as OPENED or CLOSED in ZK but before the update is received by master and written to meta, the new master that comes up has to pick up the region state from ZK and write it to meta. Otherwise we can get multiply-assigned regions.
[jira] [Created] (HBASE-5320) Create client API to handle HBase maintenance gracefully
Create client API to handle HBase maintenance gracefully Key: HBASE-5320 URL: https://issues.apache.org/jira/browse/HBASE-5320 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Priority: Minor When we do HBase cluster maintenance, we typically have to manually stop or disable the client temporarily. It would be nice to have a way for the client to find out that HBase is undergoing maintenance through an appropriate API and gracefully handle it on its own.
[jira] [Created] (HBASE-5263) Preserving cached data on compactions through cache-on-write
Preserving cached data on compactions through cache-on-write Key: HBASE-5263 URL: https://issues.apache.org/jira/browse/HBASE-5263 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor We are tackling HBASE-3976 and HBASE-5230 to make sure we don't trash the block cache on compactions if cache-on-write is enabled. However, it would be ideal to reduce the effect compactions have on the cached data. For every block we are writing for a compacted file we can decide whether it needs to be cached based on whether the original blocks containing the same data were already in cache. More precisely, for every HFile reader in a compaction we can maintain a boolean flag saying whether the current key-value came from a disk IO or the block cache. In the HFile writer for the compaction's output we can maintain a flag that is set if any of the key-values in the block being written came from a cached block, use that flag at the end of a block to decide whether to cache-on-write the block, and reset the flag to false on a block boundary. If such an inclusive approach would still trash the cache, we could restrict the total number of blocks to be cached per output HFile, switch to an "and" logic instead of "or" logic for deciding whether to cache an output file block, or only cache a certain percentage of output file blocks that contain some of the previously cached data. Thanks to Nicolas for this elegant online algorithm idea!
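The per-block "or" flag described above can be sketched as follows. This is a simplified model under stated assumptions: blocks are cut after a fixed number of key-values rather than by byte size, the class and method names are hypothetical, and the decisions are merely recorded instead of driving a real block cache.

```java
import java.util.ArrayList;
import java.util.List;

class CompactionCacheSketch {
  private final int kvsPerBlock;           // simplification: fixed KV count per block
  private int kvsInCurrentBlock = 0;
  private boolean sawCachedInput = false;  // the "or" flag for the current block
  final List<Boolean> cacheDecisions = new ArrayList<>();

  CompactionCacheSketch(int kvsPerBlock) {
    this.kvsPerBlock = kvsPerBlock;
  }

  /** Appends one key-value; cameFromCache is the flag reported by the input reader. */
  void append(boolean cameFromCache) {
    sawCachedInput |= cameFromCache;
    if (++kvsInCurrentBlock == kvsPerBlock) {
      finishBlock();
    }
  }

  /** Records the cache-on-write decision for the block and resets the flag. */
  void finishBlock() {
    if (kvsInCurrentBlock == 0) {
      return; // no open block
    }
    cacheDecisions.add(sawCachedInput);
    sawCachedInput = false; // reset on the block boundary
    kvsInCurrentBlock = 0;
  }
}
```

Switching to the "and" variant mentioned above would mean starting the flag at true and combining with &= instead of |=.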
[jira] [Created] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance
Structured event log for HBase for monitoring and auto-tuning performance - Key: HBASE-5262 URL: https://issues.apache.org/jira/browse/HBASE-5262 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Creating this JIRA to open a discussion about a structured (machine-readable) log that will record events such as compaction start/end times, compaction input/output files, their sizes, the same for flushes, etc. This can be stored e.g. in a new system table in HBase itself. The data from this log can then be analyzed and used to optimize compactions at run time, or otherwise auto-tune HBase configuration to reduce the number of knobs the user has to configure.
[jira] [Created] (HBASE-5261) Update HBase for Java 7
Update HBase for Java 7 --- Key: HBASE-5261 URL: https://issues.apache.org/jira/browse/HBASE-5261 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin We need to make sure that HBase compiles and works with JDK 7. Once we verify it is reasonably stable, we can explore utilizing the G1 garbage collector. When all deployments are ready to move to JDK 7, we can start using new language features, but in the transition period we will need to maintain a codebase that compiles both with JDK 6 and JDK 7.
[jira] [Created] (HBASE-5230) Unit test to ensure compactions don't cache data on write
Unit test to ensure compactions don't cache data on write - Key: HBASE-5230 URL: https://issues.apache.org/jira/browse/HBASE-5230 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor Create a unit test for HBASE-3976 (making sure we don't cache data blocks on write during compactions even if cache-on-write is generally enabled). This is because we have very different implementations of HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) and with CacheConfig (presumably it's there but not sure if it even works, since the patch in HBASE-3976 may not have been committed). We need to create a unit test to verify that we don't cache data blocks on write during compactions, and resolve HBASE-3976 so that this new unit test does not fail.
[jira] [Created] (HBASE-5224) midkey() returns 12 extra bytes in HFile v2
midkey() returns 12 extra bytes in HFile v2 --- Key: HBASE-5224 URL: https://issues.apache.org/jira/browse/HBASE-5224 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor HFile's midkey() is implemented as the first key of the middle index block in both HFile v1 and HFile v2 (the middle leaf index block is used in v2). However, in HFile v2 midkey() currently grabs 12 more bytes from the next leaf index entry, representing the offset and compressed size of the data block pointed to by that entry. While this probably does not affect the interpretation of the returned buffer as an HBase key (the last 12 bytes are simply discarded), this has to be cleaned up.
[jira] [Created] (HBASE-5130) A map-reduce wrapper for HBase test suite ("mrunit")
A map-reduce wrapper for HBase test suite ("mrunit") Key: HBASE-5130 URL: https://issues.apache.org/jira/browse/HBASE-5130 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Assignee: Mikhail Bautin We have a tool we call "mrunit" that runs HBase unit tests on a map-reduce cluster. We need to modify it to use distributed cache to deploy the code on the cluster instead of our internal deployment tool, and open-source it.
[jira] [Created] (HBASE-5048) Use EnvironmentEdgeManager.currentTimeMillis() instead of System.currentTimeMillis()
Use EnvironmentEdgeManager.currentTimeMillis() instead of System.currentTimeMillis() Key: HBASE-5048 URL: https://issues.apache.org/jira/browse/HBASE-5048 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Priority: Minor We need to switch to using EnvironmentEdgeManager.currentTimeMillis() instead of System.currentTimeMillis() across the codebase to reduce confusion when writing tests that require custom timing of operations.
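The benefit for tests comes from routing every time lookup through one replaceable source. The sketch below shows that pattern with a hypothetical ClockSketch class; it is a stand-in written for illustration and does not reproduce the actual EnvironmentEdgeManager API.

```java
class ClockSketch {
  /** Hypothetical time source; tests can supply their own implementation. */
  interface Edge {
    long currentTimeMillis();
  }

  // Defaults to the system clock until a test injects something else.
  private static volatile Edge edge = System::currentTimeMillis;

  static long currentTimeMillis() {
    return edge.currentTimeMillis();
  }

  static void inject(Edge e) {
    edge = e;
  }

  static void reset() {
    edge = System::currentTimeMillis;
  }
}
```

A test can call ClockSketch.inject(() -> 42L), exercise code that reads the clock, and then call reset(). Any code that still calls System.currentTimeMillis() directly would ignore the injected value, which is the kind of confusion the issue wants to remove.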
[jira] [Created] (HBASE-5031) [89-fb] Remove hard-coded non-existent host name from TestScanner
[89-fb] Remove hard-coded non-existent host name from TestScanner -- Key: HBASE-5031 URL: https://issues.apache.org/jira/browse/HBASE-5031 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Priority: Minor TestScanner is failing on 0.89-fb because it has a hard-coded fake host name that it is trying to look up. Replacing this with 127.0.0.1: instead.
[jira] [Created] (HBASE-5012) Unify data/index/bloom and meta block read methods in HFileReaderV{1,2}
Unify data/index/bloom and meta block read methods in HFileReaderV{1,2} --- Key: HBASE-5012 URL: https://issues.apache.org/jira/browse/HBASE-5012 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Priority: Minor Reduce code duplication between getMetaBlock and readBlock in HFileReaderV1 and HFileReaderV2.
[jira] [Created] (HBASE-5010) Filter HFiles based on TTL
Filter HFiles based on TTL -- Key: HBASE-5010 URL: https://issues.apache.org/jira/browse/HBASE-5010 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin In ScanWildcardColumnTracker we have { this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl; ... private boolean isExpired(long timestamp) { return timestamp < oldestStamp; } } but this time range filtering does not participate in HFile selection. In one real case this caused next() calls to time out because all KVs in a table had expired, but next() had to iterate over the whole table to find that out. We should be able to filter out those HFiles right away. I think a reasonable approach is to add a "default timerange filter" to every scan for a CF with a finite TTL and utilize the existing filtering in StoreFile.Reader.passesTimerangeFilter.
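The file-level check proposed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the HBase implementation: `TtlFileFilter` and `isFullyExpired` are hypothetical names, and the sketch assumes each file exposes the maximum timestamp of the KVs it contains. It applies the same `timestamp < oldestStamp` rule quoted from ScanWildcardColumnTracker, but to a whole file.

```java
/** Hypothetical sketch of TTL-based file skipping; not HBase code. */
final class TtlFileFilter {
    private TtlFileFilter() {}

    /**
     * Returns true when every KV in a file must already be expired,
     * given the largest timestamp stored in that file.
     *
     * @param fileMaxTimestamp largest KV timestamp in the file (assumed to be known per file)
     * @param nowMillis        current time in milliseconds
     * @param ttlMillis        column family TTL in milliseconds; Long.MAX_VALUE means "no TTL"
     */
    static boolean isFullyExpired(long fileMaxTimestamp, long nowMillis, long ttlMillis) {
        if (ttlMillis == Long.MAX_VALUE) {
            return false; // infinite TTL: nothing ever expires
        }
        long oldestStamp = nowMillis - ttlMillis;
        // Same rule as isExpired(timestamp), applied to the newest KV in the file:
        // if even the newest KV is too old, the whole file can be skipped.
        return fileMaxTimestamp < oldestStamp;
    }
}
```

With this check a scan over a table whose data has entirely expired would open no files at all, instead of iterating over every KV to discover that each one is expired.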
[jira] [Created] (HBASE-5000) Speed up simultaneous reads of a block when block caching is turned off
Speed up simultaneous reads of a block when block caching is turned off --- Key: HBASE-5000 URL: https://issues.apache.org/jira/browse/HBASE-5000 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Priority: Minor With block caching, when one client starts reading a block and another one comes around asking for the same block, the second client waits for the first one to finish reading and returns the block from cache. This is achieved by locking on the block offset using IdLock, a "sparse lock" primitive that allows locking on arbitrary long numbers. However, when there is no block caching, there is no reason to wait for other clients that are reading the same block. One challenge in optimizing this is that we don't necessarily have accurate information about whether other HFile API clients interested in the block would cache it. Setting priority as minor, as it is very unusual to turn off block caching.
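The idea of a "sparse lock" keyed on arbitrary long values can be sketched as below. This is an illustrative stand-in written for this note, not the actual IdLock implementation; the class name `SparseIdLock` and its methods are hypothetical, and the sketch is deliberately non-reentrant.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of a lock keyed on long ids; not HBase's IdLock. Non-reentrant per id. */
final class SparseIdLock {
    // Only ids that are currently locked (or contended) have an entry.
    private final ConcurrentHashMap<Long, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** Blocks until the calling thread holds the lock for the given id. */
    void lock(long id) {
        while (true) {
            ReentrantLock candidate = locks.computeIfAbsent(id, k -> new ReentrantLock());
            candidate.lock();
            // The previous holder may have removed this entry while we waited;
            // only proceed if the lock we hold is still the one registered for the id.
            if (locks.get(id) == candidate) {
                return;
            }
            candidate.unlock();
        }
    }

    /** Releases the lock for the given id; must be called by the thread that locked it. */
    void unlock(long id) {
        ReentrantLock held = locks.get(id);
        if (held == null || !held.isHeldByCurrentThread()) {
            throw new IllegalMonitorStateException("id " + id + " is not locked by this thread");
        }
        // Drop the entry when nobody is queued, so the map stays sparse.
        if (!held.hasQueuedThreads()) {
            locks.remove(id, held);
        }
        held.unlock();
    }

    /** Number of ids that currently have a lock entry. */
    int size() {
        return locks.size();
    }
}
```

In the cached-read scenario described above, the id would be the block offset: the first reader locks the offset, reads and caches the block, and later readers find it in the cache once they acquire the lock.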
[jira] [Created] (HBASE-4976) Add compaction/flush queue size metrics mistakenly removed by HFile v2
Add compaction/flush queue size metrics mistakenly removed by HFile v2 -- Key: HBASE-4976 URL: https://issues.apache.org/jira/browse/HBASE-4976 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin
[jira] [Created] (HBASE-4963) [89-fb] Per-table getsize metrics are broken
[89-fb] Per-table getsize metrics are broken Key: HBASE-4963 URL: https://issues.apache.org/jira/browse/HBASE-4963 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin We need to make sure we get per-(table, CF) get size metrics in 0.89-fb, similarly to what was done in https://reviews.facebook.net/D483 for the trunk. Currently we only get metrics such as hadoop.regionserver_cf..getsize even with per-table metrics turned on.
[jira] [Created] (HBASE-4962) Optimize time range scans using a delete Bloom filter
Optimize time range scans using a delete Bloom filter - Key: HBASE-4962 URL: https://issues.apache.org/jira/browse/HBASE-4962 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor To speed up time range scans we need to seek to the maximum timestamp of the requested range, instead of going to the first KV of the (row, column) pair and iterating from there. If we don't know the (row, column), e.g. if it is not specified in the query, we need to go to the end of the current row/column pair first, get a KV from there, and do another seek to (row', column', timerange_max) from there. We can only skip over to the timerange_max timestamp when we know that there are no DeleteColumn records at the top of that row/column with a higher timestamp. We can utilize another Bloom filter keyed on (row, column) to quickly find that out.
[jira] [Created] (HBASE-4953) Root region does not get assigned after doing kill -9 on all daemons and restarting HBase
Root region does not get assigned after doing kill -9 on all daemons and restarting HBase - Key: HBASE-4953 URL: https://issues.apache.org/jira/browse/HBASE-4953 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin When doing a kill -9 on all HBase processes and attempting to re-start HBase, the master does not properly assign the root region. The /hbase/root-region-server znode still contains the old regionserver, but the regionserver referenced in it does not get assigned the root region. This might resolve itself after the znode expires, but some testing is required.
[jira] [Created] (HBASE-4952) Master startup is too slow on HBase trunk
Master startup is too slow on HBase trunk - Key: HBASE-4952 URL: https://issues.apache.org/jira/browse/HBASE-4952 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin When I start the HBase trunk master on my five-node cluster, it gets stuck in the state "initializing master service threads" for a minute or two, then "waiting for regionserver number to settle", and only then starts log splitting. We don't have such delays in the 0.89-fb master, and I believe we can optimize the new master to eliminate these delays as well.
[jira] [Created] (HBASE-4908) HBase cluster test tool (port from 0.89-fb)
HBase cluster test tool (port from 0.89-fb) --- Key: HBASE-4908 URL: https://issues.apache.org/jira/browse/HBASE-4908 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Assignee: Mikhail Bautin Porting one of our HBase cluster test tools (a single-process multi-threaded load generator and verifier) from 0.89-fb to trunk. I cleaned up the code a bit compared to what's in 0.89-fb, and discovered that it has some features that I have not tried yet (some kind of a kill test, and some way to run HBase as multiple processes on one machine). The main utility of this piece of code for us has been the HBaseClusterTest command-line tool (called HBaseTest in 0.89-fb), which we usually invoke as a load test in our five-node dev cluster testing, e.g.: hbase org.apache.hadoop.hbase.manual.HBaseTest -load 10:50:100:20 -tn load_test -read 1:10:50:20 -zk -bloom ROWCOL -compression GZIP I will be using this code to load-test the delta encoding patch and making fixes, but I am submitting the patch for early feedback. I will probably try out its other functionality and comment on how it works.
[jira] [Created] (HBASE-4867) A tool to merge configuration files
A tool to merge configuration files --- Key: HBASE-4867 URL: https://issues.apache.org/jira/browse/HBASE-4867 Project: HBase Issue Type: New Feature Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor With our cluster configuration setup it would be good to have a tool that merges HBase configuration files, so that files appearing later in the list override properties specified in earlier files. This way we could merge an application-specific configuration file with the cluster-specific configuration file (with the latter overriding the former) and produce a single HBase configuration file to install on the cluster.
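The "later files override earlier ones" rule can be sketched on plain key/value maps. This is only an illustration of the merge order: `ConfigMerger` is a hypothetical name, and parsing or writing the actual HBase XML configuration files is left out.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of ordered configuration merging; not the proposed tool itself. */
final class ConfigMerger {
    private ConfigMerger() {}

    /**
     * Merges property maps in list order. A property defined in a later map
     * replaces the value from any earlier map.
     */
    static Map<String, String> merge(List<Map<String, String>> configsInOrder) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> config : configsInOrder) {
            merged.putAll(config); // later entries overwrite earlier ones
        }
        return merged;
    }
}
```

For the use case in the description, the application-specific properties would come first in the list and the cluster-specific properties last, so that the cluster settings win.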
[jira] [Created] (HBASE-4863) Make HBase Thrift server more configurable and add a command-line UI test
Make HBase Thrift server more configurable and add a command-line UI test - Key: HBASE-4863 URL: https://issues.apache.org/jira/browse/HBASE-4863 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin This started as an internal hotfix where we found out that the Thrift server spawned 15000 threads. To bound the thread pool size I added a custom thread pool server implementation called HBaseThreadPoolServer to the HBase codebase, and made the following parameters configurable both from the command line and as config settings: minWorkerThreads, maxWorkerThreads, and maxQueuedRequests. Under an increasing load, the server creates a new thread for every connection until the pool size reaches minWorkerThreads. After that, the server puts new connections into the queue and only creates a new thread when the queue is full. If an attempt to create a new thread fails, the server drops the connection. The default TThreadPoolServer would crash in that case, but it never happened because the thread pool was unbounded, so the server would hang indefinitely, consume a lot of memory, and cause huge latency spikes on the client side. Another part of this fix is refactoring and unit testing of the command-line part of the Thrift server. The logic there is sufficiently complicated, and the existing ThriftServer class does not test that part at all. The new TestThriftServerCmdLine test starts the Thrift server on a random port with various combinations of options and talks to it through the client API from another thread.
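The growth policy described above (threads up to a minimum, then a bounded queue, then more threads up to a maximum) matches the documented behavior of the JDK's ThreadPoolExecutor when it is given a bounded queue. The sketch below is an illustration of that policy only; it is not the HBaseThreadPoolServer code, the class name `BoundedPoolSketch` is hypothetical, the 60-second keep-alive is an arbitrary assumption, and rejecting a task stands in for "dropping the connection".

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a bounded worker pool; parameter names follow the issue text. */
final class BoundedPoolSketch {
    private BoundedPoolSketch() {}

    static ThreadPoolExecutor newPool(int minWorkerThreads, int maxWorkerThreads, int maxQueuedRequests) {
        return new ThreadPoolExecutor(
            minWorkerThreads,                              // threads created eagerly, one per task
            maxWorkerThreads,                              // hard upper bound on threads
            60L, TimeUnit.SECONDS,                         // idle timeout for threads above the minimum (assumed value)
            new ArrayBlockingQueue<>(maxQueuedRequests),   // bounded queue used once the minimum is reached
            new ThreadPoolExecutor.AbortPolicy());         // throw RejectedExecutionException when saturated
    }
}
```

A caller that catches RejectedExecutionException can close the client connection instead of letting threads and memory grow without bound.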
[jira] [Created] (HBASE-4824) TestZKLeaderManager is flaky
TestZKLeaderManager is flaky Key: HBASE-4824 URL: https://issues.apache.org/jira/browse/HBASE-4824 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Priority: Minor TestZKLeaderManager is flaky. It failed in a full test suite run for me, then passed when I reran it locally, but then failed when I ran it in a loop.
[jira] [Created] (HBASE-4821) A fully automated comprehensive distributed integration test for HBase
A fully automated comprehensive distributed integration test for HBase -- Key: HBASE-4821 URL: https://issues.apache.org/jira/browse/HBASE-4821 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin To properly verify that a particular version of HBase is good for production deployment we need a better way to do real cluster testing after incremental changes. Running unit tests is good, but we also need to deploy HBase to a cluster, run integration tests, load tests, Thrift server tests, kill some region servers, kill the master, and produce a report. All of this needs to happen in 20-30 minutes with minimal manual intervention. I think this way we can combine agile development with high stability of the codebase. I am envisioning a high-level framework written in a scripting language (e.g. Python) that would abstract external operations such as "deploy to test cluster", "kill a particular server", "run load test A", "run load test B" (we already have a few kinds of load tests implemented in Java, and we could write a Thrift load test in Python). This tool should also produce intermediate output, so that problems can be caught early and the test restarted. No implementation has yet been done. Any ideas or suggestions are welcome.
[jira] [Created] (HBASE-4809) Per-CF set RPC metrics
Per-CF set RPC metrics -- Key: HBASE-4809 URL: https://issues.apache.org/jira/browse/HBASE-4809 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Priority: Minor Porting per-CF set metrics for RPC times and response sizes from 0.89-fb to trunk. For each "mutation signature" (a set of column families involved in an RPC request) we increment several metrics, which lets us monitor access patterns. We deal with guarding against an explosion of the number of metrics in HBASE-4638 (which might even be implemented as part of this JIRA).
[jira] [Created] (HBASE-4795) Fix TestHFileBlock when running on a 32-bit JVM
Fix TestHFileBlock when running on a 32-bit JVM --- Key: HBASE-4795 URL: https://issues.apache.org/jira/browse/HBASE-4795 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Our Hudson test server seems to run a 32-bit JVM. This patch fixes TestHFileBlock to work correctly on both 64-bit and 32-bit JVMs.
[jira] [Created] (HBASE-4768) Per-(table, columnFamily) metrics with configurable table name inclusion
Per-(table, columnFamily) metrics with configurable table name inclusion Key: HBASE-4768 URL: https://issues.apache.org/jira/browse/HBASE-4768 Project: HBase Issue Type: New Feature Reporter: Mikhail Bautin Assignee: Mikhail Bautin As we kept adding more granular block read and block cache usage statistics, a combinatorial explosion of various cases to monitor started to happen, especially when we wanted both per-table/column family/block type statistics and aggregate statistics on various subsets of these dimensions. Here, we un-clutter HFile readers, LruBlockCache, StoreFile, etc. by creating a centralized class that knows how to update all kinds of per-table/CF/block type counters. Table name and column family configuration have been pushed to a base class, SchemaConfigured. This is convenient as many of the existing classes that have these properties (HFile readers/writers, HFile blocks, etc.) did not have a base class. Whether to collect per-(table, columnFamily) or per-columnFamily only metrics can be configured with the hbase.metrics.showTableName configuration key. We don't expect this configuration to change at runtime, so we cache the setting statically and log a warning when an attempt is made to flip it once already set. This way we don't have to pass configuration to a lot more places, e.g. everywhere an HFile reader is instantiated. Thanks to Liyin for his initial version of per-table metrics patch and a lot of valuable feedback.
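The "cache the setting statically and warn on an attempt to flip it" behavior can be sketched as follows. The class and method names (`ShowTableNameFlag`, `setOnce`) are hypothetical and the logger is the JDK's java.util.logging, chosen only to keep the sketch self-contained; the actual patch may be structured differently.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.logging.Logger;

/** Hypothetical sketch of a set-once static flag; not the code from the patch. */
final class ShowTableNameFlag {
    private static final Logger LOG = Logger.getLogger(ShowTableNameFlag.class.getName());

    // null means "not configured yet".
    private static final AtomicReference<Boolean> cached = new AtomicReference<>();

    private ShowTableNameFlag() {}

    /**
     * Records the flag on first use. Later calls with a different value are
     * ignored with a warning. Returns the value that is actually in effect.
     */
    static boolean setOnce(boolean requested) {
        if (cached.compareAndSet(null, requested)) {
            return requested;
        }
        boolean current = cached.get();
        if (current != requested) {
            LOG.warning("Ignoring attempt to change showTableName from " + current + " to " + requested);
        }
        return current;
    }
}
```

Because the flag lives in static state, readers of the setting do not need a configuration object passed to them, which is the convenience the description points out.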
[jira] [Created] (HBASE-4758) [89-fb] Make test methods independent in TestMasterTransitions
[89-fb] Make test methods independent in TestMasterTransitions -- Key: HBASE-4758 URL: https://issues.apache.org/jira/browse/HBASE-4758 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor Currently TestMasterTransitions is flaky, and one way to hopefully make it more stable is to create a separate MiniHBaseCluster for every test method, and get rid of BeforeClass/AfterClass. So far I have successfully run TestMasterTransitions a few times with the fix, while it was failing without the fix. TestMasterTransitions in trunk is a different story (most of the test is commented out in the trunk) and is out of scope of this JIRA.
[jira] [Created] (HBASE-4757) [89-fb] Fix TestHQuorumPeer for non-default values of hbase.tmp.dir
[89-fb] Fix TestHQuorumPeer for non-default values of hbase.tmp.dir Key: HBASE-4757 URL: https://issues.apache.org/jira/browse/HBASE-4757 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor TestHQuorumPeer currently fails if hbase.tmp.dir is different from /tmp/hbase-. However, for our internal parallel test runner we use a different temporary HBase directory.
[jira] [Created] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin The hard-coded ZK client port has long been a problem for running the HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows running unit tests in parallel in 10 minutes. A fix for the trunk will follow.
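One common way to obtain a free port in a Java test is to bind a server socket to port 0 and ask the operating system which port it chose. The sketch below shows that technique under the hypothetical name `FreePort`; it is not taken from the patch, which may select ports differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Hypothetical helper for picking a free TCP port in tests. */
final class FreePort {
    private FreePort() {}

    /** Asks the OS for a currently unused port by binding to port 0, then releases it. */
    static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The port is released before the caller binds to it, so another process could grab it in between; a test harness would typically retry on a bind failure. The chosen port then has to be written into the configuration that every client in the test uses, otherwise clients fall back to the default port as described above.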
[jira] [Created] (HBASE-4704) A JRuby script for identifying active master
A JRuby script for identifying active master Key: HBASE-4704 URL: https://issues.apache.org/jira/browse/HBASE-4704 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Trivial Fix For: 0.94.0 This simple script reads the HBase master ZK node and outputs the hostname of the active master. This is needed so that operational scripts can decide where the primary master is running. I am also including a one-line hbase-jruby script so we can make our jruby scripts proper UNIX executables by including an "#!/usr/bin/env hbase-jruby" at the top.
[jira] [Created] (HBASE-4686) [89-fb] Fix per-store metrics aggregation
[89-fb] Fix per-store metrics aggregation -- Key: HBASE-4686 URL: https://issues.apache.org/jira/browse/HBASE-4686 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin In r1182034 per-Store metrics were broken, because the aggregation of StoreFile metrics over all stores in a region was replaced by overriding them every time. We saw these metrics drop by a factor of numRegions on a production cluster -- thanks to Kannan for noticing this! We need to fix the metrics and add a unit test to ensure regressions like this don't happen in the future.
[jira] [Created] (HBASE-4607) SplitLogWorker should correctly terminate when waiting for ZK node
SplitLogWorker should correctly terminate when waiting for ZK node -- Key: HBASE-4607 URL: https://issues.apache.org/jira/browse/HBASE-4607 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor This is an attempt to fix the fact that SplitLogWorker threads are not being terminated properly in some unit tests. This probably does not happen in production because the master always creates the log-splitting ZK node, but it does happen in 89-fb. Thanks to Prakash Khemani for help on this.
[jira] [Created] (HBASE-4534) A new unit test for lazy seek and StoreScanner in general
A new unit test for lazy seek and StoreScanner in general - Key: HBASE-4534 URL: https://issues.apache.org/jira/browse/HBASE-4534 Project: HBase Issue Type: Test Affects Versions: 0.94.0 Reporter: Mikhail Bautin Assignee: Mikhail Bautin A randomized unit test for Gets/Scans (all-row, single-row, multi-row, all-column, single-column, and multi-column). Also all combinations of Bloom filters and compression (NONE vs GZIP) are tested. The unit test flushes multiple StoreFiles with disjoint timestamp ranges and runs various types of queries against them. Currently we are not testing overlapping timestamp ranges.
[jira] [Created] (HBASE-4522) Make hbase-site-custom.xml override the hbase-site.xml
Make hbase-site-custom.xml override the hbase-site.xml -- Key: HBASE-4522 URL: https://issues.apache.org/jira/browse/HBASE-4522 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Liyin Tang Priority: Minor Fix For: 0.94.0 The motivation for this diff is that we want to easily override some config settings for a specific cluster just by adding the config entries to the hbase-site-custom.xml for that cluster. This change adds the hbase-site-custom.xml configuration file into HBaseConfiguration.
[jira] [Created] (HBASE-4520) Better handling of Bloom filter type discrepancy between HFile and CF config
Better handling of Bloom filter type discrepancy between HFile and CF config Key: HBASE-4520 URL: https://issues.apache.org/jira/browse/HBASE-4520 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor Fix For: 0.94.0 Modify StoreFile to make it clear where Bloom filter type settings come from. We have two sources of truth: (1) HFile; and (2) CF configuration. (1) takes precedence in the reader, and (2) takes precedence in the writer.
[jira] [Created] (HBASE-4516) HFile-level load tester with compaction and random-read workloads
HFile-level load tester with compaction and random-read workloads - Key: HBASE-4516 URL: https://issues.apache.org/jira/browse/HBASE-4516 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Priority: Minor Fix For: 0.94.0 This is a load testing tool for HFile implementations, which supports two workloads: - Compactions (merge the input HFiles). A special case of this is only one input, which makes it possible to do HFile format conversions. - Random reads. Launches the specified number of threads that do seeks and short scans on randomly generated keys. The original purpose of this tool was to ensure that HFile format v2 did not introduce performance regressions. Keys for the read workload are generated randomly between the first and the last key of the HFile. At each position, instead of precisely calculating the correct probability for every byte value b, we select a uniformly random byte in the allowed [low, high] range. In addition, there is a heuristic that determines the positions at which the key has hex characters, and the random key contains hex characters at those positions as well. Example output for the random read workload: Time: 120 sec, seek/sec: 8290, kv/sec: 30351, kv bytes/sec: 91868121, blk/sec: 10147, unique keys: 232779
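The per-position key generation described above can be sketched as follows. This is an interpretation written for illustration, not the tool's code: `RandomKeySketch` and its methods are hypothetical names, keys are compared as unsigned bytes in lexicographic order, and the hex-character heuristic is left out. At each position the allowed [low, high] range is narrowed only while the key generated so far still matches the prefix of the first or last key.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch of random key generation between two boundary keys. */
final class RandomKeySketch {
    private RandomKeySketch() {}

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /** Returns a key k with first <= k <= last, choosing each byte uniformly in its allowed range. */
    static byte[] randomKey(byte[] first, byte[] last, Random rng) {
        if (compare(first, last) > 0) {
            throw new IllegalArgumentException("first key must not be greater than last key");
        }
        int len = Math.max(first.length, last.length);
        byte[] key = new byte[len];
        boolean tightLow = true;   // key so far equals the prefix of first
        boolean tightHigh = true;  // key so far equals the prefix of last
        for (int i = 0; i < len; i++) {
            if (tightHigh && i >= last.length) {
                // Key equals last so far; any extra byte would exceed it.
                return Arrays.copyOf(key, i);
            }
            int low = (tightLow && i < first.length) ? (first[i] & 0xFF) : 0;
            int high = tightHigh ? (last[i] & 0xFF) : 255;
            int b = low + rng.nextInt(high - low + 1);
            key[i] = (byte) b;
            tightLow = tightLow && i < first.length && b == low;
            tightHigh = tightHigh && b == high;
        }
        return key;
    }
}
```

As the description notes, picking each byte uniformly does not make the resulting keys uniformly distributed over the key space; it is a cheap approximation that keeps every generated key inside the file's key range.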