[jira] [Created] (HBASE-4224) Need a flush by regionserver rather than by table option
Need a flush by regionserver rather than by table option

Key: HBASE-4224
URL: https://issues.apache.org/jira/browse/HBASE-4224
Project: HBase
Issue Type: Bug
Components: shell
Reporter: stack

This evening needed to clean out logs on the cluster. Logs are by regionserver. To let go of logs, we need to have all edits emptied from memory. The only flush is by table or region. We need to be able to flush the regionserver. Need to add this.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4224) Need a flush by regionserver rather than by table option
[ https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086843#comment-13086843 ]

stack commented on HBASE-4224:

We need by regionserver because doing it by table, it's not possible to control how much flush is done. Imagine the cluster has one table only. If you flush it, then all regionservers flush at the same time, killing any other server going on concurrently because of the i/o load on hdfs. If you could do it by regionserver, you'd have some means of doing other than a big-bang flush, and so on.
[jira] [Commented] (HBASE-4224) Need a flush by regionserver rather than by table option
[ https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086846#comment-13086846 ]

stack commented on HBASE-4224:

An important bit I left off the top is that this 'flush by regionserver' also needs to clean up WAL logs after the flushing is done. If everything has been flushed, then we should be able to clean up any outstanding WAL logs.

Scenario is the following. Your hosting service tells you that they are going to do a risky operation that could cause you to lose network connectivity or power to your racks. You know that a disaster may be coming and you want to mitigate the damage. What I'm talking about here is a facility that flushes and cleans up all WAL logs by regionserver (so i/o can be throttled), so that after a bad crash there is little to replay.
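One way to picture the throttled, per-regionserver flush asked for above is a loop that flushes one server at a time and pauses between servers. The following is a minimal sketch only: `ServerFlusher` is a hypothetical callback, not an HBase API (no per-regionserver flush exists at the time of this thread, which is the point of the issue), and the pause length is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: ServerFlusher is a hypothetical callback, not an HBase API.
class RollingFlushSketch {

  /** Hypothetical hook that flushes every region hosted on one regionserver. */
  interface ServerFlusher {
    void flushServer(String serverName);
  }

  /**
   * Flushes servers one at a time, sleeping pauseMillis between servers so the
   * cluster never sees a big-bang flush. Returns the servers flushed, in order.
   */
  static List<String> flushOneByOne(List<String> servers, ServerFlusher flusher,
      long pauseMillis) {
    List<String> done = new ArrayList<String>();
    for (int i = 0; i < servers.size(); i++) {
      String server = servers.get(i);
      flusher.flushServer(server);   // assumed to block until the flush is done
      done.add(server);
      if (i < servers.size() - 1) {
        try {
          Thread.sleep(pauseMillis); // throttle: give HDFS i/o time to settle
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // stop early, keep interrupt status
          break;
        }
      }
    }
    return done;
  }

  public static void main(String[] args) {
    List<String> servers = new ArrayList<String>();
    servers.add("rs1.example.org");
    servers.add("rs2.example.org");
    List<String> done = flushOneByOne(servers, new ServerFlusher() {
      public void flushServer(String serverName) {
        System.out.println("flushing " + serverName);
      }
    }, 10L);
    System.out.println("flushed " + done.size() + " servers");
  }
}
```

The WAL cleanup step described in the comment is not modelled here; it would follow each server's flush.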
[jira] [Resolved] (HBASE-4176) Exposing HBase Filters to the Thrift API
[ https://issues.apache.org/jira/browse/HBASE-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-4176.
Resolution: Fixed
Fix Version/s: 0.92.0
Release Note: Means of specifying filters in thrift (and in shell) by passing a string specification
Hadoop Flags: [Reviewed]

Committed to TRUNK. Thank you for the fat patch and for seeing it through, Anirudh.

Exposing HBase Filters to the Thrift API

Key: HBASE-4176
URL: https://issues.apache.org/jira/browse/HBASE-4176
Project: HBase
Issue Type: Improvement
Components: thrift
Reporter: Anirudh Todi
Assignee: Anirudh Todi
Priority: Minor
Fix For: 0.92.0
Attachments: Filter Language (3).xml, Filter Language(2).docx, Filter Language(2).xml, Filter Language(3).docx, Filter Language.docx, HBASE-4176.patch, book.xml, book2.html, book2.xml

Currently, to use any of the filters, one has to explicitly add a scanner for the filter in the Thrift API, making it messy and long. With this patch, I am trying to add support for all the filters in a clean way. The user specifies a filter via a string. The string is parsed on the server to construct the filter. More information can be found in the attached document named Filter Language.

This patch is trying to extend and further the progress made by the patches in the HBASE-1744 JIRA (https://issues.apache.org/jira/browse/HBASE-1744)
[jira] [Commented] (HBASE-4224) Need a flush by regionserver rather than by table option
[ https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086862#comment-13086862 ]

stack commented on HBASE-4224:

I'm seeing that a script that flushes all tables on the cluster does not bring on a log cleanup when the expectation is that it should.
[jira] [Updated] (HBASE-4095) Hlog may not be rolled in a long time if checkLowReplication's request of LogRoll is blocked
[ https://issues.apache.org/jira/browse/HBASE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jieshan Bean updated HBASE-4095:
Attachment: (was: HBase-4095-V9-branch.patch)

Hlog may not be rolled in a long time if checkLowReplication's request of LogRoll is blocked

Key: HBASE-4095
URL: https://issues.apache.org/jira/browse/HBASE-4095
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.3
Reporter: Jieshan Bean
Assignee: Jieshan Bean
Fix For: 0.90.5
Attachments: HBASE-4095-90-v2.patch, HBASE-4095-90.patch, HBASE-4095-trunk-v2.patch, HBASE-4095-trunk.patch, HBase-4095-V4-Branch.patch, HBase-4095-V5-Branch.patch, HBase-4095-V5-trunk.patch, HBase-4095-V6-branch.patch, HBase-4095-V6-trunk.patch, HBase-4095-V7-branch.patch, HBase-4095-V7-trunk.patch, HBase-4095-V8-branch.patch, HBase-4095-V8-trunk.patch, HlogFileIsVeryLarge.gif, LatestLLTResults-20110810.rar, RelatedLogs2011-07-28.txt, TestResultForPatch-V4.rar, flowChart-IntroductionToThePatch.gif, hbase-root-regionserver-193-195-5-111.rar, surefire-report-V5-trunk.html, surefire-report-branch.html

Some large Hlog files (larger than 10G) appeared in our environment, and I found the reason why they got so huge:

1. The number of replicas is less than the expected number, so the method checkLowReplication is called on each sync.
2. The method checkLowReplication requests a log roll first, and sets logRollRequested to true:
{noformat}
private void checkLowReplication() {
  // if the number of replicas in HDFS has fallen below the initial
  // value, then roll logs.
  try {
    int numCurrentReplicas = getLogReplication();
    if (numCurrentReplicas != 0 &&
        numCurrentReplicas < this.initialReplication) {
      LOG.warn("HDFS pipeline error detected. " +
          "Found " + numCurrentReplicas + " replicas but expecting " +
          this.initialReplication + " replicas. " +
          " Requesting close of hlog.");
      requestLogRoll();
      logRollRequested = true;
    }
  } catch (Exception e) {
    LOG.warn("Unable to invoke DFSOutputStream.getNumCurrentReplicas" + e +
        " still proceeding ahead...");
  }
}
{noformat}
3. requestLogRoll() just submits the roll request. It may not execute in time, because it must first get the unfair lock cacheFlushLock, and that lock may be held by the cache flush threads.
4. logRollRequested stays true until the log roll executes. So during that time, each log-roll request in sync() is skipped.

Here are the logs from when the problem happened (please notice the file size of hlog 193-195-5-111%3A20020.1309937386639 in the last row):

2011-07-06 15:28:59,284 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 2 replicas but expecting 3 replicas. Requesting close of hlog.
2011-07-06 15:29:46,714 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937339119, entries=32434, filesize=239589754. New hlog /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937386639
2011-07-06 15:29:56,929 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 2 replicas but expecting 3 replicas. Requesting close of hlog.
2011-07-06 15:29:56,933 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming flushed file at hdfs://193.195.5.112:9000/hbase/Htable_UFDR_034/a3780cf0c909d8cf8f8ed618b290cc95/.tmp/4656903854447026847 to hdfs://193.195.5.112:9000/hbase/Htable_UFDR_034/a3780cf0c909d8cf8f8ed618b290cc95/value/8603005630220380983
2011-07-06 15:29:57,391 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://193.195.5.112:9000/hbase/Htable_UFDR_034/a3780cf0c909d8cf8f8ed618b290cc95/value/8603005630220380983, entries=445880, sequenceid=248900, memsize=207.5m, filesize=130.1m
2011-07-06 15:29:57,478 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~207.5m for region Htable_UFDR_034,07664,1309936974158.a3780cf0c909d8cf8f8ed618b290cc95. in 10839ms, sequenceid=248900, compaction requested=false
2011-07-06 15:28:59,236 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309926531955, entries=216459, filesize=2370387468. New hlog /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937339119
2011-07-06 15:29:46,714 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937339119, entries=32434, filesize=239589754. New hlog
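The effect described in steps 3 and 4 of the report can be illustrated with a toy model. This is a sketch of the behaviour as the report describes it, not the HLog code: the class and method names are hypothetical, and the roller is simulated by an explicit `onRollFinished()` call.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch only: a toy model of the flag described in the report, not HLog itself.
class LogRollFlagSketch {
  private final AtomicBoolean logRollRequested = new AtomicBoolean(false);
  private int requestsSent = 0;
  private int requestsSkipped = 0;

  /** Called on every sync while replication is low. */
  void onLowReplication() {
    // compareAndSet flips false -> true only once until the flag is reset.
    if (logRollRequested.compareAndSet(false, true)) {
      requestsSent++;      // stands in for requestLogRoll()
    } else {
      requestsSkipped++;   // a roll is already pending, so nothing happens
    }
  }

  /** Called once the roller finally gets the lock and rolls the log. */
  void onRollFinished() {
    logRollRequested.set(false);
  }

  int sent() { return requestsSent; }

  int skipped() { return requestsSkipped; }

  public static void main(String[] args) {
    LogRollFlagSketch log = new LogRollFlagSketch();
    for (int i = 0; i < 1000; i++) {
      log.onLowReplication();   // the roller is blocked the whole time
    }
    System.out.println("sent=" + log.sent() + " skipped=" + log.skipped());
    log.onRollFinished();
    log.onLowReplication();
    System.out.println("sent=" + log.sent() + " skipped=" + log.skipped());
  }
}
```

In this model a single pending request suppresses every later one, so if the roller is held up behind the flush lock, the log keeps growing until the roll finally runs.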
[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086886#comment-13086886 ]

jirapos...@reviews.apache.org commented on HBASE-4027:

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1214/
-----------------------------------------------------------

(Updated 2011-08-18 08:31:59.585101)

Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, Jonathan Gray, and Li Pi.

Changes
-------
Heapsize added. Also addressed Ted's review above. Fixed formatting in HFileBlock (it was inconsistent).

Summary
-------
Review request - I apparently can't edit tlipcon's earlier posting of my diff, so creating a new one.

This addresses bug HBase-4027.
https://issues.apache.org/jira/browse/HBase-4027

Diffs (updated)
-----
conf/hbase-env.sh 2d55d27
src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java 2d4002c
src/main/java/org/apache/hadoop/hbase/io/hfile/CacheStats.java PRE-CREATION
src/main/java/org/apache/hadoop/hbase/io/hfile/Cacheable.java PRE-CREATION
src/main/java/org/apache/hadoop/hbase/io/hfile/CachedBlock.java 3b130d8
src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java PRE-CREATION
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java 097dc50
src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java 1338453
src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java 886c31d
src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java PRE-CREATION
src/main/java/org/apache/hadoop/hbase/io/hfile/slab/Slab.java PRE-CREATION
src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java PRE-CREATION
src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabItemEvictionWatcher.java PRE-CREATION
src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java e2c6c93
src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 7b7bf73
src/main/java/org/apache/hadoop/hbase/util/DirectMemoryUtils.java PRE-CREATION
src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java PRE-CREATION
src/test/java/org/apache/hadoop/hbase/io/hfile/TestCachedBlockQueue.java 1ad2ece
src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java f0a9832
src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSingleSizeCache.java PRE-CREATION
src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlab.java PRE-CREATION
src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlabCache.java PRE-CREATION
src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStoreLAB.java d7e43a0
src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 4387170

Diff: https://reviews.apache.org/r/1214/diff

Testing
-------
Ran benchmarks against it in HBase standalone mode. Wrote test cases for all classes; multithreaded test cases exist for the cache.

Thanks,
Li

Enable direct byte buffers LruBlockCache

Key: HBASE-4027
URL: https://issues.apache.org/jira/browse/HBASE-4027
Project: HBase
Issue Type: Improvement
Reporter: Jason Rutherglen
Assignee: Li Pi
Priority: Minor
Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, HBase-4027.pdf, HBase4027v8.diff, HBase4027v9.diff, hbase-4027-v10.5.diff, hbase-4027-v10.diff, hbase-4027v10.6.diff, hbase-4027v6.diff, hbase4027v11.5.diff, hbase4027v11.6.diff, hbase4027v11.7.diff, hbase4027v11.diff, hbase4027v12.1.diff, hbase4027v12.diff, slabcachepatch.diff, slabcachepatchv2.diff, slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, slabcachepatchv3.diff, slabcachepatchv4.5.diff, slabcachepatchv4.diff

Java offers the creation of direct byte buffers, which are allocated outside of the heap. They need to be manually freed, which can be accomplished using an undocumented {{clean}} method. The feature will be optional. After implementing, we can benchmark for differences in speed and garbage collection observances.
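For readers unfamiliar with direct byte buffers, here is a minimal sketch of allocating one with the standard `java.nio` API and copying a block into it. It is not taken from the patch: the class and method names are made up for illustration, and the buffer is simply left to the garbage collector rather than freed by hand.

```java
import java.nio.ByteBuffer;

// Sketch only: shows off-heap allocation with the standard java.nio API.
class DirectBufferSketch {

  /** Copies a block into a freshly allocated off-heap buffer, ready for reading. */
  static ByteBuffer copyOffHeap(byte[] block) {
    ByteBuffer buf = ByteBuffer.allocateDirect(block.length); // outside the Java heap
    buf.put(block);
    buf.flip();   // switch the buffer from writing to reading
    return buf;
  }

  public static void main(String[] args) {
    ByteBuffer buf = copyOffHeap(new byte[] {1, 2, 3, 4});
    System.out.println("direct=" + buf.isDirect() + " remaining=" + buf.remaining());
  }
}
```

Because the bytes live outside the heap, they add nothing to what the garbage collector has to scan, which is the motivation given in the issue for caching blocks this way.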
[jira] [Updated] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Pi updated HBASE-4027:
Attachment: hbase4027v12.1.diff

Added heapsize, fixed formatting, addressed Ted Yu's reviews.
[jira] [Updated] (HBASE-4095) Hlog may not be rolled in a long time if checkLowReplication's request of LogRoll is blocked
[ https://issues.apache.org/jira/browse/HBASE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jieshan Bean updated HBASE-4095:
Attachment: (was: HBase-4095-V9-trunk.patch)
[jira] [Updated] (HBASE-4095) Hlog may not be rolled in a long time if checkLowReplication's request of LogRoll is blocked
[ https://issues.apache.org/jira/browse/HBASE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jieshan Bean updated HBASE-4095:
Attachment: HBase-4095-V9-branch.patch
            HBase-4095-V9-trunk.patch

I increased the timeout value to 1000. It's enough. Thanks, Ted.
[jira] [Commented] (HBASE-4095) Hlog may not be rolled in a long time if checkLowReplication's request of LogRoll is blocked
[ https://issues.apache.org/jira/browse/HBASE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086890#comment-13086890 ]

Jieshan Bean commented on HBASE-4095:

Sorry, it's 1 not 1000.
[jira] [Commented] (HBASE-4176) Exposing HBase Filters to the Thrift API
[ https://issues.apache.org/jira/browse/HBASE-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086918#comment-13086918 ]

Hudson commented on HBASE-4176:

Integrated in HBase-TRUNK #2123 (See [https://builds.apache.org/job/HBase-TRUNK/2123/])
HBASE-4176 Exposing HBase Filters to the Thrift API
HBASE-4176 Exposing HBase Filters to the Thrift API

stack :
Files :
* /hbase/trunk/src/docbkx/book.xml

stack :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift/generated/Hbase.java
* /hbase/trunk/src/main/ruby/shell/commands/scan.rb
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ColumnPrefixFilter.java
* /hbase/trunk/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueExcludeFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ParseConstants.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/filter/TestParseFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/RowFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift/generated/TScan.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/FamilyFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/DependentColumnFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ColumnCountGetFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ParseFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/TimestampsFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.java
* /hbase/trunk/src/main/ruby/hbase/table.rb
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ValueFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/InclusiveStopFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/MultipleColumnPrefixFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/KeyOnlyFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/QualifierFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/PrefixFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ColumnRangeFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/FilterBase.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ColumnPaginationFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/PageFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/CompareFilter.java
[jira] [Assigned] (HBASE-4215) RS requestsPerSecond counter seems to be off
[ https://issues.apache.org/jira/browse/HBASE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan reassigned HBASE-4215: - Assignee: ramkrishna.s.vasudevan RS requestsPerSecond counter seems to be off Key: HBASE-4215 URL: https://issues.apache.org/jira/browse/HBASE-4215 Project: HBase Issue Type: Bug Components: metrics Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.92.0 In testing trunk, I had YCSB reporting some 40,000 requests/second, but the summary info on the master webpage was consistently indicating somewhere around 3x that. I'm guessing that we may have a bug where we forgot to divide by time. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4215) RS requestsPerSecond counter seems to be off
[ https://issues.apache.org/jira/browse/HBASE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086939#comment-13086939 ] subramanian raghunathan commented on HBASE-4215: As part of the fix for HBASE-3807 we proposed and made a change renaming the request attribute to requestsPerSecond in both the RegionServer (RegionServerMetrics) and the Master (HServerLoad). RegionServerMetrics calculates it from MetricsRate. The following code does the calculation: {code} long now = System.currentTimeMillis(); long diff = (now-ts)/1000; if (diff == 0) diff = 1; // sigh this is crap. this.prevRate = (float)value / diff; {code} {color:red}this.prevRate = (float)value / diff;{color} prevRate is what is finally displayed as requestsPerSecond, as per the change in HBASE-3807. In the master, however, the same figure comes from HServerLoad, which is built in HRegionServer.buildServerLoad(): {code} new HServerLoad(requestCount.get(), (int)(memory.getUsed() / 1024 / 1024), (int) (memory.getMax() / 1024 / 1024), regionLoads) {code} The request counter is a field of HRegionServer: {code} // Request counter. // Do we need this? Can't we just sum region counters? St.Ack 20110412 private AtomicInteger requestCount = new AtomicInteger(); {code} The value is obtained from this request counter, which is incremented in all the APIs of HRegionServer. {color:red}It is not calculated per second; it represents the total number of requests.{color} Yet on the master page we still claim {color:green}Load is requests per second and count of regions loaded.{color} This is what prompted me to change the naming convention from requests to requestsPerSecond. {color:green}Ideally the fix should calculate requestsPerSecond at the region server, initialize HServerLoad with that value, and display the same value in the master.{color} Region Servers Address Start Code Load linux-kxjl:60030 1313659887824 linux-kxjl,60020,1313659887824 requestsPerSecond=0, numberOfOnlineRegions=2, usedHeapMB=26, maxHeapMB=995 Total: servers: 1 requests=0, regions=2 Load is requests per second and count of regions loaded It would also be better to change the aggregation line to the new convention, from {color:red} requests=0, regions=2{color} to {color:green} requestsPerSecond=0, numberOfOnlineRegions=2{color}. If this looks fine I can provide a patch for the same. RS requestsPerSecond counter seems to be off Key: HBASE-4215 URL: https://issues.apache.org/jira/browse/HBASE-4215 Project: HBase Issue Type: Bug Components: metrics Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.92.0 In testing trunk, I had YCSB reporting some 40,000 requests/second, but the summary info on the master webpage was consistently indicating somewhere around 3x that. I'm guessing that we may have a bug where we forgot to divide by time. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
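The fix proposed in the comment above (compute the rate on the region server, then hand that value to the master) might look roughly like the sketch below. It is only an illustration under assumptions: `RequestRateSampler`, `increment` and `sample` are hypothetical names, not HBase classes or methods, and the sketch keeps elapsed time in milliseconds so that it does not need the "diff == 0" workaround from the quoted snippet.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper, not an HBase class: turns a running request counter into a per-second rate. */
final class RequestRateSampler {
  private final AtomicLong requestCount = new AtomicLong();
  private long lastCount = 0;
  private long lastSampleMillis;

  RequestRateSampler(long startMillis) {
    this.lastSampleMillis = startMillis;
  }

  /** Called once per handled request. */
  void increment() {
    requestCount.incrementAndGet();
  }

  /**
   * Returns requests per second since the previous sample.
   * Elapsed time stays in milliseconds, so a short interval is not
   * truncated to zero seconds before the division.
   */
  synchronized float sample(long nowMillis) {
    long count = requestCount.get();
    long elapsedMillis = Math.max(1, nowMillis - lastSampleMillis);
    float rate = (count - lastCount) * 1000f / elapsedMillis;
    lastCount = count;
    lastSampleMillis = nowMillis;
    return rate;
  }
}
```

With a helper like this, the value passed to the load report would be the result of `sample(...)` rather than the raw counter, so the master and region server pages would show the same unit.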
[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)
[ https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087012#comment-13087012 ] Ted Yu commented on HBASE-4213: --- I wonder if the following check is accurate because servers and rsCount are both returned from ZKUtil methods: {code} +int rsCount = ZKUtil.getNumberOfChildren(this.watcher, path); +if (servers != null && servers.size() >= rsCount) { {code} In TestInstantSchemaChange: {code} miniHBaseCluster = TEST_UTIL.startMiniCluster(3); {code} I added rsCount in your log statement. Then I saw the following in org.apache.hadoop.hbase.client.TestInstantSchemaChange-output.txt: {code} tyumac:trunk tyu$ grep ' region servers have successfully ' target/surefire-reports/org.apache.hadoop.hbase.client.TestInstantSchemaChange-output.txt 2011-08-17 21:54:54,995 DEBUG [main-EventThread] zookeeper.MasterSchemaChangeTracker(69): All 2 region servers have successfully processed the schema changes for table = testSchemachange . Deleting the schema change node = /hbase/schema/testSchemachange 2011-08-17 21:54:57,459 DEBUG [main-EventThread] zookeeper.MasterSchemaChangeTracker(69): All 2 region servers have successfully processed the schema changes for table = testSchemachangeForAddColumn . Deleting the schema change node = /hbase/schema/testSchemachangeForAddColumn 2011-08-17 21:54:58,120 DEBUG [main-EventThread] zookeeper.MasterSchemaChangeTracker(69): All 2 region servers have successfully processed the schema changes for table = testSchemachangeForModifyColumn . Deleting the schema change node = /hbase/schema/testSchemachangeForModifyColumn 2011-08-17 21:54:58,330 DEBUG [main-EventThread] zookeeper.MasterSchemaChangeTracker(69): All 3 region servers have successfully processed the schema changes for table = testSchemachangeNode . Deleting the schema change node = /hbase/schema/testSchemachangeNode {code} Looks like master might delete schema change node prematurely. 
Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign) Key: HBASE-4213 URL: https://issues.apache.org/jira/browse/HBASE-4213 Project: HBase Issue Type: Improvement Reporter: Subbu M Iyer Assignee: Subbu M Iyer Fix For: 0.92.0 Attachments: HBASE-4213-Instant_schema_change.patch This Jira is a slight variation in approach to what is being done as part of https://issues.apache.org/jira/browse/HBASE-1730 Support instant schema updates such as Modify Table, Add Column, Modify Column operations: 1. With out enable/disabling the table. 2. With out bulk unassign/assign of regions. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4215) RS requestsPerSecond counter seems to be off
[ https://issues.apache.org/jira/browse/HBASE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087014#comment-13087014 ] subramanian raghunathan commented on HBASE-4215: Further to this understanding, I was planning to unify the way requestsPerSecond is derived, so that HRegionServer uses the RegionServerMetrics API (i.e. RegionServerMetrics.getRequests()) to initialise HServerLoad. One disparity I found is that metricsState uses a float value whereas HServerLoad uses an integer value to represent the same thing. It would be better to unify the data type across them so that the master and region server UIs are in sync. Please suggest which of them should be used; I personally feel float is better, but it needs more changes. Based on that I will provide the patch tomorrow. RS requestsPerSecond counter seems to be off Key: HBASE-4215 URL: https://issues.apache.org/jira/browse/HBASE-4215 Project: HBase Issue Type: Bug Components: metrics Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.92.0 In testing trunk, I had YCSB reporting some 40,000 requests/second, but the summary info on the master webpage was consistently indicating somewhere around 3x that. I'm guessing that we may have a bug where we forgot to divide by time. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4015) Refactor the TimeoutMonitor to make it less racy
[ https://issues.apache.org/jira/browse/HBASE-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087051#comment-13087051 ] ramkrishna.s.vasudevan commented on HBASE-4015: --- @Stack Completed the overall changes with OFFLINE state. (I think so) :) Will upload the patch tomorrow for your review. Thanks in advance. Refactor the TimeoutMonitor to make it less racy Key: HBASE-4015 URL: https://issues.apache.org/jira/browse/HBASE-4015 Project: HBase Issue Type: Sub-task Affects Versions: 0.90.3 Reporter: Jean-Daniel Cryans Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.92.0 Attachments: HBASE-4015_1_trunk.patch, Timeoutmonitor with state diagrams.pdf The current implementation of the TimeoutMonitor acts like a race condition generator, mostly making things worse rather than better. It does its own thing for a while without caring for what's happening in the rest of the master. The first thing that needs to happen is that the regions should not be processed in one big batch, because that sometimes can take minutes to process (meanwhile a region that timed out opening might have opened, then what happens is it will be reassigned by the TimeoutMonitor generating the never ending PENDING_OPEN situation). Those operations should also be done more atomically, although I'm not sure how to do it in a scalable way in this case. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4223) Support the ability to return a set of rows using Coprocessors
[ https://issues.apache.org/jira/browse/HBASE-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087091#comment-13087091 ] Andrew Purtell commented on HBASE-4223: --- Another option could be to spill the result into a temporary file in the region and then allow clients to scan over them using the normal scanner interface, setting some attribute on the Scan object to indicate what you want? There wouldn't be a result set size limitation in this case. Support the ability to return a set of rows using Coprocessors -- Key: HBASE-4223 URL: https://issues.apache.org/jira/browse/HBASE-4223 Project: HBase Issue Type: Improvement Components: coprocessors Affects Versions: 0.92.0 Reporter: Nichole Treadway Priority: Minor Attachments: HBASE-4223.patch Currently HBase supports returning the results of aggregation operations using coprocessors with the AggregationClient. It would be useful to include a client and implementation which would return a set of rows which match a certain criteria using coprocessors as well. We have a use case in our business process for this. We have an initial implementation of this, which I've attached. The only limitation that we've found is that it cannot be used to return very large sets of rows. If the result set is very large, it would probably require some sort of pagination. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4222) Make HLog more resilient to write pipeline failures
[ https://issues.apache.org/jira/browse/HBASE-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087093#comment-13087093 ] Andrew Purtell commented on HBASE-4222: --- +1 We've tested this on EC2 clusters and it works. Make HLog more resilient to write pipeline failures --- Key: HBASE-4222 URL: https://issues.apache.org/jira/browse/HBASE-4222 Project: HBase Issue Type: Improvement Components: wal Reporter: Gary Helmling Assignee: Gary Helmling Fix For: 0.92.0 The current implementation of HLog rolling to recover from transient errors in the write pipeline seems to have two problems: # When {{HLog.LogSyncer}} triggers an {{IOException}} during time-based sync operations, it triggers a log rolling request in the corresponding catch block, but only after escaping from the internal while loop. As a result, the {{LogSyncer}} thread will exit and never be restarted from what I can tell, even if the log rolling was successful. # Log rolling requests triggered by an {{IOException}} in {{sync()}} or {{append()}} never happen if no entries have yet been written to the log. This means that write errors are not immediately recovered, which extends the exposure to more errors occurring in the pipeline. In addition, it seems like we should be able to better handle transient problems, like a rolling restart of DataNodes while the HBase RegionServers are running. Currently this will reliably cause RegionServer aborts during log rolling: either an append or time-based sync triggers an initial {{IOException}}, initiating a log rolling request. However the log rolling then fails in closing the current writer (All datanodes are bad), causing a RegionServer abort. In this case, it seems like we should at least allow you an option to continue with the new writer and only abort on subsequent errors. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
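The first problem in the description above (the syncer thread exits because the exception is caught only after leaving the loop) can be illustrated with a small sketch. Everything here is a hypothetical stand-in: `SyncLoopSketch` and `Syncable` are not the real `HLog.LogSyncer`, and the loop runs a fixed number of iterations instead of sleeping between syncs. It only shows the shape of a fix where the catch block sits inside the loop.

```java
import java.io.IOException;

/** Hypothetical stand-in for a periodic syncer thread; not the real HLog.LogSyncer. */
final class SyncLoopSketch {
  interface Syncable {
    void sync() throws IOException;
  }

  int rollRequests = 0;    // how many times a log roll was requested
  int successfulSyncs = 0; // how many syncs completed normally

  /** Runs a fixed number of iterations; a failed sync requests a roll but does not end the loop. */
  void run(Syncable log, int iterations) {
    for (int i = 0; i < iterations; i++) {
      try {
        log.sync();
        successfulSyncs++;
      } catch (IOException e) {
        // The catch sits inside the loop, so the syncer keeps running
        // after the roll request instead of exiting for good.
        rollRequests++;
      }
    }
  }
}
```

Whether the real thread should also back off or give up after repeated failures is a separate design question that this sketch does not address.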
[jira] [Commented] (HBASE-4222) Make HLog more resilient to write pipeline failures
[ https://issues.apache.org/jira/browse/HBASE-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087094#comment-13087094 ] Andrew Purtell commented on HBASE-4222: --- I presume a patch or RB post is coming soon. :-) Make HLog more resilient to write pipeline failures --- Key: HBASE-4222 URL: https://issues.apache.org/jira/browse/HBASE-4222 Project: HBase Issue Type: Improvement Components: wal Reporter: Gary Helmling Assignee: Gary Helmling Fix For: 0.92.0 The current implementation of HLog rolling to recover from transient errors in the write pipeline seems to have two problems: # When {{HLog.LogSyncer}} triggers an {{IOException}} during time-based sync operations, it triggers a log rolling request in the corresponding catch block, but only after escaping from the internal while loop. As a result, the {{LogSyncer}} thread will exit and never be restarted from what I can tell, even if the log rolling was successful. # Log rolling requests triggered by an {{IOException}} in {{sync()}} or {{append()}} never happen if no entries have yet been written to the log. This means that write errors are not immediately recovered, which extends the exposure to more errors occurring in the pipeline. In addition, it seems like we should be able to better handle transient problems, like a rolling restart of DataNodes while the HBase RegionServers are running. Currently this will reliably cause RegionServer aborts during log rolling: either an append or time-based sync triggers an initial {{IOException}}, initiating a log rolling request. However the log rolling then fails in closing the current writer (All datanodes are bad), causing a RegionServer abort. In this case, it seems like we should at least allow you an option to continue with the new writer and only abort on subsequent errors. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3807) Fix units in RS UI metrics
[ https://issues.apache.org/jira/browse/HBASE-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-3807: - Assignee: subramanian raghunathan Assigning to Subramanian. Fix units in RS UI metrics -- Key: HBASE-3807 URL: https://issues.apache.org/jira/browse/HBASE-3807 Project: HBase Issue Type: Bug Reporter: stack Assignee: subramanian raghunathan Fix For: 0.92.0 Attachments: HBASE-3807_trunk.patch Currently the metrics are a mix of MB and bytes. It's confusing. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)
[ https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087110#comment-13087110 ] Subbu M Iyer commented on HBASE-4213: - They are checking different counts. One counts the number of active RS at this moment (based on the number of children of /hbase/rs) and the other counts the number of RS that have successfully processed the schema change (based on the number of children of /hbase/schema/table name). If the number of RS that have processed the schema change is >= the number of active RS at this moment, then the master presumes that all RS have acknowledged the schema change and hence goes ahead with the delete operation. On Thu, Aug 18, 2011 at 6:45 AM, Ted Yu (JIRA) j...@apache.org wrote: Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign) Key: HBASE-4213 URL: https://issues.apache.org/jira/browse/HBASE-4213 Project: HBase Issue Type: Improvement Reporter: Subbu M Iyer Assignee: Subbu M Iyer Fix For: 0.92.0 Attachments: HBASE-4213-Instant_schema_change.patch This Jira is a slight variation in approach to what is being done as part of https://issues.apache.org/jira/browse/HBASE-1730 Support instant schema updates such as Modify Table, Add Column, Modify Column operations: 1. With out enable/disabling the table. 2. With out bulk unassign/assign of regions. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
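The acknowledgement rule described in this comment can be written down as a tiny sketch. The names (`SchemaChangeAckCheck`, `allServersAcknowledged`) are hypothetical and not taken from the patch, and the sketch assumes the intended comparison is "at least as many acknowledgements as active servers". The two arguments must come from different places, the per-table schema-change node and the region server node; reading both from the same path is what the earlier review comment questioned.

```java
import java.util.List;

/** Hypothetical sketch of the master-side check; names are illustrative, not from the patch. */
final class SchemaChangeAckCheck {
  /**
   * @param processedServers children of the per-table schema-change node (may be null)
   * @param activeServerCount number of children of the region server node
   * @return true when every currently active region server has acknowledged the change
   */
  static boolean allServersAcknowledged(List<String> processedServers, int activeServerCount) {
    return processedServers != null && processedServers.size() >= activeServerCount;
  }
}
```

Note that the comparison is only as good as its inputs: if a region server joins between the two reads, the counts can still disagree.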
[jira] [Commented] (HBASE-4223) Support the ability to return a set of rows using Coprocessors
[ https://issues.apache.org/jira/browse/HBASE-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087111#comment-13087111 ] stack commented on HBASE-4223: -- This patch is very interesting. It seems to be adding new 'functionality' to coprocessors. Sweet. nit: lines lengths in the rest of the code base are generally 80 chars and tabs are two spaces. This is a sort of cache Nichole? {code} this.kvs = finalResults.toArray(new KeyValue[results.size()]); {code} ... so you'd have to do a getList before you'd get anything out of a getMap? Is this interdependency doc'd anywhere? (Similar with getNoVersionMap and dependency on getList -- isEmpty too) I was wondering about the getMap running serverside? Maybe just do this in client? We're going to have to shuttle the payload across regardless. Yeah, paging though I'd suggest the paging be size based rather than element count. I don't know much about how CPs work but is there means of keeping state between client and server (with resources let go if client goes away?) Support the ability to return a set of rows using Coprocessors -- Key: HBASE-4223 URL: https://issues.apache.org/jira/browse/HBASE-4223 Project: HBase Issue Type: Improvement Components: coprocessors Affects Versions: 0.92.0 Reporter: Nichole Treadway Priority: Minor Attachments: HBASE-4223.patch Currently HBase supports returning the results of aggregation operations using coprocessors with the AggregationClient. It would be useful to include a client and implementation which would return a set of rows which match a certain criteria using coprocessors as well. We have a use case in our business process for this. We have an initial implementation of this, which I've attached. The only limitation that we've found is that it cannot be used to return very large sets of rows. If the result set is very large, it would probably require some sort of pagination. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)
[ https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087112#comment-13087112 ] Ted Yu commented on HBASE-4213: --- Thanks for the explanation. Tracing where path variable {code} +int rsCount = ZKUtil.getNumberOfChildren(this.watcher, path); {code} comes from: {code} +if (path.startsWith(watcher.schemaZNode) {code} It doesn't align with your description. Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign) Key: HBASE-4213 URL: https://issues.apache.org/jira/browse/HBASE-4213 Project: HBase Issue Type: Improvement Reporter: Subbu M Iyer Assignee: Subbu M Iyer Fix For: 0.92.0 Attachments: HBASE-4213-Instant_schema_change.patch This Jira is a slight variation in approach to what is being done as part of https://issues.apache.org/jira/browse/HBASE-1730 Support instant schema updates such as Modify Table, Add Column, Modify Column operations: 1. With out enable/disabling the table. 2. With out bulk unassign/assign of regions. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4015) Refactor the TimeoutMonitor to make it less racy
[ https://issues.apache.org/jira/browse/HBASE-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087120#comment-13087120 ] stack commented on HBASE-4015: -- @Ram You are a good man. Refactor the TimeoutMonitor to make it less racy Key: HBASE-4015 URL: https://issues.apache.org/jira/browse/HBASE-4015 Project: HBase Issue Type: Sub-task Affects Versions: 0.90.3 Reporter: Jean-Daniel Cryans Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.92.0 Attachments: HBASE-4015_1_trunk.patch, Timeoutmonitor with state diagrams.pdf The current implementation of the TimeoutMonitor acts like a race condition generator, mostly making things worse rather than better. It does its own thing for a while without caring for what's happening in the rest of the master. The first thing that needs to happen is that the regions should not be processed in one big batch, because that sometimes can take minutes to process (meanwhile a region that timed out opening might have opened, then what happens is it will be reassigned by the TimeoutMonitor generating the never ending PENDING_OPEN situation). Those operations should also be done more atomically, although I'm not sure how to do it in a scalable way in this case. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4071) Data GC: Remove all versions TTL EXCEPT the last written version
[ https://issues.apache.org/jira/browse/HBASE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087156#comment-13087156 ] Lars Hofhansl commented on HBASE-4071: -- Again sorry for the review churn. Updates were posted even after I removed the jira from the review :( Quick update... Since there are so many actors involved in this (Store, Region, ScanQueryMatcher, ColumnTrackers), all with slightly different intricate logic, I think abstracting this out into an interface will either not make it nicer, or require me to rewrite the entire logic. Instead I unified both the TTL and Versioning logic inside the ColumnTrackers, while still giving the trackers a chance to bail out early. That made it simpler, and will hopefully make it easier in the future to abstract this further (I think that needs to be coordinated with the Compaction Coprocessor work). One problem I encountered is Store.getRowKeyAtOrBefore. That currently honors TTL but not MaxVersions (which is strange). I'm thinking I'll either leave that alone, or have it also not honor TTL when the store has minversions. Fixing this correctly in all cases would mean scanning all relevant KVs in the Memstore (i.e. ignoring TTL and version restrictions), then using those candidates to scan the storefiles (now honoring TTL and doing the version counting). Added a new test that validates the basic behavior. Running tests now. (I seem to have a hard time getting a full test run through locally, with or without my patch.) Will attach a new patch soon that should still be considered a sketch but should hold up to a bit more scrutiny. Data GC: Remove all versions TTL EXCEPT the last written version -- Key: HBASE-4071 URL: https://issues.apache.org/jira/browse/HBASE-4071 Project: HBase Issue Type: New Feature Reporter: stack Attachments: MinVersions.diff We were chatting today about our backup cluster. 
What we want is to be able to restore the dataset from any point of time but only within a limited timeframe -- say one week. Thereafter, if the versions are older than one week, rather than as we do with TTL where we let go of all versions older than TTL, instead, let go of all versions EXCEPT the last one written. So, its like versions==1 when TTL one week. We want to allow that if an error is caught within a week of its happening -- user mistakenly removes a critical table -- then we'll be able to restore up the the moment just before catastrophe hit otherwise, we keep one version only. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
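The retention rule asked for in this issue (expire versions past the TTL, but never drop the newest ones) can be sketched as below. This is an illustration under assumptions, not the ColumnTracker logic from the attached diff: `MinVersionsRetention`, `retain` and the `minVersions` parameter are hypothetical names, and the sketch works on bare timestamps of a single cell, newest first.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical retention rule; not the ColumnTracker code from the patch. */
final class MinVersionsRetention {
  /**
   * @param timestampsNewestFirst version timestamps of one cell, newest first
   * @param now current time in milliseconds
   * @param ttlMillis versions older than this are considered expired
   * @param minVersions number of newest versions kept even when expired
   * @return the timestamps that survive, in the same order
   */
  static List<Long> retain(List<Long> timestampsNewestFirst, long now, long ttlMillis,
      int minVersions) {
    List<Long> kept = new ArrayList<>();
    for (int i = 0; i < timestampsNewestFirst.size(); i++) {
      long ts = timestampsNewestFirst.get(i);
      boolean withinTtl = now - ts <= ttlMillis;
      boolean protectedByMinVersions = i < minVersions;
      if (withinTtl || protectedByMinVersions) {
        kept.add(ts);
      }
    }
    return kept;
  }
}
```

With `minVersions` set to 1 this matches the "versions==1 once past TTL" behaviour described above; with 0 it degenerates to plain TTL expiry.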
[jira] [Commented] (HBASE-4175) Fix FSUtils.createTableDescriptor()
[ https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087165#comment-13087165 ] stack commented on HBASE-4175: -- +1 Fix FSUtils.createTableDescriptor() --- Key: HBASE-4175 URL: https://issues.apache.org/jira/browse/HBASE-4175 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Ted Yu Assignee: ramkrishna.s.vasudevan Attachments: HBASE-4175.patch, HBASE-4175_1.patch, HBASE-4175_2_with catch block.patch, HBASE-4175_2_without catch block.patch, HBASE-4175_3.patch Currently createTableDescriptor() doesn't return anything. The caller wouldn't know whether the descriptor is created or not. See exception handling: {code} } catch(IOException ioe) { LOG.info("IOException while trying to create tableInfo in HDFS", ioe); } {code} We should return a boolean. If the table descriptor exists already, maybe we should deserialize from hdfs and compare with htableDescriptor argument. If they differ, I am not sure what the proper action would be. Maybe we can add a boolean argument, force, to createTableDescriptor(). When force is true, existing table descriptor would be overwritten. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
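The proposal in the description (return a boolean and add a force flag) can be sketched as follows. This is not the committed patch: `TableDescriptorWriter` is a hypothetical class, the descriptor is a plain string, and the sketch writes to the local filesystem through `java.nio.file` rather than to HDFS, purely so that it is self-contained.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch on the local filesystem; the real method writes to HDFS. */
final class TableDescriptorWriter {
  /**
   * @return true if the descriptor file was written, false if it already
   *         existed and force was false
   * @throws IOException if the write itself fails, so the caller can tell
   *         a failure apart from "already exists"
   */
  static boolean createTableDescriptor(Path tableInfoFile, String descriptor, boolean force)
      throws IOException {
    if (Files.exists(tableInfoFile) && !force) {
      return false; // leave the existing descriptor untouched
    }
    // Default options create the file or truncate an existing one.
    Files.write(tableInfoFile, descriptor.getBytes(StandardCharsets.UTF_8));
    return true;
  }
}
```

Propagating the exception instead of logging and swallowing it is one possible design; the issue itself leaves open what to do when an existing descriptor differs from the argument.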
[jira] [Created] (HBASE-4225) NoSuchColumnFamilyException in multi doesn't say which family is bad
NoSuchColumnFamilyException in multi doesn't say which family is bad Key: HBASE-4225 URL: https://issues.apache.org/jira/browse/HBASE-4225 Project: HBase Issue Type: Improvement Affects Versions: 0.90.4 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.90.5 It's kind of a dumb one, in HRegion.doMiniBatchPut we do: {code} LOG.warn("No such column family in batch put", nscf); batchOp.retCodes[lastIndexExclusive] = OperationStatusCode.BAD_FAMILY; {code} So we lose the family here, all we know is there's a bad one, that's what's in HRS.multi: {code} } else if (code == OperationStatusCode.BAD_FAMILY) { result = new NoSuchColumnFamilyException(); {code} We can't just throw the exception like that, we need to say which one is bad even if it requires testing all passed MultiActions. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
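One way to carry the offending family name into the exception, as the issue asks, is sketched below. The names are assumptions: `FamilyCheckSketch` and `checkFamilies` are hypothetical, and the nested `NoSuchColumnFamilyException` is a local stand-in rather than the HBase class. The sketch simply compares the requested families against the table's families and reports every missing one in the message.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical sketch; the exception class here is a local stand-in, not HBase's. */
final class FamilyCheckSketch {
  static final class NoSuchColumnFamilyException extends Exception {
    NoSuchColumnFamilyException(String message) {
      super(message);
    }
  }

  /** Throws with the names of all requested families that the table does not have. */
  static void checkFamilies(Set<String> tableFamilies, Set<String> requestedFamilies)
      throws NoSuchColumnFamilyException {
    Set<String> missing = new LinkedHashSet<>(requestedFamilies);
    missing.removeAll(tableFamilies);
    if (!missing.isEmpty()) {
      throw new NoSuchColumnFamilyException("Column families " + missing + " do not exist");
    }
  }
}
```

In the batch path this would mean either keeping the family name alongside the BAD_FAMILY status code or re-checking the failed action when the exception is built.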
[jira] [Commented] (HBASE-4175) Fix FSUtils.createTableDescriptor()
[ https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087219#comment-13087219 ] Ted Yu commented on HBASE-4175: --- Integrated to TRUNK. Thanks for the patch Ramkrishna. Thanks for the review Stack. Fix FSUtils.createTableDescriptor() --- Key: HBASE-4175 URL: https://issues.apache.org/jira/browse/HBASE-4175 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Ted Yu Assignee: ramkrishna.s.vasudevan Attachments: HBASE-4175.patch, HBASE-4175_1.patch, HBASE-4175_2_with catch block.patch, HBASE-4175_2_without catch block.patch, HBASE-4175_3.patch Currently createTableDescriptor() doesn't return anything. The caller wouldn't know whether the descriptor is created or not. See exception handling: {code} } catch(IOException ioe) { LOG.info("IOException while trying to create tableInfo in HDFS", ioe); } {code} We should return a boolean. If the table descriptor exists already, maybe we should deserialize from hdfs and compare with htableDescriptor argument. If they differ, I am not sure what the proper action would be. Maybe we can add a boolean argument, force, to createTableDescriptor(). When force is true, existing table descriptor would be overwritten. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087233#comment-13087233 ] jirapos...@reviews.apache.org commented on HBASE-4027: -- bq. On 2011-08-17 04:44:54, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/io/hfile/Cacheable.java, line 12 bq. https://reviews.apache.org/r/1214/diff/13/?file=32972#file32972line12 bq. bq. Please add javadoc for these methods. Done. bq. On 2011-08-17 04:44:54, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java, line 92 bq. https://reviews.apache.org/r/1214/diff/13/?file=32974#file32974line92 bq. bq. This shows that Cacheable can reside in on-heap cache. bq. The description for Cacheable should be refined. Done. Cacheable can reside in either. bq. On 2011-08-17 04:44:54, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java, line 79 bq. https://reviews.apache.org/r/1214/diff/13/?file=32975#file32975line79 bq. bq. HeapSize is covered by Cacheable so is not needed here. Done. bq. On 2011-08-17 04:44:54, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java, line 1528 bq. https://reviews.apache.org/r/1214/diff/13/?file=32975#file32975line1528 bq. bq. Indentation is incorrect here. Done. bq. On 2011-08-17 04:44:54, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java, line 1535 bq. https://reviews.apache.org/r/1214/diff/13/?file=32975#file32975line1535 bq. bq. Since HFileBlock implements Cacheable, people may get confused by what 'selfWithoutByteBuffer' means. bq. Please add javadoc. Done. bq. On 2011-08-17 04:44:54, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java, line 1540 bq. https://reviews.apache.org/r/1214/diff/13/?file=32975#file32975line1540 bq. bq. Please add javadoc for what this method does. Done. bq. On 2011-08-17 04:44:54, Ted Yu wrote: bq. 
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java, line 1541 bq. https://reviews.apache.org/r/1214/diff/13/?file=32975#file32975line1541 bq. bq. Whitespace. Done. bq. On 2011-08-17 04:44:54, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java, line 1554 bq. https://reviews.apache.org/r/1214/diff/13/?file=32975#file32975line1554 bq. bq. Indentation. Done. - Li --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1214/#review1489 --- On 2011-08-18 08:31:59, Li Pi wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/1214/ bq. --- bq. bq. (Updated 2011-08-18 08:31:59) bq. bq. bq. Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, Jonathan Gray, and Li Pi. bq. bq. bq. Summary bq. --- bq. bq. Review request - I apparently can't edit tlipcon's earlier posting of my diff, so creating a new one. bq. bq. bq. This addresses bug HBase-4027. bq. https://issues.apache.org/jira/browse/HBase-4027 bq. bq. bq. Diffs bq. - bq. bq.conf/hbase-env.sh 2d55d27 bq.src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java 2d4002c bq.src/main/java/org/apache/hadoop/hbase/io/hfile/CacheStats.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/io/hfile/Cacheable.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/io/hfile/CachedBlock.java 3b130d8 bq.src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java 097dc50 bq.src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java 1338453 bq.src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java 886c31d bq.src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/io/hfile/slab/Slab.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java PRE-CREATION bq. 
src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabItemEvictionWatcher.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java e2c6c93 bq.src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 7b7bf73 bq.src/main/java/org/apache/hadoop/hbase/util/DirectMemoryUtils.java PRE-CREATION bq.src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java PRE-CREATION bq.
[jira] [Created] (HBASE-4226) HFileBlock.java style cleanup.
HFileBlock.java style cleanup. -- Key: HBASE-4226 URL: https://issues.apache.org/jira/browse/HBASE-4226 Project: HBase Issue Type: Improvement Reporter: Li Pi Assignee: Li Pi Priority: Trivial Just a simple style cleanup of HFileBlock.java. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4199) blockCache summary - backend
[ https://issues.apache.org/jira/browse/HBASE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087244#comment-13087244 ] Doug Meil commented on HBASE-4199: -- Great! Does this need more +1's, or can it be committed? blockCache summary - backend Key: HBASE-4199 URL: https://issues.apache.org/jira/browse/HBASE-4199 Project: HBase Issue Type: Sub-task Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: java_HBASE_4199.patch, java_HBASE_4199_v2.patch, java_HBASE_4199_v3.patch This is the backend work for the blockCache summary. Change to BlockCache interface, Summarization in LruBlockCache, BlockCacheSummaryEntry, addition to HRegionInterface, and HRegionServer. This will NOT include any of the web UI or anything else like that. That is for another sub-task. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4095) Hlog may not be rolled in a long time if checkLowReplication's request of LogRoll is blocked
[ https://issues.apache.org/jira/browse/HBASE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087254#comment-13087254 ] stack commented on HBASE-4095: -- Thank you for your persistence, Jieshan, in trying to get this patch in. I'm basically +1 on v9. Below is a small question.
{code}
-this.logRollRequested = false;
+this.logRollRunning = false;
{code}
Should this.logRollRunning instead be reset in a finally block here in rollWriter? If an exception is thrown after we set logRollRunning, it looks like logRollRunning could stay set. My guess is that not doing this would probably not be noticed, in that we probably crash out the regionserver if a rollWriter fails, but having the flag stuck set might make for some unexpected state?
So, it looks like we'll roll 5 times by default before we turn off the low-replication log rolling facility -- which is better than a log per sync, right?
You seem to lose some 'liveness' as regards the file replication setting. In the code before this patch, when rollWriter ran, it'd ask the FS what the replication on the new writer is, and that would be the replication to use going forward. See here:
{code}
- int nextInitialReplication = fs.getFileStatus(newPath).getReplication();
{code}
Instead you set the replication once on instantiation of HLog. See here:
{code}
+this.minTolerableReplication = conf.getInt(
+"hbase.regionserver.hlog.tolerable.lowreplication",
+this.fs.getDefaultReplication());
{code}
.. and rather than ask for the file's replication you use the filesystem default value. Do you have a good reason for changing this behavior? Thanks Jieshan.
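To make the finally-block question above concrete, here is a minimal sketch. It is not HLog code: the class RollGuard, the rollWriter(Runnable) signature and the use of an AtomicBoolean are hypothetical stand-ins, and only the flag name logRollRunning is taken from the patch excerpt.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the flag handling discussed above; not actual HLog code. */
final class RollGuard {
  // Flag name borrowed from the patch; the rest of this class is an assumption.
  private final AtomicBoolean logRollRunning = new AtomicBoolean(false);

  /** Runs the roll body and always clears the flag, even if the body throws. */
  void rollWriter(Runnable rollBody) {
    logRollRunning.set(true);
    try {
      rollBody.run();
    } finally {
      // Without the finally, an exception in rollBody would leave the flag stuck at true.
      logRollRunning.set(false);
    }
  }

  boolean isLogRollRunning() {
    return logRollRunning.get();
  }
}
```

With this shape, a failed roll still propagates its exception to the caller, but a later check of the flag sees false rather than a stale true.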
Hlog may not be rolled in a long time if checkLowReplication's request of LogRoll is blocked Key: HBASE-4095 URL: https://issues.apache.org/jira/browse/HBASE-4095 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.3 Reporter: Jieshan Bean Assignee: Jieshan Bean Fix For: 0.90.5 Attachments: HBASE-4095-90-v2.patch, HBASE-4095-90.patch, HBASE-4095-trunk-v2.patch, HBASE-4095-trunk.patch, HBase-4095-V4-Branch.patch, HBase-4095-V5-Branch.patch, HBase-4095-V5-trunk.patch, HBase-4095-V6-branch.patch, HBase-4095-V6-trunk.patch, HBase-4095-V7-branch.patch, HBase-4095-V7-trunk.patch, HBase-4095-V8-branch.patch, HBase-4095-V8-trunk.patch, HBase-4095-V9-branch.patch, HBase-4095-V9-trunk.patch, HlogFileIsVeryLarge.gif, LatestLLTResults-20110810.rar, RelatedLogs2011-07-28.txt, TestResultForPatch-V4.rar, flowChart-IntroductionToThePatch.gif, hbase-root-regionserver-193-195-5-111.rar, surefire-report-V5-trunk.html, surefire-report-branch.html
Some large Hlog files (larger than 10G) appeared in our environment, and I found the reason why they got so huge:
1. The number of replicas is less than the expected number, so the method checkLowReplication is called on each sync.
2. The method checkLowReplication requests a log roll first, and sets logRollRequested to true:
{noformat}
private void checkLowReplication() {
  // if the number of replicas in HDFS has fallen below the initial
  // value, then roll logs.
  try {
    int numCurrentReplicas = getLogReplication();
    if (numCurrentReplicas != 0 &&
        numCurrentReplicas < this.initialReplication) {
      LOG.warn("HDFS pipeline error detected. " +
          "Found " + numCurrentReplicas + " replicas but expecting " +
          this.initialReplication + " replicas. " +
          " Requesting close of hlog.");
      requestLogRoll();
      logRollRequested = true;
    }
  } catch (Exception e) {
    LOG.warn("Unable to invoke DFSOutputStream.getNumCurrentReplicas" + e +
        " still proceeding ahead...");
  }
}
{noformat}
3. requestLogRoll() just submits the roll request.
It may not execute in time, because it must first acquire the unfair lock cacheFlushLock, and that lock may be held by the cache-flush threads.
4. logRollRequested stays true until the log roll executes, so during that time each log-roll request in sync() is skipped.
Here are the logs from when the problem happened (please notice the file size of hlog 193-195-5-111%3A20020.1309937386639 in the last row):
2011-07-06 15:28:59,284 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 2 replicas but expecting 3 replicas. Requesting close of hlog.
2011-07-06 15:29:46,714 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937339119, entries=32434, filesize=239589754. New hlog /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937386639
2011-07-06 15:29:56,929 WARN
[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087260#comment-13087260 ] jirapos...@reviews.apache.org commented on HBASE-4027: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1214/#review1525 ---
src/main/java/org/apache/hadoop/hbase/io/hfile/Cacheable.java https://reviews.apache.org/r/1214/#comment3476 Whitespace.
src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1214/#comment3477 Whitespace.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java https://reviews.apache.org/r/1214/#comment3478 Auto-formatter weirdness.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java https://reviews.apache.org/r/1214/#comment3480 This and all other formatting fixes should probably be in a separate JIRA. Filed.
- Li
On 2011-08-18 08:31:59, Li Pi wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/1214/ bq. --- bq. bq. (Updated 2011-08-18 08:31:59) bq. bq. bq. Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, Jonathan Gray, and Li Pi. bq. bq. bq. Summary bq. --- bq. bq. Review request - I apparently can't edit tlipcon's earlier posting of my diff, so creating a new one. bq. bq. bq. This addresses bug HBase-4027. bq. https://issues.apache.org/jira/browse/HBase-4027 bq. bq. bq. Diffs bq. - bq. 
bq.conf/hbase-env.sh 2d55d27 bq.src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java 2d4002c bq.src/main/java/org/apache/hadoop/hbase/io/hfile/CacheStats.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/io/hfile/Cacheable.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/io/hfile/CachedBlock.java 3b130d8 bq.src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java 097dc50 bq.src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java 1338453 bq.src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java 886c31d bq.src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/io/hfile/slab/Slab.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java PRE-CREATION bq. src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabItemEvictionWatcher.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java e2c6c93 bq.src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 7b7bf73 bq.src/main/java/org/apache/hadoop/hbase/util/DirectMemoryUtils.java PRE-CREATION bq.src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java PRE-CREATION bq.src/test/java/org/apache/hadoop/hbase/io/hfile/TestCachedBlockQueue.java 1ad2ece bq.src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java f0a9832 bq. src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSingleSizeCache.java PRE-CREATION bq.src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlab.java PRE-CREATION bq.src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlabCache.java PRE-CREATION bq.src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStoreLAB.java d7e43a0 bq.src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 4387170 bq. bq. Diff: https://reviews.apache.org/r/1214/diff bq. bq. bq. Testing bq. --- bq. 
bq. Ran benchmarks against it in HBase standalone mode. Wrote test cases for all classes, multithreaded test cases exist for the cache. bq. bq. bq. Thanks, bq. bq. Li bq. bq. Enable direct byte buffers LruBlockCache Key: HBASE-4027 URL: https://issues.apache.org/jira/browse/HBASE-4027 Project: HBase Issue Type: Improvement Reporter: Jason Rutherglen Assignee: Li Pi Priority: Minor Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, HBase-4027.pdf, HBase4027v8.diff, HBase4027v9.diff, hbase-4027-v10.5.diff, hbase-4027-v10.diff, hbase-4027v10.6.diff, hbase-4027v6.diff, hbase4027v11.5.diff, hbase4027v11.6.diff, hbase4027v11.7.diff, hbase4027v11.diff, hbase4027v12.1.diff, hbase4027v12.diff, slabcachepatch.diff, slabcachepatchv2.diff, slabcachepatchv3.1.diff, slabcachepatchv3.2.diff,
[jira] [Commented] (HBASE-4008) Problem while stopping HBase
[ https://issues.apache.org/jira/browse/HBASE-4008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087259#comment-13087259 ] stack commented on HBASE-4008: -- Akash: My motivation is keeping the public API as narrow as possible. I'll make it protected on commit, and then if someone else makes a good argument that they need it public, we'll deal with it then. Does this patch work for you? If so, I'll commit w/ the above change. Problem while stopping HBase Key: HBASE-4008 URL: https://issues.apache.org/jira/browse/HBASE-4008 Project: HBase Issue Type: Bug Components: scripts Reporter: Akash Ashok Assignee: Akash Ashok Labels: HMaster Fix For: 0.92.0 Attachments: HBase-4008-v2.patch, HBase-4008.patch stop-hbase.sh stops the server successfully if and only if the server is instantiated properly. When you run start-hbase.sh; sleep 10; stop-hbase.sh; (this works totally fine and has no issues). Whereas when you run start-hbase.sh; stop-hbase.sh; (this never stops the server, and the server never gets initialized and started properly). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4226) HFileBlock.java style cleanup.
[ https://issues.apache.org/jira/browse/HBASE-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Pi updated HBASE-4226: - Attachment: hbase-4226.diff HFileBlock.java style cleanup. -- Key: HBASE-4226 URL: https://issues.apache.org/jira/browse/HBASE-4226 Project: HBase Issue Type: Improvement Reporter: Li Pi Assignee: Li Pi Priority: Trivial Attachments: hbase-4226.diff Just a simple style cleanup of HFileBlock.java. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4226) HFileBlock.java style cleanup.
[ https://issues.apache.org/jira/browse/HBASE-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Pi updated HBASE-4226: - Status: Patch Available (was: Open) just a simple code cleanup. HFileBlock.java style cleanup. -- Key: HBASE-4226 URL: https://issues.apache.org/jira/browse/HBASE-4226 Project: HBase Issue Type: Improvement Reporter: Li Pi Assignee: Li Pi Priority: Trivial Attachments: hbase-4226.diff Just a simple style cleanup of HFileBlock.java. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4226) HFileBlock.java style cleanup.
[ https://issues.apache.org/jira/browse/HBASE-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087298#comment-13087298 ] Todd Lipcon commented on HBASE-4226: A lot of these changes actually make it harder to read. I'm all for fixing places where we have incorrect style (e.g. braces on the wrong line) but I don't think it's worth rewrapping existing code that fits the style guidelines (e.g. a lot of the javadoc changes are just re-justifying). Some examples:
-if (uncompressedSizeWithoutHeader !=
-uncompressedBytesWithHeader.length - HEADER_SIZE) {
+if (uncompressedSizeWithoutHeader != uncompressedBytesWithHeader.length
+- HEADER_SIZE) {
(harder to read after)
-InputStream bufferedBoundedStream = createBufferedBoundedStream(
-offset, onDiskSize, pread);
+InputStream bufferedBoundedStream = createBufferedBoundedStream(offset,
+onDiskSize, pread);
IMO the original was better.
-ByteBuffer headerBuf = prefetchedHeader.offset == offset ?
-prefetchedHeader.buf : null;
+ByteBuffer headerBuf = prefetchedHeader.offset == offset ? prefetchedHeader.buf
+: null;
Same here.
HFileBlock.java style cleanup. -- Key: HBASE-4226 URL: https://issues.apache.org/jira/browse/HBASE-4226 Project: HBase Issue Type: Improvement Reporter: Li Pi Assignee: Li Pi Priority: Trivial Attachments: hbase-4226.diff Just a simple style cleanup of HFileBlock.java. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4199) blockCache summary - backend
[ https://issues.apache.org/jira/browse/HBASE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087302#comment-13087302 ] stack commented on HBASE-4199: -- Here's some feedback. This is very nice functionality. Thanks for adding it.
The constructor that takes no args has empty javadoc. Remove it or fill it in (perhaps note that this constructor is for deserializing only).
No need to assign to table and columnfamily on declaration? Let an NPE out if this class is ill-initialized.
You are inconsistent w/ lines separating methods. Looks sloppy. (Usually we put a single line between methods.)
What's up with this BlockCacheSummaryEntry? Is it a summary or is it an entry? If a summary, it's odd that I can set things like blocks and heapSize post construction, and even table and columnfamily. Maybe this should be an immutable object where you pass in all variables on construction? I suppose it's a summary for the passed table and cf? Hmm. That's what you say in the class javadoc, so strike the above rumination on the class name (the comment on whether it should be immutable stands).
+this.heapSize = this.heapSize + heapSize; is usually written this.heapSize += heapSize... but no biggie.
In the below
{code}
+ public void readFields(DataInput arg0) throws IOException {
{code}
'arg0' is a bad name for a param (same on the write).
In hashCode you do:
{code}
+final int prime = 31;
+int result = 1;
+result = prime * result ...
{code}
Why the 'int result = 1' at all? Why not 'int result = prime + '
I like your equals implementation.
In your toString, why include the name of this class?
Should we throw an exception if the path does not have four elements in it?
Would suggest you move the stuff you have in comments down in createFromStoreFilePath up into the javadoc; it would help clarify the kind of path we are expecting.
You are inconsistent with spacings below:
{code}
+ String table = s[ s.length - 4]; // 4th from the end
+ String cf = s[ s.length - 2]; // 2nd from the end
{code}
Be careful w/ line lengths. Usually 80.
How often is getBlockCacheSummary called? For each invocation we inspect everything under hbase.rootdir? This seems like a pretty costly operation.
Looking at getTableStoreFilePathMap, we should make a storefiles filter. Seems like it'd get used more than once (but that's for another JIRA).
This comment looks wrong? 'Under each of these, should be one file only.'
Why do this:
{code}
+BlockCacheSummaryEntry e2 = new BlockCacheSummaryEntry();
+e2.setTable(e.getTable());
+e2.setColumnFamily(e.getColumnFamily());
+return e2;
{code}
Why not do return new BlockCacheSummaryEntry(e.getTable(), e.getColumnFamily());
The below should be a Set?
{code}
+Map<BlockCacheSummaryEntry, BlockCacheSummaryEntry> bcs =
+ new HashMap<BlockCacheSummaryEntry, BlockCacheSummaryEntry>();
{code}
Or, strike that... I see how you are using it. It's a little unusual that equality is on only two of the data members. Nothing wrong with this, but I'd call it out in the class comment for this class: two instances compare as equal if table+cf agree (though counts differ).
I like doing if (s.length <= 0) continue rather than below.
{code}
+ if (s.length > 0) {
{code}
Advantage of the former is that you save an indentation. Similar for the if (path != null)... I'd rather do if (path == null) continue;
In the javadoc you should say that getBlockCacheSummary returns a list sorted by table name + cf (or do you not want to have the sort in the contract for this method?)
Why do this:
{code}
+BlockCacheSummaryEntry[] ar = new BlockCacheSummaryEntry[list.size()];
+for (int i = 0; i < list.size(); i++) {
+ ar[i]=list.get(i);
+}
+return ar;
{code}
Why not return the List?
(In future, doing something like above, you can use http://download.oracle.com/javase/6/docs/api/java/util/List.html#toArray(T[]))
I see now that this is a feature used with low frequency, so the fact that it is heavyweight should be fine. You might add to the javadoc, though, that it includes a scan of the fs.
Why the double javadoc? You have 'Performs block cache summary' but then you also have @Override (we will pick up the javadoc from the interface, so the extra stuff in here from HRegionServer is not needed).
Tests look good. Just remove the below:
{code}
+ public void setUp() {
+ }
{code}
Remove javadoc w/ nothing in it. Looks bad.
Good stuff Doug.
blockCache summary - backend Key: HBASE-4199 URL: https://issues.apache.org/jira/browse/HBASE-4199 Project: HBase Issue Type: Sub-task Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: java_HBASE_4199.patch, java_HBASE_4199_v2.patch, java_HBASE_4199_v3.patch This is the backend work for the
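Several points from the HBASE-4199 review above (an immutable entry, equality on table + column family only, and List.toArray(T[]) instead of a copy loop) can be sketched together. This is only an illustration under assumptions: SummaryEntry and its members are hypothetical names, it is not the BlockCacheSummaryEntry from the patch, and it leaves out the Writable serialization the real class needs. java.util.Objects requires Java 7 or later.

```java
import java.util.List;
import java.util.Objects;

/** Hypothetical, simplified entry illustrating the review points; not the patch's class. */
final class SummaryEntry {
  private final String table;
  private final String columnFamily;
  private final int blocks;
  private final long heapSize;

  SummaryEntry(String table, String columnFamily, int blocks, long heapSize) {
    // Fail fast on an ill-initialized entry instead of defaulting the fields.
    this.table = Objects.requireNonNull(table);
    this.columnFamily = Objects.requireNonNull(columnFamily);
    this.blocks = blocks;
    this.heapSize = heapSize;
  }

  /** Returns a new entry with one more block counted; this instance is unchanged. */
  SummaryEntry plusBlock(long blockHeapSize) {
    return new SummaryEntry(table, columnFamily, blocks + 1, heapSize + blockHeapSize);
  }

  int getBlocks() { return blocks; }

  long getHeapSize() { return heapSize; }

  // Equality deliberately covers table + column family only; counts may differ.
  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof SummaryEntry)) return false;
    SummaryEntry other = (SummaryEntry) o;
    return table.equals(other.table) && columnFamily.equals(other.columnFamily);
  }

  @Override
  public int hashCode() {
    return Objects.hash(table, columnFamily);
  }

  /** Converts a list to an array with List.toArray(T[]) rather than a manual loop. */
  static SummaryEntry[] toArray(List<SummaryEntry> list) {
    return list.toArray(new SummaryEntry[list.size()]);
  }
}
```

Because equality ignores the counts, an entry can still serve as a lookup key for its own running total, which is how the review describes the Map being used.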
[jira] [Commented] (HBASE-4215) RS requestsPerSecond counter seems to be off
[ https://issues.apache.org/jira/browse/HBASE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087313#comment-13087313 ] stack commented on HBASE-4215: -- +1 on changing HSL to do float. RS requestsPerSecond counter seems to be off Key: HBASE-4215 URL: https://issues.apache.org/jira/browse/HBASE-4215 Project: HBase Issue Type: Bug Components: metrics Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.92.0 In testing trunk, I had YCSB reporting some 40,000 requests/second, but the summary info on the master webpage was consistently indicating somewhere around 3x that. I'm guessing that we may have a bug where we forgot to divide by time. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4215) RS requestsPerSecond counter seems to be off
[ https://issues.apache.org/jira/browse/HBASE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087314#comment-13087314 ] stack commented on HBASE-4215: -- There should be no migration issue since these are transient objects. RS requestsPerSecond counter seems to be off Key: HBASE-4215 URL: https://issues.apache.org/jira/browse/HBASE-4215 Project: HBase Issue Type: Bug Components: metrics Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.92.0 In testing trunk, I had YCSB reporting some 40,000 requests/second, but the summary info on the master webpage was consistently indicating somewhere around 3x that. I'm guessing that we may have a bug where we forgot to divide by time. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
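The "forgot to divide by time" guess in the issue description, together with the +1 on reporting a float, can be illustrated with a small sketch. The class RequestRate and the method perSecond are hypothetical names, not HBase metrics code, and the interval used in the usage note is an arbitrary illustrative value rather than anything stated in the issue.

```java
/** Hypothetical helper illustrating the suspected fix; not HBase metrics code. */
final class RequestRate {
  private RequestRate() {}

  /**
   * Converts a request count gathered over an interval into requests per second.
   * Returns a float so that rates below one request per second are not truncated to 0.
   */
  static float perSecond(long requestCount, long intervalMillis) {
    if (intervalMillis <= 0) {
      return 0f; // nothing sensible to report for an empty interval
    }
    return requestCount * 1000f / intervalMillis;
  }
}
```

For example, RequestRate.perSecond(120000, 3000) gives 40000.0, whereas reporting the raw count of 120000 for that interval would overstate the rate threefold.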
[jira] [Commented] (HBASE-4175) Fix FSUtils.createTableDescriptor()
[ https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087325#comment-13087325 ] Hudson commented on HBASE-4175: --- Integrated in HBase-TRUNK #2124 (See [https://builds.apache.org/job/HBase-TRUNK/2124/]) HBASE-4175 Fix FSUtils.createTableDescriptor() tedyu : Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestFSTableDescriptorForceCreation.java
Fix FSUtils.createTableDescriptor() --- Key: HBASE-4175 URL: https://issues.apache.org/jira/browse/HBASE-4175 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Ted Yu Assignee: ramkrishna.s.vasudevan Attachments: HBASE-4175.patch, HBASE-4175_1.patch, HBASE-4175_2_with catch block.patch, HBASE-4175_2_without catch block.patch, HBASE-4175_3.patch Currently createTableDescriptor() doesn't return anything. The caller wouldn't know whether the descriptor is created or not. See exception handling:
{code}
} catch(IOException ioe) {
  LOG.info("IOException while trying to create tableInfo in HDFS", ioe);
}
{code}
We should return a boolean. If the table descriptor exists already, maybe we should deserialize from hdfs and compare with htableDescriptor argument. If they differ, I am not sure what the proper action would be. Maybe we can add a boolean argument, force, to createTableDescriptor(). When force is true, existing table descriptor would be overwritten. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
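The proposal in the HBASE-4175 description (return a boolean, and add a force argument that overwrites an existing descriptor) can be sketched against the local filesystem. This is an analogue under assumptions, not the FSUtils change that was committed: DescriptorWriter is a hypothetical class, it uses java.nio.file rather than the Hadoop FileSystem API, and a plain String stands in for HTableDescriptor.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical local-filesystem analogue of the proposal; not the FSUtils code. */
final class DescriptorWriter {
  private DescriptorWriter() {}

  /**
   * Writes the descriptor file and reports whether anything was written.
   *
   * @param force when true, an existing descriptor is overwritten
   * @return true if the file was created or overwritten, false if it already existed
   */
  static boolean createTableDescriptor(Path file, String descriptor, boolean force)
      throws IOException {
    if (Files.exists(file) && !force) {
      return false; // the caller now knows nothing was written
    }
    // Default options create the file or truncate an existing one.
    Files.write(file, descriptor.getBytes(StandardCharsets.UTF_8));
    return true;
  }
}
```

Letting the IOException propagate, instead of logging and swallowing it as in the quoted catch block, is a second design choice in this sketch; the issue text itself only asks for the boolean return and the force flag.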
[jira] [Commented] (HBASE-4226) HFileBlock.java style cleanup.
[ https://issues.apache.org/jira/browse/HBASE-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087337#comment-13087337 ] Li Pi commented on HBASE-4226: -- Gotcha. Will fix. HFileBlock.java style cleanup. -- Key: HBASE-4226 URL: https://issues.apache.org/jira/browse/HBASE-4226 Project: HBase Issue Type: Improvement Reporter: Li Pi Assignee: Li Pi Priority: Trivial Attachments: hbase-4226.diff Just a simple style cleanup of HFileBlock.java. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3752) Tool to replay moved-aside WAL logs
[ https://issues.apache.org/jira/browse/HBASE-3752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-3752: - Attachment: walplayer.rb Here is another version that uses replication replay of edits code instead. Testing (Don't worry Todd, will redo as a java once I've figured what works). Tool to replay moved-aside WAL logs --- Key: HBASE-3752 URL: https://issues.apache.org/jira/browse/HBASE-3752 Project: HBase Issue Type: Task Reporter: stack Priority: Critical Attachments: walplayer.rb, walplayer.rb We need this. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)
[ https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087369#comment-13087369 ] stack commented on HBASE-4213: --
bq. If number of RS who have processed the schema change is >= number of active RS at this moment then master presumes that all RS have acknowledged the schema change and hence goes ahead with delete operation.
If I add ten regionservers to a cluster between the start of a schema change and its completion, could 'if (servers != null && servers.size() >= rsCount)' never trigger?
Just remove this:
{code}
 # Table should be disabled
- raise(ArgumentError, "Table #{table_name} is enabled. Disable it first before altering.") if enabled?(table_name)
+ # raise(ArgumentError, "Table #{table_name} is enabled. Disable it first before altering.") if enabled?(table_name)
{code}
... including the comment.
You should log the exception... add it as an arg to the below:
{code}
+ } catch (KeeperException e) {
+LOG.warn("Instant schema change failed for table " + tableName );
+ }
{code}
You do this:
{code}
+this.schemaChangeTracker = new MasterSchemaChangeTracker(getZooKeeper(),
+this.zooKeeper.schemaZNode, this);
{code}
Do you have to pass the schema znode? Can you not ask the object returned by getZooKeeper for schemaZNode, so you only need two args in this method?
What's this?
{code}
+ public void checkTableModifiable(final byte [] tableName,
+ EventHandler.EventType eventType) throws IOException {
+preCheckTableModifiable(tableName);
+if (!eventType.isSchemaChangeEvent()) {
+ if (!getAssignmentManager().getZKTable().
+ isDisabledTable(Bytes.toString(tableName))) {
+throw new TableNotDisabledException(tableName);
+ }
+}
+ }
{code}
What's the is-disabled check for? Do we need this any more? Oh, I see... it's needed if it's NOT a schema change event.
Javadoc says '+ * @return true if region is opened successfully with new schema changes.' but the method returns void.
Does this need to be synchronized (you synchronize on it later in a different method): + for (HRegion hRegion : onlineRegionsOfTable) {
If a new region for this table comes in in the meantime, it'll be OK? It'll find the new schema? What happens if the master decides to balance a region at this time? Or disable it? While reopenRegions is running?
What Ted says on delete cf.
Can you look in zk or something rather than wait on a timer before moving on?
{code}
+// Take a mini nap for changes to take effect.
+try {
+ Thread.sleep(200);
+} catch (InterruptedException e) {
+}
{code}
Remove the '*' from log messages. You'll start an arms race where every log message will try to add a new type of flashing glyph so it can stand out from the crowd.
This patch looks great.
Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign) Key: HBASE-4213 URL: https://issues.apache.org/jira/browse/HBASE-4213 Project: HBase Issue Type: Improvement Reporter: Subbu M Iyer Assignee: Subbu M Iyer Fix For: 0.92.0 Attachments: HBASE-4213-Instant_schema_change.patch This Jira is a slight variation in approach to what is being done as part of https://issues.apache.org/jira/browse/HBASE-1730 Support instant schema updates such as Modify Table, Add Column, Modify Column operations: 1. Without enabling/disabling the table. 2. Without bulk unassign/assign of regions. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
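The review's request to wait on an observable condition rather than a fixed Thread.sleep(200) can be sketched with a latch and a timeout. Everything here is an assumption for illustration: SchemaChangeWaiter is a hypothetical class, and in the real patch the acknowledgements would presumably come from ZooKeeper watches rather than direct method calls.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: wait on an explicit signal with a timeout instead of a fixed sleep. */
final class SchemaChangeWaiter {
  private final CountDownLatch acks;

  SchemaChangeWaiter(int expectedAcks) {
    this.acks = new CountDownLatch(expectedAcks);
  }

  /** Called once per regionserver that has processed the change, e.g. from a watcher callback. */
  void acknowledge() {
    acks.countDown();
  }

  /** Blocks until all expected acks arrive or the timeout elapses; returns true if all arrived. */
  boolean awaitAll(long timeout, TimeUnit unit) throws InterruptedException {
    return acks.await(timeout, unit);
  }
}
```

A fixed expected count does not by itself address the other concern raised in the review, namely regionservers joining or leaving while a schema change is in flight; that would need the expected set to be tracked separately.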
[jira] [Commented] (HBASE-4071) Data GC: Remove all versions > TTL EXCEPT the last written version
[ https://issues.apache.org/jira/browse/HBASE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087375#comment-13087375 ] Lars Hofhansl commented on HBASE-4071: -- Please have a look at the review now (the same one), and let me know what you think. Data GC: Remove all versions > TTL EXCEPT the last written version -- Key: HBASE-4071 URL: https://issues.apache.org/jira/browse/HBASE-4071 Project: HBase Issue Type: New Feature Reporter: stack Attachments: MinVersions.diff We were chatting today about our backup cluster. What we want is to be able to restore the dataset from any point of time but only within a limited timeframe -- say one week. Thereafter, if the versions are older than one week, rather than as we do with TTL where we let go of all versions older than TTL, instead, let go of all versions EXCEPT the last one written. So, it's like versions==1 when TTL > one week. We want to allow that if an error is caught within a week of its happening -- user mistakenly removes a critical table -- then we'll be able to restore up to the moment just before catastrophe hit; otherwise, we keep one version only. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4071) Data GC: Remove all versions > TTL EXCEPT the last written version
[ https://issues.apache.org/jira/browse/HBASE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087381#comment-13087381 ] jirapos...@reviews.apache.org commented on HBASE-4071: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1582/ --- (Updated 2011-08-18 23:01:12.689967) Review request for hbase, Todd Lipcon, Michael Stack, and Ian Varley. Summary (updated) --- Allow enforcing a minimum number of versions when TTL is enabled for a store. The GC logic for both versions and TTL is unified inside the ColumnTrackers. This addresses bug HBASE-4071. https://issues.apache.org/jira/browse/HBASE-4071 Diffs -
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1159317
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1159317
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1159317
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1159317
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1159317
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1159317
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1159317
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1159317
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1159317
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1159317 Diff: https://reviews.apache.org/r/1582/diff Testing --- Ran all tests. I get error (not failures) from two: TestDistributedLogSplitting and TestHTablePool. Both fail with or without my changes. New tests: TestMinVersions. Thanks, Lars Data GC: Remove all versions TTL EXCEPT the last written version -- Key: HBASE-4071 URL: https://issues.apache.org/jira/browse/HBASE-4071 Project: HBase Issue Type: New Feature Reporter: stack Attachments: MinVersions.diff We were chatting today about our backup cluster. What we want is to be able to restore the dataset from any point of time but only within a limited timeframe -- say one week. Thereafter, if the versions are older than one week, rather than as we do with TTL where we let go of all versions older than TTL, instead, let go of all versions EXCEPT the last one written. So, its like versions==1 when TTL one week. We want to allow that if an error is caught within a week of its happening -- user mistakenly removes a critical table -- then we'll be able to restore up the the moment just before catastrophe hit otherwise, we keep one version only. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
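The retention rule this review request is after (a minimum number of versions that survives TTL expiry) can be illustrated with a small standalone sketch. This is not the ColumnTracker code from the patch: the class name, the method name and the parameters are hypothetical, and the rule is modeled on a plain list of timestamps for a single column, assuming the list is ordered newest first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; the patch itself works inside the ColumnTrackers.
final class MinVersionsSketch {

    /**
     * Decides which versions of one column survive garbage collection.
     * timestampsNewestFirst is assumed to be sorted newest first.
     */
    public static List<Long> retain(List<Long> timestampsNewestFirst, long now,
                                    long ttlMillis, int minVersions, int maxVersions) {
        List<Long> kept = new ArrayList<>();
        for (long ts : timestampsNewestFirst) {
            if (kept.size() >= maxVersions) {
                break; // the ordinary version limit still applies
            }
            boolean expired = ts < now - ttlMillis;
            // An expired version survives only while fewer than minVersions are kept.
            if (!expired || kept.size() < minVersions) {
                kept.add(ts);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        long now = 1_000L;
        long ttl = 500L;
        // Two versions are still within the TTL, so every expired one is dropped.
        System.out.println(retain(Arrays.asList(900L, 800L, 300L, 200L), now, ttl, 1, 10));
        // All versions are expired; the newest is kept because minVersions is 1.
        System.out.println(retain(Arrays.asList(300L, 200L, 100L), now, ttl, 1, 10));
    }
}
```

With minVersions set to 1 and a TTL of one week, this gives the behaviour asked for in the issue: full history inside the week, and only the last written version after it.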
[jira] [Commented] (HBASE-4071) Data GC: Remove all versions TTL EXCEPT the last written version
[ https://issues.apache.org/jira/browse/HBASE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087379#comment-13087379 ] jirapos...@reviews.apache.org commented on HBASE-4071: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1582/ --- (Updated 2011-08-18 22:58:25.647922) Review request for hbase and Ian Varley. Summary (updated) --- Allow enforcing a minimum number of versions when TTL is enabled for a store. The GC logic for both versions and TTL is unified inside the ColumnTrackers. This addresses bug HBASE-4071. https://issues.apache.org/jira/browse/HBASE-4071 Diffs (updated) - http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1159317 Diff: https://reviews.apache.org/r/1582/diff Testing (updated) --- Ran all tests. I get errors (not failures) from two: TestDistributedLogSplitting and TestHTablePool. Both fail with or without my changes. New tests: TestMinVersions. Thanks, Lars
[jira] [Commented] (HBASE-4071) Data GC: Remove all versions TTL EXCEPT the last written version
[ https://issues.apache.org/jira/browse/HBASE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087380#comment-13087380 ] jirapos...@reviews.apache.org commented on HBASE-4071: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1582/ --- (Updated 2011-08-18 22:59:37.165069) Review request for hbase, Todd Lipcon, Michael Stack, and Ian Varley. Summary --- Allow enforcing a minimum number of versions when TTL is enabled for a store. The GC logic for both versions and TTL is unified inside the ColumnTrackers. This addresses bug HBASE-4071. https://issues.apache.org/jira/browse/HBASE-4071 Diffs - http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1159317 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1159317 Diff: https://reviews.apache.org/r/1582/diff Testing --- Ran all tests. I get errors (not failures) from two: TestDistributedLogSplitting and TestHTablePool. Both fail with or without my changes. New tests: TestMinVersions. Thanks, Lars
[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)
[ https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087383#comment-13087383 ] Subbu M Iyer commented on HBASE-4213: - Ted, yes, you are absolutely right, and very good catch. The second check should be:
    int rsCount = ZKUtil.getNumberOfChildren(this.watcher, watcher.rsZNode);
instead of:
    int rsCount = ZKUtil.getNumberOfChildren(this.watcher, path);
I have taken care of this. Thanks, Subbu Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign) Key: HBASE-4213 URL: https://issues.apache.org/jira/browse/HBASE-4213 Project: HBase Issue Type: Improvement Reporter: Subbu M Iyer Assignee: Subbu M Iyer Fix For: 0.92.0 Attachments: HBASE-4213-Instant_schema_change.patch This Jira is a slight variation in approach to what is being done as part of https://issues.apache.org/jira/browse/HBASE-1730 Support instant schema updates such as Modify Table, Add Column, Modify Column operations: 1. Without enabling/disabling the table. 2. Without bulk unassign/assign of regions. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4071) Data GC: Remove all versions TTL EXCEPT the last written version
[ https://issues.apache.org/jira/browse/HBASE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087384#comment-13087384 ] jirapos...@reviews.apache.org commented on HBASE-4071: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1582/#review1536 --- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java https://reviews.apache.org/r/1582/#comment3495 Will fix the white space. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java https://reviews.apache.org/r/1582/#comment3496 This means that expired rows are not removed on flush when minVersions is set. That is because at this point we do not have enough information. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java https://reviews.apache.org/r/1582/#comment3498 Note the interface change here. A KeyValue could have a timestamp, in which case we'd look for that particular version. No caller used this, and passing byte[] avoids that problem completely. - Lars
[jira] [Updated] (HBASE-4226) HFileBlock.java style cleanup.
[ https://issues.apache.org/jira/browse/HBASE-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Pi updated HBASE-4226: - Attachment: hbase-4226v2.diff far more conservative cleanup. HFileBlock.java style cleanup. -- Key: HBASE-4226 URL: https://issues.apache.org/jira/browse/HBASE-4226 Project: HBase Issue Type: Improvement Reporter: Li Pi Assignee: Li Pi Priority: Trivial Attachments: hbase-4226.diff, hbase-4226v2.diff Just a simple style cleanup of HFileBlock.java. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4071) Data GC: Remove all versions TTL EXCEPT the last written version
[ https://issues.apache.org/jira/browse/HBASE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087388#comment-13087388 ] jirapos...@reviews.apache.org commented on HBASE-4071: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1582/#review1537 --- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java https://reviews.apache.org/r/1582/#comment3503 HRegion.getClosestRowBefore() currently calls this method. - Ted
[jira] [Commented] (HBASE-1730) Near-instantaneous online schema and table state updates
[ https://issues.apache.org/jira/browse/HBASE-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087387#comment-13087387 ] jirapos...@reviews.apache.org commented on HBASE-1730: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1479/#review1535 --- I reviewed about half. Nice patch. Thank you. One feature that hbase-4213 has is that the state of the alter is tracked up in zk. This patch doesn't do this, but going via zk would seem to make the transaction more robust against failure of the master. What happens in this patch if the master is shot in the head midway through the alter? What do you think, Nileema? src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java https://reviews.apache.org/r/1479/#comment3489 White space. src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java https://reviews.apache.org/r/1479/#comment3490 Why catch the exception just to rethrow it? src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java https://reviews.apache.org/r/1479/#comment3491 White space. src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java https://reviews.apache.org/r/1479/#comment3492 You are adding white space. src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java https://reviews.apache.org/r/1479/#comment3493 Javadoc does not agree with what is actually being returned. src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java https://reviews.apache.org/r/1479/#comment3494 HSI is deprecated. src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java https://reviews.apache.org/r/1479/#comment3497 White space. src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java https://reviews.apache.org/r/1479/#comment3499 Why are we waiting between unassigns? To lessen load on cluster? 
src/main/java/org/apache/hadoop/hbase/master/HMaster.java https://reviews.apache.org/r/1479/#comment3500 Is there anything in here to stop someone disabling a table or preventing the load balancer moving regions while this is all going on? src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java https://reviews.apache.org/r/1479/#comment3501 This looks like you are fixing a bug we have in hbase? src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java https://reviews.apache.org/r/1479/#comment3502 Missing curly braces - Michael On 2011-08-12 06:14:21, Nileema Shingte wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/1479/ bq. --- bq. bq. (Updated 2011-08-12 06:14:21) bq. bq. bq. Review request for Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray. bq. bq. bq. Summary bq. --- bq. bq. When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes: bq. bq. 1. Changes to reopen the regions when any of the above operations are performed. bq. 2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it. bq. 3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time. bq. 4. alter command in the hbase shell will block until all the regions are updated, providing a status x/y regions updated every second. bq. 5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status. bq. 6. alter_status table_name which is a sync call and blocks to provide the x/y regions updated status per second until all regions are updated. bq. 7. modification in the unit test for enabling alter without disabling the table. bq. bq. bq. This addresses bug HBASE-1730. bq. 
https://issues.apache.org/jira/browse/HBASE-1730 bq. bq. bq. Diffs bq. - bq. bq.src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java f151c77 bq.src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 13c8b8c bq.src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024 bq.src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c bq.src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/master/HMaster.java 8beeb68 bq.src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 57c1140 bq. src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837
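Point 3 of the quoted summary (close only a configurable number of regions per region server at a time) and the x/y progress report of point 4 can be sketched in isolation as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, regions and servers are plain strings, and nothing here is taken from BulkReOpen.java or AssignmentManager.java in the patch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of per-server throttling; not the patch's code.
final class ReopenThrottleSketch {

    /**
     * Removes and returns at most maxPerServer pending regions for each server.
     * Servers with nothing left to reopen are omitted from the result.
     */
    public static Map<String, List<String>> nextBatch(
            Map<String, Deque<String>> pendingByServer, int maxPerServer) {
        Map<String, List<String>> batch = new TreeMap<>();
        for (Map.Entry<String, Deque<String>> entry : pendingByServer.entrySet()) {
            Deque<String> pending = entry.getValue();
            List<String> picked = new ArrayList<>();
            while (picked.size() < maxPerServer && !pending.isEmpty()) {
                picked.add(pending.poll());
            }
            if (!picked.isEmpty()) {
                batch.put(entry.getKey(), picked);
            }
        }
        return batch;
    }

    public static void main(String[] args) {
        Map<String, Deque<String>> pending = new TreeMap<>();
        pending.put("rs1", new ArrayDeque<>(Arrays.asList("r1", "r2", "r3")));
        pending.put("rs2", new ArrayDeque<>(Arrays.asList("r4")));
        int total = 4;
        int done = 0;
        Map<String, List<String>> batch;
        // Each round reopens at most two regions per server, then reports progress.
        while (!(batch = nextBatch(pending, 2)).isEmpty()) {
            for (List<String> regions : batch.values()) {
                done += regions.size();
            }
            System.out.println(done + "/" + total + " regions updated");
        }
    }
}
```

A real implementation would wait for each batch to finish reopening before asking for the next one; the loop above only shows how the per-server cap bounds the work in flight.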
[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)
[ https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087393#comment-13087393 ] Ted Yu commented on HBASE-4213: --- @Subbu: Glad I was able to help. I think nothing is absolute. I believe in relativity :-)
[jira] [Commented] (HBASE-4071) Data GC: Remove all versions TTL EXCEPT the last written version
[ https://issues.apache.org/jira/browse/HBASE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087400#comment-13087400 ] jirapos...@reviews.apache.org commented on HBASE-4071: -- bq. On 2011-08-18 23:15:00, Ted Yu wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 1277 bq. https://reviews.apache.org/r/1582/diff/2/?file=33571#file33571line1277 bq. bq. HRegion.getClosestRowBefore() currently calls this method. Sorry, what I meant to say is that no caller makes use of passing a backdated KeyValue to this method. Changing the signature here makes sure that nobody will. The caller in HRegion is also changed as part of this patch. - Lars --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1582/#review1537 ---
[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)
[ https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087406#comment-13087406 ] Ted Yu commented on HBASE-4213: --- One important decision we need to make about this feature is whether we support simultaneous schema changes for more than one table. If it is supported, the master needs to maintain a set of region servers per table, captured at the time the schema change for that table starts. This way, the master knows when the change can be deemed completed.
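The bookkeeping Ted describes (one set of region servers per table, captured when that table's schema change starts) might look roughly like the sketch below. It is only an illustration: the class, its methods and the string identifiers are hypothetical, and it ignores ZooKeeper, server failures and master failover, all of which a real implementation would have to handle.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of per-table completion tracking on the master.
final class SchemaChangeTrackerSketch {

    // For each table with a change in flight: servers that have not yet confirmed it.
    private final Map<String, Set<String>> pendingByTable = new HashMap<>();

    /** Records the servers that must confirm the change for the given table. */
    public void start(String table, Collection<String> serversAtStart) {
        if (!serversAtStart.isEmpty()) {
            pendingByTable.put(table, new HashSet<>(serversAtStart));
        }
    }

    /** Returns true only when this confirmation completes the change for the table. */
    public boolean acknowledge(String table, String server) {
        Set<String> pending = pendingByTable.get(table);
        if (pending == null) {
            return false; // no change in flight for this table
        }
        pending.remove(server);
        if (pending.isEmpty()) {
            pendingByTable.remove(table);
            return true;
        }
        return false;
    }

    public boolean inProgress(String table) {
        return pendingByTable.containsKey(table);
    }

    public static void main(String[] args) {
        SchemaChangeTrackerSketch tracker = new SchemaChangeTrackerSketch();
        // Two tables change at the same time, each with its own server set.
        tracker.start("t1", Arrays.asList("rs1", "rs2"));
        tracker.start("t2", Arrays.asList("rs2"));
        System.out.println(tracker.acknowledge("t2", "rs2")); // t2 is complete
        System.out.println(tracker.acknowledge("t1", "rs1")); // t1 still waits for rs2
        System.out.println(tracker.inProgress("t1"));
    }
}
```

Keeping a separate set per table is what lets two schema changes overlap without one table's confirmations being counted against the other.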
[jira] [Commented] (HBASE-4145) Provide metrics for hbase client
[ https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087418#comment-13087418 ] Ming Ma commented on HBASE-4145: Ah, thanks for pointing out this, Stack. We can use this for #3. The ClientScanner will call scan.setAttribute with well-defined metrics property names. TableInputFormat will call scan.getAttribute to access the metrics values and pass onto MapReduce framework as counters. Provide metrics for hbase client Key: HBASE-4145 URL: https://issues.apache.org/jira/browse/HBASE-4145 Project: HBase Issue Type: Improvement Reporter: Ming Ma Assignee: Ming Ma Sometimes it is useful to get some metrics from hbase client point of view. This will help understand the metrics for scan/TableInputFormat map job scenario. What to capture, for example, for each ResultScanner object, 1. The number of RPC calls to RSs. 2. The delta time between consecutive RPC calls in the current serialized scan implementation. 3. The number of RPC retry to RSs. 4. The number of NotServingRegionException got. 5. The number of remote RPC calls. This excludes those call that hbase client calls the RS on the same machine. 6. The number of regions accessed. How to capture 1. Metrics framework works for a fixed number of metrics. It doesn't fit this scenario. 2. Use some TBD solution in HBase to capture such dynamic metrics. If we assume there is a solution in HBase that HBase client can use to log such kind of metrics, TableInputFormat can pass in mapreduce task ID as application scan ID to HBase client as small addition to existing scan API; and HBase client can log metrics accordingly with such ID. That will allow query, analysis later on the metrics data for specific map reduce job. 3. Expose via MapReduce counter. 
This approach lacks certain features: for example, there is no good way to access the metrics per map instance, and the MapReduce framework only sums counter values, so it is tricky to find the maximum of a given metric across all mapper instances. However, it might be good enough for now. With this approach, the metric values will be available via MapReduce counters. a) Have ResultScanner return a new ResultScannerMetrics interface. b) TableInputFormat will access data from ResultScannerMetrics and populate MapReduce counters accordingly.
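The hand-off described in the comment (scanner writes metrics as named attributes, TableInputFormat reads them back and feeds counters) could look roughly like the sketch below. Everything here is an assumption made for illustration: `AttributeBag` is a stand-in for the attribute map on the scan object, the property names are invented, and a plain `Map` plays the role of the MapReduce counters.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Stand-in for the attribute map carried by a scan; hypothetical, for illustration only.
class AttributeBag {
  private final Map<String, byte[]> attrs = new HashMap<String, byte[]>();
  void setAttribute(String name, byte[] value) { attrs.put(name, value); }
  byte[] getAttribute(String name) { return attrs.get(name); }
}

class ScanMetricsSketch {
  // Hypothetical property names; the thread only says they should be well defined.
  static final String RPC_CALLS = "scan.metrics.rpcCalls";
  static final String RPC_RETRIES = "scan.metrics.rpcRetries";

  // Scanner side: publish a metric as an 8-byte big-endian long.
  static void publish(AttributeBag scan, String name, long value) {
    scan.setAttribute(name, ByteBuffer.allocate(Long.BYTES).putLong(value).array());
  }

  // Input-format side: read a metric back, treating a missing attribute as 0.
  static long read(AttributeBag scan, String name) {
    byte[] raw = scan.getAttribute(name);
    return raw == null ? 0L : ByteBuffer.wrap(raw).getLong();
  }

  // Counters can only be incremented, so values from different tasks end up summed.
  static void addToCounters(AttributeBag scan, Map<String, Long> counters) {
    for (String name : new String[] {RPC_CALLS, RPC_RETRIES}) {
      counters.merge(name, read(scan, name), Long::sum);
    }
  }
}
```

Because `addToCounters` can only add, two mappers reporting 5 and 7 RPC calls yield a counter of 12; the per-mapper maximum of 7 is not recoverable, which is the limitation noted for option 3.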
[jira] [Created] (HBASE-4227) Modify the webUI so that default values of column families are not shown
Modify the webUI so that default values of column families are not shown Key: HBASE-4227 URL: https://issues.apache.org/jira/browse/HBASE-4227 Project: HBase Issue Type: Improvement Reporter: Nicolas Spiegelberg Assignee: nileema shingte Priority: Minor Fix For: 0.92.0 With the introduction of online schema changes, it will become more advantageous to put configuration knobs at the column family level vs global configuration settings. This will create a nasty web UI experience for showing table properties unless we default to showing the custom values instead of all values. It's on the table if we want to modify the shell's 'describe' method as well. scan '.META.' should definitely return the full properties however. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
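Showing only the customized values amounts to diffing a column family's properties against the defaults. A minimal sketch, assuming both are available as plain string maps (the class and method names are hypothetical, not taken from the web UI code):

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical helper: returns only the properties whose value differs from the default.
class CustomValuesSketch {
  static Map<String, String> customOnly(Map<String, String> actual,
                                        Map<String, String> defaults) {
    // TreeMap keeps the output in a stable, sorted order for display.
    Map<String, String> out = new TreeMap<String, String>();
    for (Map.Entry<String, String> e : actual.entrySet()) {
      // A property with no default at all also counts as customized.
      if (!Objects.equals(e.getValue(), defaults.get(e.getKey()))) {
        out.put(e.getKey(), e.getValue());
      }
    }
    return out;
  }
}
```

The full map stays untouched, so a caller that needs every property (as suggested for scanning '.META.') can still print `actual` directly.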
[jira] [Created] (HBASE-4228) Add a method to get a list of HLog files for a RS.
Add a method to get a list of HLog files for a RS. -- Key: HBASE-4228 URL: https://issues.apache.org/jira/browse/HBASE-4228 Project: HBase Issue Type: New Feature Components: regionserver Reporter: Madhuwanti Vaidya Assignee: Madhuwanti Vaidya Priority: Trivial -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4095) Hlog may not be rolled in a long time if checkLowReplication's request of LogRoll is blocked
[ https://issues.apache.org/jira/browse/HBASE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jieshan Bean updated HBASE-4095:
Attachment: (was: HBase-4095-V9-branch.patch)
Hlog may not be rolled in a long time if checkLowReplication's request of LogRoll is blocked
Key: HBASE-4095
URL: https://issues.apache.org/jira/browse/HBASE-4095
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.3
Reporter: Jieshan Bean
Assignee: Jieshan Bean
Fix For: 0.90.5
Attachments: HBASE-4095-90-v2.patch, HBASE-4095-90.patch, HBASE-4095-trunk-v2.patch, HBASE-4095-trunk.patch, HBase-4095-V4-Branch.patch, HBase-4095-V5-Branch.patch, HBase-4095-V5-trunk.patch, HBase-4095-V6-branch.patch, HBase-4095-V6-trunk.patch, HBase-4095-V7-branch.patch, HBase-4095-V7-trunk.patch, HBase-4095-V8-branch.patch, HBase-4095-V8-trunk.patch, HlogFileIsVeryLarge.gif, LatestLLTResults-20110810.rar, RelatedLogs2011-07-28.txt, TestResultForPatch-V4.rar, flowChart-IntroductionToThePatch.gif, hbase-root-regionserver-193-195-5-111.rar, surefire-report-V5-trunk.html, surefire-report-branch.html
Some large Hlog files (larger than 10G) appeared in our environment, and I found the reason why they got so huge:
1. The number of replicas is less than the expected number, so the method checkLowReplication is called on each sync.
2. The method checkLowReplication requests a log roll first and sets logRollRequested to true:
{noformat}
private void checkLowReplication() {
  // if the number of replicas in HDFS has fallen below the initial
  // value, then roll logs.
  try {
    int numCurrentReplicas = getLogReplication();
    if (numCurrentReplicas != 0 && numCurrentReplicas < this.initialReplication) {
      LOG.warn("HDFS pipeline error detected. " +
          "Found " + numCurrentReplicas + " replicas but expecting " +
          this.initialReplication + " replicas. " +
          " Requesting close of hlog.");
      requestLogRoll();
      logRollRequested = true;
    }
  } catch (Exception e) {
    LOG.warn("Unable to invoke DFSOutputStream.getNumCurrentReplicas" + e +
        " still proceeding ahead...");
  }
}
{noformat}
3. requestLogRoll() just submits the roll request. It may not execute in time, because it must acquire the unfair lock cacheFlushLock, and that lock may be held by the cache-flush threads.
4. logRollRequested stays true until the log roll executes, so during that time each log-roll request in sync() is skipped.
Here are the logs from when the problem happened (please notice the file size of hlog 193-195-5-111%3A20020.1309937386639 in the last row):
2011-07-06 15:28:59,284 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 2 replicas but expecting 3 replicas. Requesting close of hlog.
2011-07-06 15:29:46,714 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937339119, entries=32434, filesize=239589754. New hlog /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937386639
2011-07-06 15:29:56,929 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 2 replicas but expecting 3 replicas. Requesting close of hlog.
2011-07-06 15:29:56,933 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming flushed file at hdfs://193.195.5.112:9000/hbase/Htable_UFDR_034/a3780cf0c909d8cf8f8ed618b290cc95/.tmp/4656903854447026847 to hdfs://193.195.5.112:9000/hbase/Htable_UFDR_034/a3780cf0c909d8cf8f8ed618b290cc95/value/8603005630220380983 2011-07-06 15:29:57,391 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://193.195.5.112:9000/hbase/Htable_UFDR_034/a3780cf0c909d8cf8f8ed618b290cc95/value/8603005630220380983, entries=445880, sequenceid=248900, memsize=207.5m, filesize=130.1m 2011-07-06 15:29:57,478 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~207.5m for region Htable_UFDR_034,07664,1309936974158.a3780cf0c909d8cf8f8ed618b290cc95. in 10839ms, sequenceid=248900, compaction requested=false 2011-07-06 15:28:59,236 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309926531955, entries=216459, filesize=2370387468. New hlog /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937339119 2011-07-06 15:29:46,714 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937339119, entries=32434, filesize=239589754. New hlog
[jira] [Updated] (HBASE-4095) Hlog may not be rolled in a long time if checkLowReplication's request of LogRoll is blocked
[ https://issues.apache.org/jira/browse/HBASE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jieshan Bean updated HBASE-4095: Attachment: (was: HBase-4095-V9-trunk.patch) Hlog may not be rolled in a long time if checkLowReplication's request of LogRoll is blocked Key: HBASE-4095 URL: https://issues.apache.org/jira/browse/HBASE-4095 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.3 Reporter: Jieshan Bean Assignee: Jieshan Bean Fix For: 0.90.5 Attachments: HBASE-4095-90-v2.patch, HBASE-4095-90.patch, HBASE-4095-trunk-v2.patch, HBASE-4095-trunk.patch, HBase-4095-V4-Branch.patch, HBase-4095-V5-Branch.patch, HBase-4095-V5-trunk.patch, HBase-4095-V6-branch.patch, HBase-4095-V6-trunk.patch, HBase-4095-V7-branch.patch, HBase-4095-V7-trunk.patch, HBase-4095-V8-branch.patch, HBase-4095-V8-trunk.patch, HlogFileIsVeryLarge.gif, LatestLLTResults-20110810.rar, RelatedLogs2011-07-28.txt, TestResultForPatch-V4.rar, flowChart-IntroductionToThePatch.gif, hbase-root-regionserver-193-195-5-111.rar, surefire-report-V5-trunk.html, surefire-report-branch.html
[jira] [Created] (HBASE-4229) Replace Jettison JSON encoding with Jackson in HLogPrettyPrinter
Replace Jettison JSON encoding with Jackson in HLogPrettyPrinter Key: HBASE-4229 URL: https://issues.apache.org/jira/browse/HBASE-4229 Project: HBase Issue Type: Improvement Components: wal Reporter: Riley Patterson Assignee: Riley Patterson Priority: Trivial HBase makes use of both jackson (in the region server) and jettison (in HLogPrettyPrinter) for JSON encoding. Jackson seems to be better maintained, so this patch standardizes by using jackson in HLogPrettyPrinter instead of jettison. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4095) Hlog may not be rolled in a long time if checkLowReplication's request of LogRoll is blocked
[ https://issues.apache.org/jira/browse/HBASE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jieshan Bean updated HBASE-4095: Attachment: HBase-4095-V9-trunk.patch HBase-4095-V9-branch.patch
{quote} Should the setting of this.logRollRunning be set instead in a finally block in here in rollWriter? If an exception thrown after we set the logRollRunning, it looks like logRollRunning could stay set. My guess is that not doing this would probably not be noticed in that we probably crash out the regionserver if a rollWriter fails but having the flag stuck set might make for some unexpected state? {quote}
Yes, it's a good suggestion. It should be put in the finally block, and I've changed it.
{quote} So, it looks like we'll roll 5 times by default before we'll turn off the low replication log rolling facility - which is better than a log per sync, right? {quote}
Yes.
{quote} Do you have a good reason for changing this behavior? {quote}
Maybe we misunderstood this call:
{noformat}
int nextInitialReplication = fs.getFileStatus(newPath).getReplication();
{noformat}
It always returns the default replication value (I ran some tests to confirm this, and some HDFS experts affirmed it), no matter how many live datanodes there are or what the actual replication is.
{noformat}
this.minTolerableReplication = conf.getInt(
    "hbase.regionserver.hlog.tolerable.lowreplication",
    this.fs.getDefaultReplication());
{noformat}
So I added a new parameter, hbase.regionserver.hlog.tolerable.lowreplication. Suppose the default replication value is 3. Before the patch, rollWriter would be triggered as soon as the replication decreased. To some extent, that is unreasonable, because the remaining 2 replicas are sometimes tolerable too. So I made it configurable. That's why I changed the behavior.
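The two points in this reply, clearing the in-progress flag in a finally block and comparing against a configurable tolerable minimum rather than the initial replication, can be illustrated with a small stand-alone model. This is a sketch under assumptions, not the patch itself: `LogRollSketch` is hypothetical, does no I/O, and only mirrors the flag and threshold logic.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Simplified, hypothetical model of the log-roll bookkeeping discussed above;
// it is not the HLog class and performs no I/O.
class LogRollSketch {
  private final int minTolerableReplication;
  private final AtomicBoolean logRollRunning = new AtomicBoolean(false);
  private int rolls = 0;

  LogRollSketch(int minTolerableReplication) {
    this.minTolerableReplication = minTolerableReplication;
  }

  // Ask for a roll only when the replica count is known (non-zero)
  // and has dropped below the tolerable minimum.
  boolean needsRoll(int numCurrentReplicas) {
    return numCurrentReplicas != 0 && numCurrentReplicas < minTolerableReplication;
  }

  // Runs the roll action. The flag is cleared in finally, so a failing
  // roll cannot leave it stuck in the "running" state.
  void rollWriter(Runnable rollAction) {
    if (!logRollRunning.compareAndSet(false, true)) {
      return; // a roll is already in progress
    }
    try {
      rollAction.run();
      rolls++;
    } finally {
      logRollRunning.set(false);
    }
  }

  boolean isLogRollRunning() { return logRollRunning.get(); }

  int rolls() { return rolls; }
}
```

With a tolerable minimum of 2 and a default replication of 3, losing one replica no longer triggers a roll, while dropping to a single replica still does.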
Hlog may not be rolled in a long time if checkLowReplication's request of LogRoll is blocked Key: HBASE-4095 URL: https://issues.apache.org/jira/browse/HBASE-4095 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.3 Reporter: Jieshan Bean Assignee: Jieshan Bean Fix For: 0.90.5 Attachments: HBASE-4095-90-v2.patch, HBASE-4095-90.patch, HBASE-4095-trunk-v2.patch, HBASE-4095-trunk.patch, HBase-4095-V4-Branch.patch, HBase-4095-V5-Branch.patch, HBase-4095-V5-trunk.patch, HBase-4095-V6-branch.patch, HBase-4095-V6-trunk.patch, HBase-4095-V7-branch.patch, HBase-4095-V7-trunk.patch, HBase-4095-V8-branch.patch, HBase-4095-V8-trunk.patch, HBase-4095-V9-branch.patch, HBase-4095-V9-trunk.patch, HlogFileIsVeryLarge.gif, LatestLLTResults-20110810.rar, RelatedLogs2011-07-28.txt, TestResultForPatch-V4.rar, flowChart-IntroductionToThePatch.gif, hbase-root-regionserver-193-195-5-111.rar, surefire-report-V5-trunk.html, surefire-report-branch.html
[jira] [Updated] (HBASE-4229) Replace Jettison JSON encoding with Jackson in HLogPrettyPrinter
[ https://issues.apache.org/jira/browse/HBASE-4229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Riley Patterson updated HBASE-4229: --- Attachment: HBASE-4229.patch Also added jackson directly to our dependencies in pom.xml to guarantee that we get the same versions of the various jars. Avro and Jersey were bringing in different, incompatible versions and causing problems. Replace Jettison JSON encoding with Jackson in HLogPrettyPrinter Key: HBASE-4229 URL: https://issues.apache.org/jira/browse/HBASE-4229 Project: HBase Issue Type: Improvement Components: wal Reporter: Riley Patterson Assignee: Riley Patterson Priority: Trivial Attachments: HBASE-4229.patch HBase makes use of both jackson (in the region server) and jettison (in HLogPrettyPrinter) for JSON encoding. Jackson seems to be better maintained, so this patch standardizes by using jackson in HLogPrettyPrinter instead of jettison. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4071) Data GC: Remove all versions TTL EXCEPT the last written version
[ https://issues.apache.org/jira/browse/HBASE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087454#comment-13087454 ] jirapos...@reviews.apache.org commented on HBASE-4071: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1582/#review1542 --- http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java https://reviews.apache.org/r/1582/#comment3513 Still using HBaseTestCase here, because it gave me all the lowlevel stuff that I needed. If folks feel strongly I'll add what I need to HBaseTestingUtility and write jUnit4 tests instead. - Lars On 2011-08-18 23:01:12, Lars Hofhansl wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/1582/ bq. --- bq. bq. (Updated 2011-08-18 23:01:12) bq. bq. bq. Review request for hbase, Todd Lipcon, Michael Stack, and Ian Varley. bq. bq. bq. Summary bq. --- bq. bq. Allow enforcing a minimum number of versions when TTL is enable for a store. bq. The GC logic for both versions and TTL is unified inside the ColumnTrackers. bq. bq. bq. This addresses bug HBASE-4071. bq. https://issues.apache.org/jira/browse/HBASE-4071 bq. bq. bq. Diffs bq. - bq. bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1159317 bq.http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1159317 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1159317 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1159317 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java PRE-CREATION bq. 
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1159317 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1159317 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1159317 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1159317 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1159317 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1159317 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1159317 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1159317 bq. bq. Diff: https://reviews.apache.org/r/1582/diff bq. bq. bq. Testing bq. --- bq. bq. Ran all tests. I get error (not failures) from two: TestDistributedLogSplitting and TestHTablePool. Both fail with or without my changes. bq. New tests: TestMinVersions. bq. bq. bq. Thanks, bq. bq. Lars bq. bq. Data GC: Remove all versions TTL EXCEPT the last written version -- Key: HBASE-4071 URL: https://issues.apache.org/jira/browse/HBASE-4071 Project: HBase Issue Type: New Feature Reporter: stack Attachments: MinVersions.diff We were chatting today about our backup cluster. What we want is to be able to restore the dataset from any point of time but only within a limited timeframe -- say one week. Thereafter, if the versions are older than one week, rather than as we do with TTL where we let go of all versions older than TTL, instead, let go of all versions EXCEPT the last one written. So, its like versions==1 when TTL one week. 
We want to allow that if an error is caught within a week of its happening -- user mistakenly removes a critical table -- then we'll be able to restore up to the moment just before catastrophe hit; otherwise, we keep one version only.
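The retention rule requested here (expire versions by TTL, but never drop below a minimum number of versions) can be sketched in a few lines. This is an illustration under assumptions, not the code under review: `MinVersionsSketch` is hypothetical, it looks at the timestamps of a single column only, and it ignores the usual maximum-versions limit and delete markers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "TTL with a minimum number of versions" for one column.
class MinVersionsSketch {
  // timestampsNewestFirst: version timestamps sorted from newest to oldest.
  static List<Long> retained(List<Long> timestampsNewestFirst,
                             long now, long ttlMillis, int minVersions) {
    List<Long> kept = new ArrayList<Long>();
    for (long ts : timestampsNewestFirst) {
      boolean expired = now - ts > ttlMillis;
      // Keep everything inside the TTL window; past it, keep only as many
      // of the newest versions as are needed to reach minVersions.
      if (!expired || kept.size() < minVersions) {
        kept.add(ts);
      }
    }
    return kept;
  }
}
```

With `minVersions = 1` and a one-week TTL this gives the behaviour described in the issue: full history inside the week, and exactly the last written version for data that has not changed since.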
[jira] [Commented] (HBASE-4008) Problem while stopping HBase
[ https://issues.apache.org/jira/browse/HBASE-4008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087465#comment-13087465 ] Akash Ashok commented on HBASE-4008: OK, sure. This patch works for me. Thanks. Also, about including isAborted in the Abortable interface: is there a JIRA already open for this? Could I take that up? Problem while stopping HBase Key: HBASE-4008 URL: https://issues.apache.org/jira/browse/HBASE-4008 Project: HBase Issue Type: Bug Components: scripts Reporter: Akash Ashok Assignee: Akash Ashok Labels: HMaster Fix For: 0.92.0 Attachments: HBase-4008-v2.patch, HBase-4008.patch stop-hbase.sh stops the server successfully if and only if the server is instantiated properly. When you run start-hbase.sh; sleep 10; stop-hbase.sh; this works totally fine and has no issues. Whereas when you run start-hbase.sh; stop-hbase.sh; the server is never stopped, and it also never gets initialized or starts properly.
[jira] [Commented] (HBASE-4199) blockCache summary - backend
[ https://issues.apache.org/jira/browse/HBASE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087469#comment-13087469 ] Doug Meil commented on HBASE-4199: -- 1. BlockCacheSummaryEntry name: I'm struggling with the name. 'BlockCacheSummary' makes it seem like you should get exactly one object back (i.e., a summary). So what 'BlockCacheSummaryEntry' is intended to be is an entry in the summary. Is BlockCacheTableCfSummary a better class name? 2. RegionServer returning Lists: Can the interface do this? I thought that everything returned had to implement Writable. If RegionServers can return Lists, then I think that's a better structure. 3. Eclipse-generated methods: equals, hashCode, the Writable signature (e.g., arg0), and toString were all generated by Eclipse. 4. Cost of method: Yeah, this is not an inexpensive operation, but adding that method to FsUtils was the only way I could find to get the lookup from StoreFile to CF/Table. That could be cached, but there are all sorts of issues with cache invalidation every time a new store file is written. 5. Default values: JD told me that BlockCacheSummaryEntry should have those with default values for the purpose of serialization (same for the default constructor)! :-) blockCache summary - backend Key: HBASE-4199 URL: https://issues.apache.org/jira/browse/HBASE-4199 Project: HBase Issue Type: Sub-task Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: java_HBASE_4199.patch, java_HBASE_4199_v2.patch, java_HBASE_4199_v3.patch This is the backend work for the blockCache summary. Change to BlockCache interface, Summarization in LruBlockCache, BlockCacheSummaryEntry, addition to HRegionInterface, and HRegionServer. This will NOT include any of the web UI or anything else like that. That is for another sub-task.
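Point 5 (default field values and a no-argument constructor for the sake of serialization) can be illustrated with a class that has the same write/readFields shape as Hadoop's Writable interface but depends only on java.io. The class name, fields, and helper below are hypothetical and are not taken from the patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical summary entry with Writable-style serialization methods.
class SummaryEntrySketch {
  // Non-null defaults: writeUTF(null) would throw, so an instance created
  // through the no-arg constructor can still be serialized safely.
  private String table = "";
  private String columnFamily = "";
  private int blocks = 0;
  private long heapSize = 0L;

  // The deserializing side creates an empty instance first, then calls readFields.
  SummaryEntrySketch() {}

  SummaryEntrySketch(String table, String columnFamily, int blocks, long heapSize) {
    this.table = table;
    this.columnFamily = columnFamily;
    this.blocks = blocks;
    this.heapSize = heapSize;
  }

  void write(DataOutput out) throws IOException {
    out.writeUTF(table);
    out.writeUTF(columnFamily);
    out.writeInt(blocks);
    out.writeLong(heapSize);
  }

  void readFields(DataInput in) throws IOException {
    table = in.readUTF();
    columnFamily = in.readUTF();
    blocks = in.readInt();
    heapSize = in.readLong();
  }

  // Serializes and deserializes through a byte array, as an RPC layer would.
  static SummaryEntrySketch roundTrip(SummaryEntrySketch entry) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      entry.write(new DataOutputStream(bytes));
      SummaryEntrySketch copy = new SummaryEntrySketch();
      copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  @Override
  public String toString() {
    return table + "/" + columnFamily + " blocks=" + blocks + " heapSize=" + heapSize;
  }
}
```

The alternative raised in the follow-up, checking for null while serializing, would work as well; defaults simply keep the write path free of special cases.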
[jira] [Commented] (HBASE-4199) blockCache summary - backend
[ https://issues.apache.org/jira/browse/HBASE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087474#comment-13087474 ] stack commented on HBASE-4199: -- On 1., BlockCacheColumnFamilySummary? CF needs the table to qualify it but scope is column family? On 2., Normally that is so but looking in the Interfaces, e.g. HRegionInterface.java, I see that some methods take Lists. This is the class that does the serialization: src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java It seems to do some special stuff to do a List. Presumes List of Writables. On 3., Yeah. That was obvious (smile). On 4., Yes. You think it will be invoked frequently? On 5., Hmm... yeah, or just check for null when serializing. Good stuff. blockCache summary - backend Key: HBASE-4199 URL: https://issues.apache.org/jira/browse/HBASE-4199 Project: HBase Issue Type: Sub-task Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: java_HBASE_4199.patch, java_HBASE_4199_v2.patch, java_HBASE_4199_v3.patch This is the backend work for the blockCache summary. Change to BlockCache interface, Summarization in LruBlockCache, BlockCacheSummaryEntry, addition to HRegionInterface, and HRegionServer. This will NOT include any of the web UI or anything else like that. That is for another sub-task.
[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087479#comment-13087479 ] jirapos...@reviews.apache.org commented on HBASE-4027: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1214/ --- (Updated 2011-08-19 03:05:29.951253) Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, Jonathan Gray, and Li Pi. Changes --- Cacheable interface now far less confusing. HFileBlock has reformatting reverted. Summary --- Review request - I apparently can't edit tlipcon's earlier posting of my diff, so creating a new one. This addresses bug HBase-4027. https://issues.apache.org/jira/browse/HBase-4027 Diffs (updated) - src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java 097dc50 src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java 1338453 src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java 886c31d CHANGES.txt 763ddbc conf/hbase-env.sh 2d55d27 src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java 2d4002c src/main/java/org/apache/hadoop/hbase/io/hfile/CacheStats.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/Cacheable.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/CachedBlock.java 3b130d8 src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/slab/Slab.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabItemEvictionWatcher.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java 9a71fdf src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java e2c6c93 src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 7b7bf73 
src/main/java/org/apache/hadoop/hbase/util/DirectMemoryUtils.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/util/FSUtils.java 431f313 src/test/java/org/apache/hadoop/hbase/TestFSTableDescriptorForceCreation.java 8a69a39 src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/io/hfile/TestCachedBlockQueue.java 1ad2ece src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java f0a9832 src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSingleSizeCache.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlab.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlabCache.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStoreLAB.java d7e43a0 src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 4387170 Diff: https://reviews.apache.org/r/1214/diff Testing --- Ran benchmarks against it in HBase standalone mode. Wrote test cases for all classes, multithreaded test cases exist for the cache. Thanks, Li Enable direct byte buffers LruBlockCache Key: HBASE-4027 URL: https://issues.apache.org/jira/browse/HBASE-4027 Project: HBase Issue Type: Improvement Reporter: Jason Rutherglen Assignee: Li Pi Priority: Minor Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, HBase-4027.pdf, HBase4027v8.diff, HBase4027v9.diff, hbase-4027-v10.5.diff, hbase-4027-v10.diff, hbase-4027v10.6.diff, hbase-4027v6.diff, hbase4027v11.5.diff, hbase4027v11.6.diff, hbase4027v11.7.diff, hbase4027v11.diff, hbase4027v12.1.diff, hbase4027v12.diff, slabcachepatch.diff, slabcachepatchv2.diff, slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, slabcachepatchv3.diff, slabcachepatchv4.5.diff, slabcachepatchv4.diff Java offers the creation of direct byte buffers which are allocated outside of the heap. They need to be manually free'd, which can be accomplished using an undocumented {{clean}} method. The feature will be optional. 
After implementing, we can benchmark for differences in speed and garbage collection observances.
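
For readers unfamiliar with direct byte buffers, here is a minimal sketch of what the issue description refers to. It is not code from the patch: the class and method names (DirectBufferSketch, allocateOffHeap, roundTrip) are hypothetical, and the sketch only uses the standard java.nio.ByteBuffer API. It deliberately leaves out the manual freeing step the description mentions, since that relies on JVM internals.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical illustration only; not the SlabCache code from the patch. */
final class DirectBufferSketch {

    /** Allocates a buffer whose storage lives outside the Java heap. */
    static ByteBuffer allocateOffHeap(int capacityBytes) {
        return ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Copies a block into the off-heap buffer, then reads it back onto the heap. */
    static byte[] roundTrip(ByteBuffer offHeap, byte[] block) {
        // duplicate() shares the memory but has its own position and limit.
        ByteBuffer view = offHeap.duplicate();
        view.clear();
        view.put(block);
        view.flip();
        byte[] copy = new byte[view.remaining()];
        view.get(copy);
        return copy;
    }

    public static void main(String[] args) {
        ByteBuffer slab = allocateOffHeap(1024);
        byte[] block = {1, 2, 3, 4};
        System.out.println("direct=" + slab.isDirect()
            + " roundTripOk=" + Arrays.equals(block, roundTrip(slab, block)));
    }
}
```

Because the cached bytes sit outside the heap, the garbage collector does not have to trace them, which is the effect the proposed benchmarks would measure.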
[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087482#comment-13087482 ] jirapos...@reviews.apache.org commented on HBASE-4027: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1214/#review1543 --- Todd: I'll get the fully serialized model out by tonight. This still leaves around 60 bytes of stuff on the heap. - Li
Enable direct byte buffers LruBlockCache Key: HBASE-4027 URL: https://issues.apache.org/jira/browse/HBASE-4027 Project: HBase Issue Type: Improvement Reporter: Jason Rutherglen Assignee: Li Pi Priority: Minor Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, HBase-4027.pdf, HBase4027v8.diff, HBase4027v9.diff, hbase-4027-v10.5.diff, hbase-4027-v10.diff, hbase-4027v10.6.diff, hbase-4027v6.diff, hbase4027v11.5.diff, hbase4027v11.6.diff, hbase4027v11.7.diff, hbase4027v11.diff, hbase4027v12.1.diff, hbase4027v12.diff, slabcachepatch.diff, slabcachepatchv2.diff, slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, slabcachepatchv3.diff, slabcachepatchv4.5.diff, slabcachepatchv4.diff Java offers the creation of direct byte buffers which are allocated outside of the heap. They need to be manually free'd, which can be accomplished using
[jira] [Commented] (HBASE-4008) Problem while stopping HBase
[ https://issues.apache.org/jira/browse/HBASE-4008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087494#comment-13087494 ] stack commented on HBASE-4008: -- bq. Also about including the isAborted in the Abortable interface is there a JIRA already open for this? Could I take that up ? No. Please do. Problem while stopping HBase Key: HBASE-4008 URL: https://issues.apache.org/jira/browse/HBASE-4008 Project: HBase Issue Type: Bug Components: scripts Reporter: Akash Ashok Assignee: Akash Ashok Labels: HMaster Fix For: 0.92.0 Attachments: HBase-4008-v2.patch, HBase-4008.patch stop-hbase.sh stops the server successfully if and only if the server is instantiated properly. When you run start-hbase.sh; sleep 10; stop-hbase.sh; (this works fine and has no issues). Whereas when you run start-hbase.sh; stop-hbase.sh; (this never stops the server, and the server never gets initialized and started properly). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4008) Problem while stopping HBase
[ https://issues.apache.org/jira/browse/HBASE-4008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4008: - Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed to trunk. Thanks for the patch Akash. Problem while stopping HBase Key: HBASE-4008 URL: https://issues.apache.org/jira/browse/HBASE-4008 Project: HBase Issue Type: Bug Components: scripts Reporter: Akash Ashok Assignee: Akash Ashok Labels: HMaster Fix For: 0.92.0 Attachments: HBase-4008-v2.patch, HBase-4008.patch stop-hbase.sh stops the server successfully if and only if the server is instantiated properly. When you run start-hbase.sh; sleep 10; stop-hbase.sh; (this works fine and has no issues). Whereas when you run start-hbase.sh; stop-hbase.sh; (this never stops the server, and the server never gets initialized and started properly). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4227) Modify the webUI so that default values of column families are not shown
[ https://issues.apache.org/jira/browse/HBASE-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nileema shingte updated HBASE-4227: --- Release Note: This patch addresses the following: 1. On the master web UI, the table description lists out only column families with non-default values. 2. A details link through which the complete table description is available. Status: Patch Available (was: Open) Modify the webUI so that default values of column families are not shown Key: HBASE-4227 URL: https://issues.apache.org/jira/browse/HBASE-4227 Project: HBase Issue Type: Improvement Reporter: Nicolas Spiegelberg Assignee: nileema shingte Priority: Minor Fix For: 0.92.0 Attachments: HBASE-4227.patch With the introduction of online schema changes, it will become more advantageous to put configuration knobs at the column family level vs global configuration settings. This will create a nasty web UI experience for showing table properties unless we default to showing the custom values instead of all values. It's on the table if we want to modify the shell's 'describe' method as well. scan '.META.' should definitely return the full properties however. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4229) Replace Jettison JSON encoding with Jackson in HLogPrettyPrinter
[ https://issues.apache.org/jira/browse/HBASE-4229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4229: - Fix Version/s: 0.92.0 Status: Patch Available (was: Open) Marking patch available Replace Jettison JSON encoding with Jackson in HLogPrettyPrinter Key: HBASE-4229 URL: https://issues.apache.org/jira/browse/HBASE-4229 Project: HBase Issue Type: Improvement Components: wal Reporter: Riley Patterson Assignee: Riley Patterson Priority: Trivial Fix For: 0.92.0 Attachments: HBASE-4229.patch HBase makes use of both jackson (in the region server) and jettison (in HLogPrettyPrinter) for JSON encoding. Jackson seems to be better maintained, so this patch standardizes by using jackson in HLogPrettyPrinter instead of jettison. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4229) Replace Jettison JSON encoding with Jackson in HLogPrettyPrinter
[ https://issues.apache.org/jira/browse/HBASE-4229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087497#comment-13087497 ] stack commented on HBASE-4229: -- I am +1 on the patch. It looks great. Thanks for going back and retrofitting. I'm letting this patch hang a while because it would be good if Andrew took a looksee. Meantime running tests to make sure avro or rest don't barf. Good one Riley. Replace Jettison JSON encoding with Jackson in HLogPrettyPrinter Key: HBASE-4229 URL: https://issues.apache.org/jira/browse/HBASE-4229 Project: HBase Issue Type: Improvement Components: wal Reporter: Riley Patterson Assignee: Riley Patterson Priority: Trivial Fix For: 0.92.0 Attachments: HBASE-4229.patch HBase makes use of both jackson (in the region server) and jettison (in HLogPrettyPrinter) for JSON encoding. Jackson seems to be better maintained, so this patch standardizes by using jackson in HLogPrettyPrinter instead of jettison. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4227) Modify the webUI so that default values of column families are not shown
[ https://issues.apache.org/jira/browse/HBASE-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087500#comment-13087500 ] stack commented on HBASE-4227: -- Looks great. getDefaultValues looks like it should be a static map though, what do you think, Nileema? Something like:
{code}
private final static Map<String, String> DEFAULT_VALUES = new HashMap<String, String>();
static {
  DEFAULT_VALUES.put(BLOOMFILTER, DEFAULT_BLOOMFILTER);
  DEFAULT_VALUES.put(REPLICATION_SCOPE, String.valueOf(DEFAULT_REPLICATION_SCOPE));
  DEFAULT_VALUES.put(HConstants.VERSIONS, String.valueOf(DEFAULT_VERSIONS));
  DEFAULT_VALUES.put(COMPRESSION, DEFAULT_COMPRESSION);
  DEFAULT_VALUES.put(TTL, String.valueOf(DEFAULT_TTL));
  DEFAULT_VALUES.put(BLOCKSIZE, String.valueOf(DEFAULT_BLOCKSIZE));
  DEFAULT_VALUES.put(HConstants.IN_MEMORY, String.valueOf(DEFAULT_IN_MEMORY));
  DEFAULT_VALUES.put(BLOCKCACHE, String.valueOf(DEFAULT_BLOCKCACHE));
}
{code}
No biggie. I can commit w/o it but if you make a new patch that'd be good too. Thanks. Modify the webUI so that default values of column families are not shown Key: HBASE-4227 URL: https://issues.apache.org/jira/browse/HBASE-4227 Project: HBase Issue Type: Improvement Reporter: Nicolas Spiegelberg Assignee: nileema shingte Priority: Minor Fix For: 0.92.0 Attachments: HBASE-4227.patch With the introduction of online schema changes, it will become more advantageous to put configuration knobs at the column family level vs global configuration settings. This will create a nasty web UI experience for showing table properties unless we default to showing the custom values instead of all values. It's on the table if we want to modify the shell's 'describe' method as well. scan '.META.' should definitely return the full properties however. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
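
To make the suggestion in this comment concrete, here is a self-contained sketch of how a static defaults map could drive the "show only non-default values" behaviour the issue asks for. Everything in it is an assumption for illustration: the class ColumnFamilyDefaultsSketch, the method nonDefaultValues, and the three keys and default values are placeholders, not the real HColumnDescriptor constants or the code in HBASE-4227.patch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration; names and default values are placeholders. */
final class ColumnFamilyDefaultsSketch {

    /** Built once at class-load time instead of on every call. */
    private static final Map<String, String> DEFAULT_VALUES;
    static {
        Map<String, String> defaults = new HashMap<String, String>();
        defaults.put("VERSIONS", "3");         // placeholder default
        defaults.put("COMPRESSION", "NONE");   // placeholder default
        defaults.put("IN_MEMORY", "false");    // placeholder default
        DEFAULT_VALUES = Collections.unmodifiableMap(defaults);
    }

    /** Returns only the settings that differ from, or are missing in, the defaults. */
    static Map<String, String> nonDefaultValues(Map<String, String> configured) {
        Map<String, String> result = new TreeMap<String, String>();
        for (Map.Entry<String, String> entry : configured.entrySet()) {
            String defaultValue = DEFAULT_VALUES.get(entry.getKey());
            if (defaultValue == null || !defaultValue.equals(entry.getValue())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> family = new HashMap<String, String>();
        family.put("VERSIONS", "3");
        family.put("COMPRESSION", "GZ");
        System.out.println(nonDefaultValues(family));
    }
}
```

A static, unmodifiable map avoids rebuilding the defaults for every table row rendered in the web UI, which appears to be the motivation behind the review comment.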
[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)
[ https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087503#comment-13087503 ] Subbu M Iyer commented on HBASE-4213: - Attached version 1 with most of call outs addressed. Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign) Key: HBASE-4213 URL: https://issues.apache.org/jira/browse/HBASE-4213 Project: HBase Issue Type: Improvement Reporter: Subbu M Iyer Assignee: Subbu M Iyer Fix For: 0.92.0 Attachments: HBASE-4213-Instant_schema_change.patch, HBASE-4213_Instant_schema_change_-Version_2_.patch This Jira is a slight variation in approach to what is being done as part of https://issues.apache.org/jira/browse/HBASE-1730 Support instant schema updates such as Modify Table, Add Column, Modify Column operations: 1. With out enable/disabling the table. 2. With out bulk unassign/assign of regions. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)
[ https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subbu M Iyer updated HBASE-4213: Attachment: HBASE-4213_Instant_schema_change_-Version_2_.patch Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign) Key: HBASE-4213 URL: https://issues.apache.org/jira/browse/HBASE-4213 Project: HBase Issue Type: Improvement Reporter: Subbu M Iyer Assignee: Subbu M Iyer Fix For: 0.92.0 Attachments: HBASE-4213-Instant_schema_change.patch, HBASE-4213_Instant_schema_change_-Version_2_.patch This Jira is a slight variation in approach to what is being done as part of https://issues.apache.org/jira/browse/HBASE-1730 Support instant schema updates such as Modify Table, Add Column, Modify Column operations: 1. With out enable/disabling the table. 2. With out bulk unassign/assign of regions. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4095) Hlog may not be rolled in a long time if checkLowReplication's request of LogRoll is blocked
[ https://issues.apache.org/jira/browse/HBASE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4095: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed branch and trunk. Thank you for your perseverance Jieshan. Hlog may not be rolled in a long time if checkLowReplication's request of LogRoll is blocked Key: HBASE-4095 URL: https://issues.apache.org/jira/browse/HBASE-4095 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.3 Reporter: Jieshan Bean Assignee: Jieshan Bean Fix For: 0.90.5 Attachments: HBASE-4095-90-v2.patch, HBASE-4095-90.patch, HBASE-4095-trunk-v2.patch, HBASE-4095-trunk.patch, HBase-4095-V4-Branch.patch, HBase-4095-V5-Branch.patch, HBase-4095-V5-trunk.patch, HBase-4095-V6-branch.patch, HBase-4095-V6-trunk.patch, HBase-4095-V7-branch.patch, HBase-4095-V7-trunk.patch, HBase-4095-V8-branch.patch, HBase-4095-V8-trunk.patch, HBase-4095-V9-branch.patch, HBase-4095-V9-trunk.patch, HlogFileIsVeryLarge.gif, LatestLLTResults-20110810.rar, RelatedLogs2011-07-28.txt, TestResultForPatch-V4.rar, flowChart-IntroductionToThePatch.gif, hbase-root-regionserver-193-195-5-111.rar, surefire-report-V5-trunk.html, surefire-report-branch.html
Some large HLog files (larger than 10 GB) appeared in our environment, and I found the reason why they got so huge:
1. The number of replicas is less than the expected number, so the method checkLowReplication is called on each sync.
2. checkLowReplication requests a log roll first and sets logRollRequested to true:
{noformat}
private void checkLowReplication() {
  // if the number of replicas in HDFS has fallen below the initial
  // value, then roll logs.
  try {
    int numCurrentReplicas = getLogReplication();
    if (numCurrentReplicas != 0 && numCurrentReplicas < this.initialReplication) {
      LOG.warn("HDFS pipeline error detected. " + "Found " + numCurrentReplicas + " replicas but expecting " + this.initialReplication + " replicas. " + " Requesting close of hlog.");
      requestLogRoll();
      logRollRequested = true;
    }
  } catch (Exception e) {
    LOG.warn("Unable to invoke DFSOutputStream.getNumCurrentReplicas" + e + " still proceeding ahead...");
  }
}
{noformat}
3. requestLogRoll() only submits the roll request. It may not execute in time, because it must acquire the unfair lock cacheFlushLock, and that lock may be held by the cache-flush threads.
4. logRollRequested stays true until the log roll executes, so during that time each log-roll request in sync() is skipped.
Here are the logs from when the problem happened (please notice the file size of hlog 193-195-5-111%3A20020.1309937386639 in the last row):
2011-07-06 15:28:59,284 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 2 replicas but expecting 3 replicas. Requesting close of hlog.
2011-07-06 15:29:46,714 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937339119, entries=32434, filesize=239589754. New hlog /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937386639
2011-07-06 15:29:56,929 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 2 replicas but expecting 3 replicas. Requesting close of hlog.
2011-07-06 15:29:56,933 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming flushed file at hdfs://193.195.5.112:9000/hbase/Htable_UFDR_034/a3780cf0c909d8cf8f8ed618b290cc95/.tmp/4656903854447026847 to hdfs://193.195.5.112:9000/hbase/Htable_UFDR_034/a3780cf0c909d8cf8f8ed618b290cc95/value/8603005630220380983 2011-07-06 15:29:57,391 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://193.195.5.112:9000/hbase/Htable_UFDR_034/a3780cf0c909d8cf8f8ed618b290cc95/value/8603005630220380983, entries=445880, sequenceid=248900, memsize=207.5m, filesize=130.1m 2011-07-06 15:29:57,478 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~207.5m for region Htable_UFDR_034,07664,1309936974158.a3780cf0c909d8cf8f8ed618b290cc95. in 10839ms, sequenceid=248900, compaction requested=false 2011-07-06 15:28:59,236 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309926531955, entries=216459, filesize=2370387468. New hlog /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937339119 2011-07-06 15:29:46,714 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll
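
The failure mode described in points 3 and 4 of this report can be modelled with a toy class. This is only a sketch under stated assumptions: LogRollFlagSketch and its methods are hypothetical names, there are no real threads or locks, and it does not reproduce the HLog code or the committed fix. It only shows how a "roll already requested" flag causes later low-replication checks to be skipped while the roller is blocked.

```java
/** Hypothetical toy model of the stuck-flag behaviour; not HBase code. */
final class LogRollFlagSketch {
    private boolean logRollRequested = false;
    private int rollRequests = 0;
    private int skippedChecks = 0;

    /** Called on every sync; skips the replication check while a roll is pending. */
    void sync(boolean lowReplication) {
        if (logRollRequested) {
            skippedChecks++;
            return;
        }
        if (lowReplication) {
            rollRequests++;
            logRollRequested = true;
        }
    }

    /** Called once the roller finally obtains the flush lock and rolls the log. */
    void rollCompleted() {
        logRollRequested = false;
    }

    int rollRequests() { return rollRequests; }

    int skippedChecks() { return skippedChecks; }

    public static void main(String[] args) {
        LogRollFlagSketch log = new LogRollFlagSketch();
        // The roller is blocked for five syncs in a row.
        for (int i = 0; i < 5; i++) {
            log.sync(true);
        }
        System.out.println("requests=" + log.rollRequests()
            + " skipped=" + log.skippedChecks());
        log.rollCompleted();
        log.sync(true); // only now can a new roll be requested
        System.out.println("requests=" + log.rollRequests());
    }
}
```

In the model, edits keep arriving during the skipped checks, which is how a single log file can grow to the sizes shown in the log excerpt.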
[jira] [Updated] (HBASE-2399) Forced splits only act on the first family in a table
[ https://issues.apache.org/jira/browse/HBASE-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HBASE-2399: --- Resolution: Fixed Status: Resolved (was: Patch Available) Forced splits only act on the first family in a table - Key: HBASE-2399 URL: https://issues.apache.org/jira/browse/HBASE-2399 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.20.3 Reporter: Jonathan Gray Assignee: Ming Ma Priority: Critical Labels: moved_from_0_20_5 Fix For: 0.92.0 Attachments: HBASE-2399-test-v1.patch, HBASE-2399-trunk.patch While working on a patch for HBASE-2375, I came across a few bugs in the existing code related to splits. If a user triggers a manual split, it flips a forceSplit boolean to true and then triggers a compaction (this is very similar to my current implementation for HBASE-2375). However, the forceSplit boolean is flipped back to false at the beginning of Store.compact(). So the force split only acts on the first family in the table. If that Store is not splittable for some reason (it is empty or has only one row), then the entire region will not be split, regardless of what is in other families. Even if there is data in the first family, the midKey is determined based solely on that family. If it has two rows and the next family has 1M rows, we pick the split key based on the two rows. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
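
The bug described above can be reduced to a few lines. The sketch below is a hypothetical model, not the HRegion or Store code: ForceSplitSketch and familiesSeeingForceSplit are invented names, and each "store" is just a family name. It shows why resetting a shared flag at the start of the first store's compaction means later families never observe the forced split.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical toy model of the forced-split flag reset; not HBase code. */
final class ForceSplitSketch {

    /** Shared per-region flag, set when a user requests a manual split. */
    private boolean forceSplit;

    /** Models a store compaction: reads the flag, then clears it at the start. */
    private boolean compactStore() {
        boolean sawForceSplit = forceSplit;
        forceSplit = false; // the reset the report describes
        return sawForceSplit;
    }

    /** Returns the families whose compaction actually observed the forced split. */
    static List<String> familiesSeeingForceSplit(List<String> families) {
        ForceSplitSketch region = new ForceSplitSketch();
        region.forceSplit = true; // user triggers a manual split
        List<String> seen = new ArrayList<String>();
        for (String family : families) {
            if (region.compactStore()) {
                seen.add(family);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(familiesSeeingForceSplit(Arrays.asList("cf1", "cf2", "cf3")));
    }
}
```

If the first family is empty or has a single row, the region is not split at all, even when the other families hold most of the data, which matches the behaviour reported in the issue.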
[jira] [Commented] (HBASE-4227) Modify the webUI so that default values of column families are not shown
[ https://issues.apache.org/jira/browse/HBASE-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087506#comment-13087506 ] nileema shingte commented on HBASE-4227: Yes, it should be a static map. I will send across a new patch for this. Thanks for the prompt review, Stack! Modify the webUI so that default values of column families are not shown Key: HBASE-4227 URL: https://issues.apache.org/jira/browse/HBASE-4227 Project: HBase Issue Type: Improvement Reporter: Nicolas Spiegelberg Assignee: nileema shingte Priority: Minor Fix For: 0.92.0 Attachments: HBASE-4227.patch With the introduction of online schema changes, it will become more advantageous to put configuration knobs at the column family level vs global configuration settings. This will create a nasty web UI experience for showing table properties unless we default to showing the custom values instead of all values. It's on the table if we want to modify the shell's 'describe' method as well. scan '.META.' should definitely return the full properties however. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)
[ https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087507#comment-13087507 ] Subbu M Iyer commented on HBASE-4213: -
bq. If a new region for this table comes in meantime, it'll be ok? It'll find new schema?
Yes. New regions are perfectly fine as they will see new HTD.
bq. What happens if master decides to balance a region at this time? Or disable it? While reopenRegions is running?
Hmm. This one is interesting. Let me take a closer look at how to address it. Any suggestions?
bq. What Ted says on delete cf.
TBD. I am on it.
bq. Can you look in zk or something rather than wait on a timer before moving on?
Yup. Will do.
Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign) Key: HBASE-4213 URL: https://issues.apache.org/jira/browse/HBASE-4213 Project: HBase Issue Type: Improvement Reporter: Subbu M Iyer Assignee: Subbu M Iyer Fix For: 0.92.0 Attachments: HBASE-4213-Instant_schema_change.patch, HBASE-4213_Instant_schema_change_-Version_2_.patch This Jira is a slight variation in approach to what is being done as part of https://issues.apache.org/jira/browse/HBASE-1730 Support instant schema updates such as Modify Table, Add Column, Modify Column operations: 1. With out enable/disabling the table. 2. With out bulk unassign/assign of regions. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087511#comment-13087511 ] jirapos...@reviews.apache.org commented on HBASE-4027: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1214/#review1544 --- Cacheable interface is much more intuitive now. src/main/java/org/apache/hadoop/hbase/io/hfile/Cacheable.java https://reviews.apache.org/r/1214/#comment3516 Change an to the. src/main/java/org/apache/hadoop/hbase/io/hfile/Cacheable.java https://reviews.apache.org/r/1214/#comment3515 Do I see an incomplete sentence here ? src/main/java/org/apache/hadoop/hbase/io/hfile/Cacheable.java https://reviews.apache.org/r/1214/#comment3514 If self is always returned, why do we need the return value here ? - Ted Enable direct byte buffers LruBlockCache Key: HBASE-4027 URL: https://issues.apache.org/jira/browse/HBASE-4027 Project: HBase Issue Type: Improvement Reporter: Jason Rutherglen Assignee: Li Pi Priority: Minor Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, HBase-4027.pdf, HBase4027v8.diff, HBase4027v9.diff, hbase-4027-v10.5.diff, hbase-4027-v10.diff, hbase-4027v10.6.diff, hbase-4027v6.diff, hbase4027v11.5.diff, hbase4027v11.6.diff,
[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)
[ https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087514#comment-13087514 ] Subbu M Iyer commented on HBASE-4213: - I tested delete family from the shell and it worked without any additional support. I just found that TableDeleteFamilyHandler is using event type EventType.C_M_ADD_FAMILY instead of EventType.C_M_DELETE_FAMILY. So delete family is already taken care of, but I will fix the DeleteHandler to use the correct event type as well as incorporate the required changes to reflect the same during instant schema change. Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign) Key: HBASE-4213 URL: https://issues.apache.org/jira/browse/HBASE-4213 Project: HBase Issue Type: Improvement Reporter: Subbu M Iyer Assignee: Subbu M Iyer Fix For: 0.92.0 Attachments: HBASE-4213-Instant_schema_change.patch, HBASE-4213_Instant_schema_change_-Version_2_.patch This Jira is a slight variation in approach to what is being done as part of https://issues.apache.org/jira/browse/HBASE-1730 Support instant schema updates such as Modify Table, Add Column, Modify Column operations: 1. With out enable/disabling the table. 2. With out bulk unassign/assign of regions. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087517#comment-13087517 ] jirapos...@reviews.apache.org commented on HBASE-4027: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1214/ --- (Updated 2011-08-19 04:16:44.682053) Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, Jonathan Gray, and Li Pi. Changes --- fixed as per ted yu's comments above. Summary --- Review request - I apparently can't edit tlipcon's earlier posting of my diff, so creating a new one. This addresses bug HBase-4027. https://issues.apache.org/jira/browse/HBase-4027 Diffs (updated) - CHANGES.txt 0478003 conf/hbase-env.sh 2d55d27 src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java 2d4002c src/main/java/org/apache/hadoop/hbase/io/hfile/CacheStats.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/Cacheable.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/CachedBlock.java 3b130d8 src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java 097dc50 src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java 1338453 src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java 886c31d src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/slab/Slab.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabItemEvictionWatcher.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java e2c6c93 src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 7b7bf73 src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java c301d1b src/main/java/org/apache/hadoop/hbase/util/DirectMemoryUtils.java PRE-CREATION 
src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/io/hfile/TestCachedBlockQueue.java 1ad2ece src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java f0a9832 src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSingleSizeCache.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlab.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlabCache.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStoreLAB.java d7e43a0 src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 4387170 src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java 5063896 Diff: https://reviews.apache.org/r/1214/diff Testing --- Ran benchmarks against it in HBase standalone mode. Wrote test cases for all classes, multithreaded test cases exist for the cache. Thanks, Li Enable direct byte buffers LruBlockCache Key: HBASE-4027 URL: https://issues.apache.org/jira/browse/HBASE-4027 Project: HBase Issue Type: Improvement Reporter: Jason Rutherglen Assignee: Li Pi Priority: Minor Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, HBase-4027.pdf, HBase4027v8.diff, HBase4027v9.diff, hbase-4027-v10.5.diff, hbase-4027-v10.diff, hbase-4027v10.6.diff, hbase-4027v6.diff, hbase4027v11.5.diff, hbase4027v11.6.diff, hbase4027v11.7.diff, hbase4027v11.diff, hbase4027v12.1.diff, hbase4027v12.diff, slabcachepatch.diff, slabcachepatchv2.diff, slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, slabcachepatchv3.diff, slabcachepatchv4.5.diff, slabcachepatchv4.diff Java offers the creation of direct byte buffers which are allocated outside of the heap. They need to be manually free'd, which can be accomplished using an documented {{clean}} method. The feature will be optional. After implementing, we can benchmark for differences in speed and garbage collection observances. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
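For readers unfamiliar with direct byte buffers, here is a minimal sketch of the general idea behind a slab-style cache: allocate one off-heap buffer and carve it into fixed-size blocks. It is not taken from the patch under review. The class name DirectSlabSketch and the method carve are hypothetical, and the manual freeing mentioned in the issue is left out because it relies on JDK-internal API that differs between Java versions.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the SlabCache code from the patch.
final class DirectSlabSketch {

    /** Allocates one direct (off-heap) buffer and slices it into fixed-size blocks. */
    static List<ByteBuffer> carve(int blockSize, int numBlocks) {
        ByteBuffer slab = ByteBuffer.allocateDirect(blockSize * numBlocks);
        List<ByteBuffer> blocks = new ArrayList<>(numBlocks);
        for (int i = 0; i < numBlocks; i++) {
            // Set the limit before the position so position <= limit always holds.
            slab.limit((i + 1) * blockSize);
            slab.position(i * blockSize);
            // slice() shares the slab's memory but has its own position and limit.
            blocks.add(slab.slice());
        }
        return blocks;
    }

    public static void main(String[] args) {
        List<ByteBuffer> blocks = carve(64, 4);
        blocks.get(1).put(0, (byte) 42);
        System.out.println(blocks.size() + " blocks, direct=" + blocks.get(0).isDirect());
    }
}
```

Because each slice is a view over the same allocation, the garbage collector only ever sees a handful of small wrapper objects, which is the effect the issue hopes to benchmark.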
[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087518#comment-13087518 ] jirapos...@reviews.apache.org commented on HBASE-4027: -- bq. On 2011-08-19 04:10:41, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/io/hfile/Cacheable.java, line 49 bq. https://reviews.apache.org/r/1214/diff/15/?file=33671#file33671line49 bq. bq. If self is always returned, why do we need the return value here ? it returns a copy of itself, unless it doesn't need deserialization. - Li --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1214/#review1544 ---
[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)
[ https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087521#comment-13087521 ] Ted Yu commented on HBASE-4213: --- I think load balancer should be temporarily disabled at the beginning of instant schema change. At the end, balancer should be enabled. We should perform some testing on a table which has over 400 regions to see how long it takes for a cluster under normal load. If it is not too long, disabling balancer should be fine. Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign) Key: HBASE-4213 URL: https://issues.apache.org/jira/browse/HBASE-4213 Project: HBase Issue Type: Improvement Reporter: Subbu M Iyer Assignee: Subbu M Iyer Fix For: 0.92.0 Attachments: HBASE-4213-Instant_schema_change.patch, HBASE-4213_Instant_schema_change_-Version_2_.patch This Jira is a slight variation in approach to what is being done as part of https://issues.apache.org/jira/browse/HBASE-1730 Support instant schema updates such as Modify Table, Add Column, Modify Column operations: 1. With out enable/disabling the table. 2. With out bulk unassign/assign of regions. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)
[ https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087523#comment-13087523 ] Subbu M Iyer commented on HBASE-4213: - Yup. sure that makes perfect sense. Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign) Key: HBASE-4213 URL: https://issues.apache.org/jira/browse/HBASE-4213 Project: HBase Issue Type: Improvement Reporter: Subbu M Iyer Assignee: Subbu M Iyer Fix For: 0.92.0 Attachments: HBASE-4213-Instant_schema_change.patch, HBASE-4213_Instant_schema_change_-Version_2_.patch This Jira is a slight variation in approach to what is being done as part of https://issues.apache.org/jira/browse/HBASE-1730 Support instant schema updates such as Modify Table, Add Column, Modify Column operations: 1. With out enable/disabling the table. 2. With out bulk unassign/assign of regions. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4202) Check filesystem permissions on startup
[ https://issues.apache.org/jira/browse/HBASE-4202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4202: - Priority: Trivial (was: Major) Thanks for doing the check Ram. I'm marking this a 'trivial' bug since it seems it is only an issue in old 0.20.x hbase, fixed (it looks like) in 0.90.x. Check filesystem permissions on startup --- Key: HBASE-4202 URL: https://issues.apache.org/jira/browse/HBASE-4202 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.20.4 Environment: debian squeeze Reporter: Matthias Hofschen Assignee: ramkrishna.s.vasudevan Priority: Trivial Labels: noob We added a new node to a 44 node cluster, starting the datanode, mapred and regionserver processes on it. The Unix filesystem was configured incorrectly, i.e. /tmp was not writable by the processes. All three processes had issues with this. Datanode and mapred shut down on the exception. Regionserver did not stop; in fact it reported to the master that it was up without regions. So the master assigned regions to it. Regionserver would not accept them, resulting in a constant assign, reject, reassign cycle that put many regions into a state of not being available. There are no logs about this, but we could observe the region count fluctuate by hundreds of regions and the application throwing many NotServingRegion exceptions. To the master process the regionserver looked fine, so it kept trying to send regions its way, and the regionserver kept rejecting them. The master/balancer thus went into an assign/reassign cycle that destabilized the cluster. Many puts and gets simply failed with NotServingRegionExceptions and took a long time to complete. 
Exception from regionserver: 2011-08-06 23:57:13,953 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event, state: SyncConnected, type: NodeCreated, path: /hbase/master 2011-08-06 23:57:13,957 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 17.1.0.1:6 that we are up 2011-08-06 23:57:13,957 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 17.1.0.1:6 that we are up 2011-08-07 00:07:39.648::INFO: Logging to STDERR via org.mortbay.log.StdErrLog 2011-08-07 00:07:39.712::INFO: jetty-6.1.14 2011-08-07 00:07:39.742::WARN: tmpdir java.io.IOException: Permission denied at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.checkAndCreate(File.java:1704) at java.io.File.createTempFile(File.java:1792) at java.io.File.createTempFile(File.java:1828) at org.mortbay.jetty.webapp.WebAppContext.getTempDirectory(WebAppContext.java:745) at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:458) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50) at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152) at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50) at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130) at org.mortbay.jetty.Server.doStart(Server.java:222) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50) at org.apache.hadoop.http.HttpServer.start(HttpServer.java:461) at org.apache.hadoop.hbase.regionserver.HRegionServer.startServiceThreads(HRegionServer.java:1168) at org.apache.hadoop.hbase.regionserver.HRegionServer.init(HRegionServer.java:792) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:430) at java.lang.Thread.run(Thread.java:619) Exception from datanode: 2011-08-06 23:37:20,444 INFO org.apache.hadoop.http.HttpServer: 
Jetty bound to port 50075 2011-08-06 23:37:20,444 INFO org.mortbay.log: jetty-6.1.14 2011-08-06 23:37:20,469 WARN org.mortbay.log: tmpdir java.io.IOException: Permission denied at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.checkAndCreate(File.java:1704) at java.io.File.createTempFile(File.java:1792) at java.io.File.createTempFile(File.java:1828) at org.mortbay.jetty.webapp.WebAppContext.getTempDirectory(WebAppContext.java:745) at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:458) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50) at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152) at
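The startup check this issue asks for could look roughly like the sketch below. It is an illustration, not the fix that went into HBase: the class name TmpDirCheckSketch and the method checkWritable are hypothetical. It probes the directory with File.createTempFile, the same call that fails with "Permission denied" in the stack traces above, so the process can abort before announcing itself to the master.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical illustration only; not the actual HBase startup code.
final class TmpDirCheckSketch {

    /** Throws IOException if a temporary file cannot be created in the given directory. */
    static void checkWritable(File dir) throws IOException {
        if (!dir.isDirectory()) {
            throw new IOException("Not a directory: " + dir);
        }
        // createTempFile throws IOException when the directory is not writable.
        File probe = File.createTempFile("startup-check", ".tmp", dir);
        if (!probe.delete()) {
            probe.deleteOnExit();
        }
    }

    public static void main(String[] args) {
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        try {
            checkWritable(tmp);
            System.out.println("tmp dir is writable: " + tmp);
        } catch (IOException e) {
            // Fail fast instead of reporting "up" with a broken filesystem.
            System.err.println("Aborting startup: " + e.getMessage());
            System.exit(1);
        }
    }
}
```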
[jira] [Commented] (HBASE-4228) Add a method to get a list of HLog files for a RS.
[ https://issues.apache.org/jira/browse/HBASE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087526#comment-13087526 ] stack commented on HBASE-4228: -- Do you have a patch Madhuwanti? (Why you need this) Add a method to get a list of HLog files for a RS. -- Key: HBASE-4228 URL: https://issues.apache.org/jira/browse/HBASE-4228 Project: HBase Issue Type: New Feature Components: regionserver Reporter: Madhuwanti Vaidya Assignee: Madhuwanti Vaidya Priority: Trivial -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087528#comment-13087528 ] stack commented on HBASE-3845: -- @Gaojinchao Do you need this on branch? data loss because lastSeqWritten can miss memstore edits Key: HBASE-3845 URL: https://issues.apache.org/jira/browse/HBASE-3845 Project: HBase Issue Type: Bug Affects Versions: 0.90.3 Reporter: Prakash Khemani Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.92.0 Attachments: 0001-HBASE-3845-data-loss-because-lastSeqWritten-can-miss.patch, HBASE-3845-fix-TestResettingCounters-test.txt, HBASE-3845_1.patch, HBASE-3845_2.patch, HBASE-3845_4.patch, HBASE-3845_5.patch, HBASE-3845_6.patch, HBASE-3845__trunk.patch, HBASE-3845_trunk_2.patch, HBASE-3845_trunk_3.patch (I don't have a test case to prove this yet but I have run it by Dhruba and Kannan internally and wanted to put this up for some feedback.) In this discussion let us assume that the region has only one column family. That way I can use region/memstore interchangeably. After a memstore flush it is possible for lastSeqWritten to have a log-sequence-id for a region that is not the earliest log-sequence-id for that region's memstore. HLog.append() does a putIfAbsent into lastSeqWritten. This is to ensure that we only keep track of the earliest log-sequence-number that is present in the memstore. Every time the memstore is flushed we remove the region's entry in lastSeqWritten and wait for the next append to populate this entry again. This is where the problem happens.
step 1: flusher.prepare() snapshots the memstore under HRegion.updatesLock.writeLock().
step 2: as soon as the updatesLock.writeLock() is released, new entries will be added into the memstore.
step 3: wal.completeCacheFlush() is called. This method removes the region's entry from lastSeqWritten.
step 4: the next append will create a new entry for the region in lastSeqWritten. But this will be the log seq id of the current append. All the edits that were added in step 2 are missing.
As a temporary measure, instead of removing the region's entry in step 3 I will replace it with the log-seq-id of the region-flush-event. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
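The four steps can be replayed deterministically in a small, single-threaded model. The sketch below is a simplification written for this discussion, not HLog itself: the class LastSeqWrittenSketch, its methods and the region name "r1" are hypothetical, a plain ConcurrentHashMap stands in for the real map, and the flush event is assumed to take its own sequence id when the flush starts. It compares removing the entry in step 3 with the proposed temporary measure of replacing it with the flush sequence id.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical, simplified model of the bookkeeping described in the issue.
final class LastSeqWrittenSketch {
    // region name -> earliest sequence id believed to be still unflushed
    private final ConcurrentMap<String, Long> lastSeqWritten = new ConcurrentHashMap<>();
    private long nextSeqId = 1;

    /** Appends an edit; only the first append after an entry is cleared records its id. */
    long append(String region) {
        long seq = nextSeqId++;
        lastSeqWritten.putIfAbsent(region, seq);
        return seq;
    }

    /** Assumption: the flush event obtains its own sequence id when the flush starts. */
    long startCacheFlush() {
        return nextSeqId++;
    }

    void completeCacheFlush(String region, long flushSeqId, boolean workaround) {
        if (workaround) {
            lastSeqWritten.put(region, flushSeqId);   // proposed temporary measure
        } else {
            lastSeqWritten.remove(region);            // behaviour described as buggy
        }
    }

    /** Returns { first unflushed edit id, id tracked in lastSeqWritten }. */
    static long[] replay(boolean workaround) {
        LastSeqWrittenSketch wal = new LastSeqWrittenSketch();
        String region = "r1";
        wal.append(region);                                      // seq 1, will be flushed
        long flushSeqId = wal.startCacheFlush();                 // step 1: snapshot, seq 2
        long firstUnflushed = wal.append(region);                // step 2: seq 3, new memstore
        wal.completeCacheFlush(region, flushSeqId, workaround);  // step 3
        wal.append(region);                                      // step 4: seq 4
        return new long[] { firstUnflushed, wal.lastSeqWritten.get(region) };
    }

    public static void main(String[] args) {
        long[] buggy = replay(false);
        long[] patched = replay(true);
        System.out.println("remove:  first unflushed=" + buggy[0] + ", tracked=" + buggy[1]);
        System.out.println("replace: first unflushed=" + patched[0] + ", tracked=" + patched[1]);
    }
}
```

With removal, the tracked id ends up later than the first unflushed edit, so a log holding that edit could be considered safe to discard. With replacement, the tracked id stays at or before every unflushed edit.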
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087530#comment-13087530 ] Jieshan Bean commented on HBASE-3845: - Yes, we need this patch on branch:)
[jira] [Commented] (HBASE-4226) HFileBlock.java style cleanup.
[ https://issues.apache.org/jira/browse/HBASE-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087533#comment-13087533 ] stack commented on HBASE-4226: -- @Li Whoever told you it was a good idea to make stylistic changes to the code base? Don't you know that making style changes is like poking a hornets' nest with a big old stick; even the old codger wasps are going to come out w/ stingers deployed. In below change:
{code}
-throw new IOException("Block type stored in the buffer: " +
-blockTypeFromBuf + ", block type field: " + blockType);
+throw new IOException("Block type stored in the buffer: "
++ blockTypeFromBuf + ", block type field: " + blockType);
{code}
... i personally prefer the former where there is a hanging '+' on the end of the line. The hanging plus indicates a line continued (w/o the '+' I find you need to do a bit more work to figure next line is a continuation). Here, stylistically, I like that the second parameter is complete on the second line rather than broken across lines:
{code}
- Preconditions.checkState(state != State.INIT,
- "Unexpected state: " + state);
+ Preconditions.checkState(state != State.INIT, "Unexpected state: "
+ + state);
{code}
So, I agree w/ about 2/3rds of this patch. (My guess is that if you fix the above so I like it, Ted Yu is going to show up next and disagree w/ a different third of the changes -- smile) HFileBlock.java style cleanup. -- Key: HBASE-4226 URL: https://issues.apache.org/jira/browse/HBASE-4226 Project: HBase Issue Type: Improvement Reporter: Li Pi Assignee: Li Pi Priority: Trivial Attachments: hbase-4226.diff, hbase-4226v2.diff Just a simple style cleanup of HFileBlock.java. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087534#comment-13087534 ] stack commented on HBASE-3845: -- Ok. I wrote Gao to suggest he figure out what was finally applied to branch here, make a version of it for 0.90, test it, and apply the file here. I'll commit it then.
[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)
[ https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087538#comment-13087538 ] stack commented on HBASE-4218: -- /me hearts this issue Delta Encoding of KeyValues (aka prefix compression) - Key: HBASE-4218 URL: https://issues.apache.org/jira/browse/HBASE-4218 Project: HBase Issue Type: Improvement Components: io Reporter: Jacek Migdal Labels: compression A compression for keys. Keys are sorted in HFile and they are usually very similar. Because of that, it is possible to design better compression than general purpose algorithms. It is an additional step designed to be used in memory. It aims to save memory in cache as well as to speed up seeks within HFileBlocks. It should improve performance a lot if key lengths are larger than value lengths. For example, it makes a lot of sense to use it when the value is a counter. Initial tests on real data (key length = ~ 90 bytes, value length = 8 bytes) show that I could achieve a decent level of compression:
key compression ratio: 92%
total compression ratio: 85%
LZO on the same data: 85%
LZO after delta encoding: 91%
while having much better performance (20-80% faster decompression than LZO). Moreover, it should allow far more efficient seeking, which should improve performance a bit. It seems that simple compression algorithms are good enough. Most of the savings are due to prefix compression, int128 encoding, timestamp diffs and bitfields to avoid duplication. That way, comparisons of compressed data can be much faster than a byte comparator (thanks to prefix compression and bitfields). In order to implement it in HBase, two important changes in design will be needed:
- solidify the interface to HFileBlock / HFileReader Scanner to provide seeking and iterating; access to the uncompressed buffer in HFileBlock will have bad performance
- extend comparators to support comparison assuming that the first N bytes are equal (or some fields are equal)
Link to a discussion about something similar: http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
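To make the prefix-compression part of the proposal concrete, here is a minimal sketch of the technique on sorted byte-array keys: each key is stored as the number of leading bytes it shares with the previous key plus the remaining suffix. It covers prefix sharing only, not the timestamp diffs or bitfields mentioned above, and it is not the encoder proposed in the issue. The class PrefixEncodingSketch and the sample keys are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of prefix (delta) encoding over sorted keys.
final class PrefixEncodingSketch {

    /** One encoded key: count of bytes shared with the previous key, plus the differing suffix. */
    static final class Entry {
        final int shared;
        final byte[] suffix;

        Entry(int shared, byte[] suffix) {
            this.shared = shared;
            this.suffix = suffix;
        }
    }

    static List<Entry> encode(List<byte[]> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (byte[] key : sortedKeys) {
            int max = Math.min(prev.length, key.length);
            int shared = 0;
            while (shared < max && prev[shared] == key[shared]) {
                shared++;
            }
            out.add(new Entry(shared, Arrays.copyOfRange(key, shared, key.length)));
            prev = key;
        }
        return out;
    }

    static List<byte[]> decode(List<Entry> entries) {
        List<byte[]> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (Entry e : entries) {
            // Rebuild the key from the previous key's prefix and this entry's suffix.
            byte[] key = new byte[e.shared + e.suffix.length];
            System.arraycopy(prev, 0, key, 0, e.shared);
            System.arraycopy(e.suffix, 0, key, e.shared, e.suffix.length);
            out.add(key);
            prev = key;
        }
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> keys = new ArrayList<>();
        for (String s : new String[] { "row0001/cf:a", "row0001/cf:b", "row0002/cf:a" }) {
            keys.add(s.getBytes(StandardCharsets.UTF_8));
        }
        for (Entry e : encode(keys)) {
            System.out.println(e.shared + " shared, suffix "
                + new String(e.suffix, StandardCharsets.UTF_8));
        }
    }
}
```

The shared-prefix count is also what would let a comparator skip bytes already known to be equal, which is the second design change listed in the issue.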
[jira] [Commented] (HBASE-4226) HFileBlock.java style cleanup.
[ https://issues.apache.org/jira/browse/HBASE-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087539#comment-13087539 ] Li Pi commented on HBASE-4226: -- I'm gonna cut this patch down to the single bit about brackets. I think that, we can all agree, is bad. Or I might just sneak fixes into 4027. HFileBlock.java style cleanup. -- Key: HBASE-4226 URL: https://issues.apache.org/jira/browse/HBASE-4226 Project: HBase Issue Type: Improvement Reporter: Li Pi Assignee: Li Pi Priority: Trivial Attachments: hbase-4226.diff, hbase-4226v2.diff Just a simple style cleanup of HFileBlock.java. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4221) Changes necessary to build and run against Hadoop 0.23
[ https://issues.apache.org/jira/browse/HBASE-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087540#comment-13087540 ] stack commented on HBASE-4221: -- +1 on patch. I like the profiles fancy-dancing. Why do we have to have VersionedProtocol local? (Should you add a comment to the class on commit saying it's copied from our mother?) Changes necessary to build and run against Hadoop 0.23 -- Key: HBASE-4221 URL: https://issues.apache.org/jira/browse/HBASE-4221 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.92.0 Attachments: hbase-4221.txt A few modifications necessary to run against today's trunk: - copy-paste VersionedProtocol into the hbase IPC package - upgrade protobufs to 2.4.0a - fix one of the tests in TestHFileOutputFormat for new TaskAttemptContext API - remove illegal accesses to private members of FSNamesystem in tests (use reflection) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
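For the last item in the list ("use reflection"), the general pattern for reading a private member from test code looks like the sketch below. It says nothing about FSNamesystem itself: the nested Namesystem class and its pendingBlocks field are stand-ins invented for the example, and readPrivateField is a hypothetical helper.

```java
import java.lang.reflect.Field;

// Hypothetical illustration; Namesystem is a stand-in, not the Hadoop class.
final class ReflectionAccessSketch {

    /** Stand-in for a class whose private state a test wants to inspect. */
    static final class Namesystem {
        private int pendingBlocks = 3;
    }

    /** Reads a private field declared directly on the target's class. */
    static Object readPrivateField(Object target, String name)
            throws ReflectiveOperationException {
        Field field = target.getClass().getDeclaredField(name);
        field.setAccessible(true);   // bypass the private modifier for this Field object
        return field.get(target);
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Object value = readPrivateField(new Namesystem(), "pendingBlocks");
        System.out.println("pendingBlocks = " + value);
    }
}
```

Looking the member up by name at run time means the test compiles against any version of the class, at the cost of failing only when it runs if the field is renamed.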
[jira] [Updated] (HBASE-4221) Changes necessary to build and run against Hadoop 0.23
[ https://issues.apache.org/jira/browse/HBASE-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4221: - Priority: Critical (was: Major) Changes necessary to build and run against Hadoop 0.23 -- Key: HBASE-4221 URL: https://issues.apache.org/jira/browse/HBASE-4221 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.92.0 Attachments: hbase-4221.txt A few modifications necessary to run against today's trunk: - copy-paste VersionedProtocol into the hbase IPC package - upgrade protobufs to 2.4.0a - fix one of the tests in TestHFileOutputFormat for new TaskAttemptContext API - remove illegal accesses to private members of FSNamesystem in tests (use reflection) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira