[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216370#comment-13216370 ] Max Lapan commented on HBASE-5416: -- @all Thanks for a discussion, I'll benchmark 2-phase approach, maybe it's a solution indeed. One thing still is not clear for me: how the batching factor could improve gets performance? The get request is synchronous, isn't it? So, in mapper, I issue get to obtain value from large column, and wait for it to be ready. In fact, single get won't be significantly havier than seek in scanner, but batching seems no help there. In fact, could be wrong there, didn't experiment much with that. Improve performance of scans with some kind of filters. --- Key: HBASE-5416 URL: https://issues.apache.org/jira/browse/HBASE-5416 Project: HBase Issue Type: Improvement Components: filters, performance, regionserver Affects Versions: 0.90.4 Reporter: Max Lapan Assignee: Max Lapan Attachments: 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch When the scan is performed, whole row is loaded into result list, after that filter (if exists) is applied to detect that row is needed. But when scan is performed on several CFs and filter checks only data from the subset of these CFs, data from CFs, not checked by a filter is not needed on a filter stage. Only when we decided to include current row. And in such case we can significantly reduce amount of IO performed by a scan, by loading only values, actually checked by a filter. For example, we have two CFs: flags and snap. Flags is quite small (bunch of megabytes) and is used to filter large entries from snap. Snap is very large (10s of GB) and it is quite costly to scan it. If we needed only rows with some flag specified, we use SingleColumnValueFilter to limit result to only small subset of region. But current implementation is loading both CFs to perform scan, when only small subset is needed. Attached patch adds one routine to Filter interface to allow filter to specify which CF is needed to it's operation. In HRegion, we separate all scanners into two groups: needed for filter and the rest (joined). When new row is considered, only needed data is loaded, filter applied, and only if filter accepts the row, rest of data is loaded. At our data, this speeds up such kind of scans 30-50 times. Also, this gives us the way to better normalize the data into separate columns by optimizing the scans performed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216355#comment-13216355 ] Zhihong Yu edited comment on HBASE-5416 at 2/25/12 3:09 PM: Atomcity can be achieved by applying the filter set twice. I agree with Mikhail that we need to have good code quality and decent unit test coverage. Complexity in the critical path might be a concern. Performance might be another as certain use cases benefit from the approach while others don't. Thus, we could consider execution plan from the relational database world for SQL tuning. Once the data is in the table, tune the execution plan (which way to go) against particular use case(s). Just my $0.02. was (Author: thomaspan): Atomcity can be achieved by applying the filter set twice. I agree with Mikhail that we need to have good code quality and decent unit test coverage. Complexity in the critical path might be a concern. Performance might be another as certain use cases benefit from the approach while others don't. Thus, we could consider execution plan from the rational database world for SQL tuning. Once the data is in the table, tune the execution plan (which way to go) against particular use case(s). Just my $0.02. Improve performance of scans with some kind of filters. --- Key: HBASE-5416 URL: https://issues.apache.org/jira/browse/HBASE-5416 Project: HBase Issue Type: Improvement Components: filters, performance, regionserver Affects Versions: 0.90.4 Reporter: Max Lapan Assignee: Max Lapan Attachments: 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch When the scan is performed, whole row is loaded into result list, after that filter (if exists) is applied to detect that row is needed. But when scan is performed on several CFs and filter checks only data from the subset of these CFs, data from CFs, not checked by a filter is not needed on a filter stage. Only when we decided to include current row. And in such case we can significantly reduce amount of IO performed by a scan, by loading only values, actually checked by a filter. For example, we have two CFs: flags and snap. Flags is quite small (bunch of megabytes) and is used to filter large entries from snap. Snap is very large (10s of GB) and it is quite costly to scan it. If we needed only rows with some flag specified, we use SingleColumnValueFilter to limit result to only small subset of region. But current implementation is loading both CFs to perform scan, when only small subset is needed. Attached patch adds one routine to Filter interface to allow filter to specify which CF is needed to it's operation. In HRegion, we separate all scanners into two groups: needed for filter and the rest (joined). When new row is considered, only needed data is loaded, filter applied, and only if filter accepts the row, rest of data is loaded. At our data, this speeds up such kind of scans 30-50 times. Also, this gives us the way to better normalize the data into separate columns by optimizing the scans performed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-5478) TestMetaReaderEditor intermittently hangs in 0.92 branch
TestMetaReaderEditor intermittently hangs in 0.92 branch Key: HBASE-5478 URL: https://issues.apache.org/jira/browse/HBASE-5478 Project: HBase Issue Type: Bug Affects Versions: 0.92.1 Reporter: Zhihong Yu From 0.92 build #304: {code} Running org.apache.hadoop.hbase.catalog.TestMetaReaderEditor Running org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 149.104 sec {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4890) fix possible NPE in HConnectionManager
[ https://issues.apache.org/jira/browse/HBASE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216515#comment-13216515 ] Jean-Daniel Cryans commented on HBASE-4890: --- Very unlikely, 5336's NPE is coming from Hadoop land instead of our IPC and to me it looks like that file was already closed. fix possible NPE in HConnectionManager -- Key: HBASE-4890 URL: https://issues.apache.org/jira/browse/HBASE-4890 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.92.1 I was running YCSB against a 0.92 branch and encountered this error message: {code} 11/11/29 08:47:16 WARN client.HConnectionManager$HConnectionImplementation: Failed all from region=usertable,user3917479014967760871,1322555655231.f78d161e5724495a9723bcd972f97f41., hostname=c0316.hal.cloudera.com, port=57020 java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.NullPointerException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1501) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1353) at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:898) at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:775) at org.apache.hadoop.hbase.client.HTable.put(HTable.java:750) at com.yahoo.ycsb.db.HBaseClient.update(Unknown Source) at com.yahoo.ycsb.DBWrapper.update(Unknown Source) at com.yahoo.ycsb.workloads.CoreWorkload.doTransactionUpdate(Unknown Source) at com.yahoo.ycsb.workloads.CoreWorkload.doTransaction(Unknown Source) at com.yahoo.ycsb.ClientThread.run(Unknown Source) Caused by: java.lang.RuntimeException: java.lang.NullPointerException at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithoutRetries(HConnectionManager.java:1315) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1327) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1325) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:158) at $Proxy4.multi(Unknown Source) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1330) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1328) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithoutRetries(HConnectionManager.java:1309) ... 7 more {code} It looks like the NPE is caused by server being null in the MultiRespone call() method. {code} public MultiResponse call() throws IOException { return getRegionServerWithoutRetries( new ServerCallableMultiResponse(connection, tableName, null) { public MultiResponse call() throws IOException { return server.multi(multi); } @Override public void connect(boolean reload) throws IOException { server = connection.getHRegionConnection(loc.getHostname(), loc.getPort()); } } ); {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5442) Use builder pattern in StoreFile and HFile
[ https://issues.apache.org/jira/browse/HBASE-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5442: --- Attachment: D1941.3.patch mbautin updated the revision [jira] [HBASE-5442] [89-fb] Use builder pattern in StoreFile and HFile. Reviewers: JIRA, khemani, Kannan, Liyin, Karthik, nspiegelberg Addressing Kannan's comment. Putting withFavoredNodes and getFavoredNodes back in. The getFavoredNode invocation was removed in https://reviews.facebook.net/rHBASEEIGHTNINEFBBRANCH1239905#fcf5a7a4 by mistake. REVISION DETAIL https://reviews.facebook.net/D1941 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java src/main/java/org/apache/hadoop/hbase/regionserver/Store.java src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java src/main/java/org/apache/hadoop/hbase/util/CompressionTest.java src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestReseekTo.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java Use builder pattern in StoreFile and HFile -- Key: HBASE-5442 URL: https://issues.apache.org/jira/browse/HBASE-5442 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Fix For: 0.94.0 Attachments: D1893.1.patch, D1893.2.patch, D1941.1.patch, D1941.2.patch, D1941.3.patch, HFile-StoreFile-builder-2012-02-22_22_49_00.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses StoreFile and HFile refactoring. For HColumnDescriptor refactoring see HBASE-5357. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5442) Use builder pattern in StoreFile and HFile
[ https://issues.apache.org/jira/browse/HBASE-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5442: --- Attachment: D1941.4.patch mbautin updated the revision [jira] [HBASE-5442] [89-fb] Use builder pattern in StoreFile and HFile. Reviewers: JIRA, khemani, Kannan, Liyin, Karthik, nspiegelberg Removing unused imports in HBaseTestingUtility REVISION DETAIL https://reviews.facebook.net/D1941 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java src/main/java/org/apache/hadoop/hbase/regionserver/Store.java src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java src/main/java/org/apache/hadoop/hbase/util/CompressionTest.java src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestReseekTo.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java Use builder pattern in StoreFile and HFile -- Key: HBASE-5442 URL: https://issues.apache.org/jira/browse/HBASE-5442 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Fix For: 0.94.0 Attachments: D1893.1.patch, D1893.2.patch, D1941.1.patch, D1941.2.patch, D1941.3.patch, D1941.4.patch, HFile-StoreFile-builder-2012-02-22_22_49_00.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses StoreFile and HFile refactoring. For HColumnDescriptor refactoring see HBASE-5357. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-2462) Review compaction heuristic and move compaction code out so standalone and independently testable
[ https://issues.apache.org/jira/browse/HBASE-2462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216581#comment-13216581 ] stack commented on HBASE-2462: -- It likes a bunch of this has made it in but I don't see the standalone compactions part nor the simulator. I'm taking a look at salvaging these latter two aspects from this patch and at least making it so we have standalone compactions (I want to look at compactions in isolation to see if we can make them run faster; we also need to work on making it so we do less of them but thats other issues). Review compaction heuristic and move compaction code out so standalone and independently testable - Key: HBASE-2462 URL: https://issues.apache.org/jira/browse/HBASE-2462 Project: HBase Issue Type: Improvement Components: performance Reporter: stack Assignee: Jonathan Gray Priority: Critical Labels: moved_from_0_20_5 Anything that improves our i/o profile makes hbase run smoother. Over in HBASE-2457, good work has been done already describing the tension between minimizing compactions versus minimizing count of store files. This issue is about following on from what has been done in 2457 but also, breaking the hard-to-read compaction code out of Store.java out to a standalone class that can be the easier tested (and easily analyzed for its performance characteristics). If possible, in the refactor, we'd allow specification of alternate merge sort implementations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5075) regionserver crashed and failover
[ https://issues.apache.org/jira/browse/HBASE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216584#comment-13216584 ] Lars Hofhansl commented on HBASE-5075: -- @zhiyuan.dai: What do you think? regionserver crashed and failover - Key: HBASE-5075 URL: https://issues.apache.org/jira/browse/HBASE-5075 Project: HBase Issue Type: Improvement Components: monitoring, regionserver, replication, zookeeper Affects Versions: 0.92.1 Reporter: zhiyuan.dai Fix For: 0.90.5 Attachments: Degion of Failure Detection.pdf, HBase-5075-shell.patch, HBase-5075-src.patch regionserver crashed,it is too long time to notify hmaster.when hmaster know regionserver's shutdown,it is long time to fetch the hlog's lease. hbase is a online db, availability is very important. i have a idea to improve availability, monitor node to check regionserver's pid.if this pid not exsits,i think the rs down,i will delete the znode,and force close the hlog file. so the period maybe 100ms. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5075) regionserver crashed and failover
[ https://issues.apache.org/jira/browse/HBASE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216589#comment-13216589 ] Lars Hofhansl commented on HBASE-5075: -- On 2nd thought. The ephemeral node can only be deleted as long as the ZK connection is active, which is by no means guaranteed for any shutdown hook, also not sure about causing network IO from a shutdown hook. Looking at the HRegion.run() it looks like we pretty much in all cases reach the point where deleteMyEphemeralNode is called. Hmm... regionserver crashed and failover - Key: HBASE-5075 URL: https://issues.apache.org/jira/browse/HBASE-5075 Project: HBase Issue Type: Improvement Components: monitoring, regionserver, replication, zookeeper Affects Versions: 0.92.1 Reporter: zhiyuan.dai Fix For: 0.90.5 Attachments: Degion of Failure Detection.pdf, HBase-5075-shell.patch, HBase-5075-src.patch regionserver crashed,it is too long time to notify hmaster.when hmaster know regionserver's shutdown,it is long time to fetch the hlog's lease. hbase is a online db, availability is very important. i have a idea to improve availability, monitor node to check regionserver's pid.if this pid not exsits,i think the rs down,i will delete the znode,and force close the hlog file. so the period maybe 100ms. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-5479) Postpone CompactionSelection to compaction execution time
Postpone CompactionSelection to compaction execution time - Key: HBASE-5479 URL: https://issues.apache.org/jira/browse/HBASE-5479 Project: HBase Issue Type: New Feature Components: io, performance, regionserver Reporter: Matt Corgan It can be commonplace for regionservers to develop long compaction queues, meaning a CompactionRequest may execute hours after it was created. The CompactionRequest holds a CompactionSelection that was selected at request time but may no longer be the optimal selection. The CompactionSelection should be created at compaction execution time rather than compaction request time. The current mechanism breaks down during high volume insertion. The inefficiency is clearest when the inserts are finished. Inserting for 5 hours may build up 50 storefiles and a 40 element compaction queue. When finished inserting, you would prefer that the next compaction merges all 50 files (or some large subset), but the current system will churn through each of the 40 compaction requests, the first of which may be hours old. This ends up re-compacting the same data many times. The current system is especially inefficient when dealing with time series data where the data in the storefiles has minimal overlap. With time series data, there is even less benefit to intermediate merges because most storefiles can be eliminated based on their key range during a read, even without bloomfilters. The only goal should be to reduce file count, not to minimize number of files merged for each read. There are other aspects to the current queuing mechanism that would need to be looked at. You would want to avoid having the same Store in the queue multiple times. And you would want the completion of one compaction to possibly queue another compaction request for the store. A alternative architecture to the current style of queues would be to have each Store (all open in memory) keep a compactionPriority score up to date after events like flushes, compactions, schema changes, etc. Then you create a CompactionPriorityComparator implements ComparatorStore and stick all the Stores into a PriorityQueue (synchronized remove/add from the queue when the value changes). The async compaction threads would keep pulling off the head of that queue as long as the head has compactionPriority X. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5479) Postpone CompactionSelection to compaction execution time
[ https://issues.apache.org/jira/browse/HBASE-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216608#comment-13216608 ] stack commented on HBASE-5479: -- Todd suggests something like a scoring over here Matt: https://issues.apache.org/jira/browse/HBASE-2457?focusedCommentId=12857705page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12857705 Lets verify that we do indeed do selection at queuing time. Thats my suspicion. If thats the case, for sure needs fixing. Thanks for filing this one Matt. Postpone CompactionSelection to compaction execution time - Key: HBASE-5479 URL: https://issues.apache.org/jira/browse/HBASE-5479 Project: HBase Issue Type: New Feature Components: io, performance, regionserver Reporter: Matt Corgan It can be commonplace for regionservers to develop long compaction queues, meaning a CompactionRequest may execute hours after it was created. The CompactionRequest holds a CompactionSelection that was selected at request time but may no longer be the optimal selection. The CompactionSelection should be created at compaction execution time rather than compaction request time. The current mechanism breaks down during high volume insertion. The inefficiency is clearest when the inserts are finished. Inserting for 5 hours may build up 50 storefiles and a 40 element compaction queue. When finished inserting, you would prefer that the next compaction merges all 50 files (or some large subset), but the current system will churn through each of the 40 compaction requests, the first of which may be hours old. This ends up re-compacting the same data many times. The current system is especially inefficient when dealing with time series data where the data in the storefiles has minimal overlap. With time series data, there is even less benefit to intermediate merges because most storefiles can be eliminated based on their key range during a read, even without bloomfilters. The only goal should be to reduce file count, not to minimize number of files merged for each read. There are other aspects to the current queuing mechanism that would need to be looked at. You would want to avoid having the same Store in the queue multiple times. And you would want the completion of one compaction to possibly queue another compaction request for the store. A alternative architecture to the current style of queues would be to have each Store (all open in memory) keep a compactionPriority score up to date after events like flushes, compactions, schema changes, etc. Then you create a CompactionPriorityComparator implements ComparatorStore and stick all the Stores into a PriorityQueue (synchronized remove/add from the queue when the value changes). The async compaction threads would keep pulling off the head of that queue as long as the head has compactionPriority X. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5075) regionserver crashed and failover
[ https://issues.apache.org/jira/browse/HBASE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216625#comment-13216625 ] Jesse Yates commented on HBASE-5075: Had the same concerns about the network IO and (I think) blocking call. However, with the shutdown hook, I think we can be _more sure_ that it runs, rather than putting it after the run method. Also, the hooks run in their own thread, so on shutdown, its not going to block regular shutdown or any other synchronous operations. Granted, this doesn't deal with the kill -9 or network partition situation, but if that happens, you have some big problems anyways and a minute (or whatever your zk timeout is) of blocking probably isn't a big deal ;) Also note, that in the latter case there, the daemon wouldn't be able to reach zk anyways to eliminate the node, so you are back to where you were before. regionserver crashed and failover - Key: HBASE-5075 URL: https://issues.apache.org/jira/browse/HBASE-5075 Project: HBase Issue Type: Improvement Components: monitoring, regionserver, replication, zookeeper Affects Versions: 0.92.1 Reporter: zhiyuan.dai Fix For: 0.90.5 Attachments: Degion of Failure Detection.pdf, HBase-5075-shell.patch, HBase-5075-src.patch regionserver crashed,it is too long time to notify hmaster.when hmaster know regionserver's shutdown,it is long time to fetch the hlog's lease. hbase is a online db, availability is very important. i have a idea to improve availability, monitor node to check regionserver's pid.if this pid not exsits,i think the rs down,i will delete the znode,and force close the hlog file. so the period maybe 100ms. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5479) Postpone CompactionSelection to compaction execution time
[ https://issues.apache.org/jira/browse/HBASE-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216639#comment-13216639 ] Zhihong Yu commented on HBASE-5479: --- I think compactionPriority score needs to be designed in such a way, when multiple column families are involved, that no single column family would consistently come off the head of PriorityQueue for extended period of time. Postpone CompactionSelection to compaction execution time - Key: HBASE-5479 URL: https://issues.apache.org/jira/browse/HBASE-5479 Project: HBase Issue Type: New Feature Components: io, performance, regionserver Reporter: Matt Corgan It can be commonplace for regionservers to develop long compaction queues, meaning a CompactionRequest may execute hours after it was created. The CompactionRequest holds a CompactionSelection that was selected at request time but may no longer be the optimal selection. The CompactionSelection should be created at compaction execution time rather than compaction request time. The current mechanism breaks down during high volume insertion. The inefficiency is clearest when the inserts are finished. Inserting for 5 hours may build up 50 storefiles and a 40 element compaction queue. When finished inserting, you would prefer that the next compaction merges all 50 files (or some large subset), but the current system will churn through each of the 40 compaction requests, the first of which may be hours old. This ends up re-compacting the same data many times. The current system is especially inefficient when dealing with time series data where the data in the storefiles has minimal overlap. With time series data, there is even less benefit to intermediate merges because most storefiles can be eliminated based on their key range during a read, even without bloomfilters. The only goal should be to reduce file count, not to minimize number of files merged for each read. There are other aspects to the current queuing mechanism that would need to be looked at. You would want to avoid having the same Store in the queue multiple times. And you would want the completion of one compaction to possibly queue another compaction request for the store. A alternative architecture to the current style of queues would be to have each Store (all open in memory) keep a compactionPriority score up to date after events like flushes, compactions, schema changes, etc. Then you create a CompactionPriorityComparator implements ComparatorStore and stick all the Stores into a PriorityQueue (synchronized remove/add from the queue when the value changes). The async compaction threads would keep pulling off the head of that queue as long as the head has compactionPriority X. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (HBASE-5479) Postpone CompactionSelection to compaction execution time
[ https://issues.apache.org/jira/browse/HBASE-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216639#comment-13216639 ] Zhihong Yu edited comment on HBASE-5479 at 2/26/12 5:46 AM: I think compactionPriority score needs to be designed in such a way, when multiple column families are involved, that no single column family would exclusively come off the head of PriorityQueue for extended period of time. was (Author: zhi...@ebaysf.com): I think compactionPriority score needs to be designed in such a way, when multiple column families are involved, that no single column family would consistently come off the head of PriorityQueue for extended period of time. Postpone CompactionSelection to compaction execution time - Key: HBASE-5479 URL: https://issues.apache.org/jira/browse/HBASE-5479 Project: HBase Issue Type: New Feature Components: io, performance, regionserver Reporter: Matt Corgan It can be commonplace for regionservers to develop long compaction queues, meaning a CompactionRequest may execute hours after it was created. The CompactionRequest holds a CompactionSelection that was selected at request time but may no longer be the optimal selection. The CompactionSelection should be created at compaction execution time rather than compaction request time. The current mechanism breaks down during high volume insertion. The inefficiency is clearest when the inserts are finished. Inserting for 5 hours may build up 50 storefiles and a 40 element compaction queue. When finished inserting, you would prefer that the next compaction merges all 50 files (or some large subset), but the current system will churn through each of the 40 compaction requests, the first of which may be hours old. This ends up re-compacting the same data many times. The current system is especially inefficient when dealing with time series data where the data in the storefiles has minimal overlap. With time series data, there is even less benefit to intermediate merges because most storefiles can be eliminated based on their key range during a read, even without bloomfilters. The only goal should be to reduce file count, not to minimize number of files merged for each read. There are other aspects to the current queuing mechanism that would need to be looked at. You would want to avoid having the same Store in the queue multiple times. And you would want the completion of one compaction to possibly queue another compaction request for the store. A alternative architecture to the current style of queues would be to have each Store (all open in memory) keep a compactionPriority score up to date after events like flushes, compactions, schema changes, etc. Then you create a CompactionPriorityComparator implements ComparatorStore and stick all the Stores into a PriorityQueue (synchronized remove/add from the queue when the value changes). The async compaction threads would keep pulling off the head of that queue as long as the head has compactionPriority X. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5478) TestMetaReaderEditor intermittently hangs in 0.92 branch
[ https://issues.apache.org/jira/browse/HBASE-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216655#comment-13216655 ] Zhihong Yu commented on HBASE-5478: --- From the log, we can see that testScanMetaForTable() wasn't able to create table: {code} 2012-02-25 01:51:47,262 FATAL [Thread-258] regionserver.HRegionServer(1544): ABORTING region server janus.apache.org,50277,1330134696463: Aborting for tests java.lang.Exception: Trace info at org.apache.hadoop.hbase.MiniHBaseCluster.abortRegionServer(MiniHBaseCluster.java:237) at org.apache.hadoop.hbase.catalog.TestMetaReaderEditor.testRetrying(TestMetaReaderEditor.java:138) ... 2012-02-25 01:58:17,615 INFO [reader] catalog.TestMetaReaderEditor$MetaTask(174): reader failed org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions: Sat Feb 25 01:51:47 UTC 2012, org.apache.hadoop.hbase.client.HTable$5@1ba92bb, org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server janus.apache.org,50277,1330134696463 not running, aborting at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2867) at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1781) at sun.reflect.GeneratedMethodAccessor34.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326) Sat Feb 25 01:52:27 UTC 2012, org.apache.hadoop.hbase.client.HTable$5@1ba92bb, org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to janus.apache.org/67.195.138.60:41354 after attempts=1 Sat Feb 25 01:53:07 UTC 2012, org.apache.hadoop.hbase.client.HTable$5@1ba92bb, org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to janus.apache.org/67.195.138.60:41354 after attempts=1 Sat Feb 25 01:53:47 UTC 2012, org.apache.hadoop.hbase.client.HTable$5@1ba92bb, org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to janus.apache.org/67.195.138.60:41354 after attempts=1 Sat Feb 25 01:54:28 UTC 2012, org.apache.hadoop.hbase.client.HTable$5@1ba92bb, org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to janus.apache.org/67.195.138.60:41354 after attempts=1 Sat Feb 25 01:55:09 UTC 2012, org.apache.hadoop.hbase.client.HTable$5@1ba92bb, org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to janus.apache.org/67.195.138.60:41354 after attempts=1 Sat Feb 25 01:55:52 UTC 2012, org.apache.hadoop.hbase.client.HTable$5@1ba92bb, org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to janus.apache.org/67.195.138.60:41354 after attempts=1 Sat Feb 25 01:56:35 UTC 2012, org.apache.hadoop.hbase.client.HTable$5@1ba92bb, org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to janus.apache.org/67.195.138.60:41354 after attempts=1 Sat Feb 25 01:57:22 UTC 2012, org.apache.hadoop.hbase.client.HTable$5@1ba92bb, org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to janus.apache.org/67.195.138.60:41354 after attempts=1 Sat Feb 25 01:58:17 UTC 2012, org.apache.hadoop.hbase.client.HTable$5@1ba92bb, org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to janus.apache.org/67.195.138.60:41354 after attempts=1 at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1345) at org.apache.hadoop.hbase.client.HTable.get(HTable.java:657) at org.apache.hadoop.hbase.catalog.MetaReader.get(MetaReader.java:247) at org.apache.hadoop.hbase.catalog.MetaReader.getRegion(MetaReader.java:349) at org.apache.hadoop.hbase.catalog.TestMetaReaderEditor.testGetRegion(TestMetaReaderEditor.java:286) at org.apache.hadoop.hbase.catalog.TestMetaReaderEditor.access$100(TestMetaReaderEditor.java:52) at