[jira] [Reopened] (HBASE-5792) HLog Performance Evaluation Tool
[ https://issues.apache.org/jira/browse/HBASE-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl reopened HBASE-5792:
----------------------------------

> HLog Performance Evaluation Tool
> --------------------------------
>
> Key: HBASE-5792
> URL: https://issues.apache.org/jira/browse/HBASE-5792
> Project: HBase
> Issue Type: Test
> Components: wal
> Reporter: Matteo Bertozzi
> Assignee: Matteo Bertozzi
> Priority: Minor
> Labels: performance, wal
> Fix For: 0.94.0, 0.96.0
>
> Attachments: HBASE-5792-v0.patch, HBASE-5792-v1.patch, HBASE-5792-v2.patch, verify.txt, verify.txt
>
>
> Related to HDFS-3280 and the HBase WAL slowdown on 0.23+.
> It would be nice to have a simple tool like HFilePerformanceEvaluation, ...
> to be able to easily check HLog performance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl reopened HBASE-5778:
----------------------------------

Assignee: Lars Hofhansl (was: Jean-Daniel Cryans)

> Turn on WAL compression by default
> ----------------------------------
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
> Issue Type: Improvement
> Reporter: Jean-Daniel Cryans
> Assignee: Lars Hofhansl
> Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: HBASE-5778.patch
>
>
> I ran some tests to verify whether WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude
> bigger than the keys), the insert time was no different and CPU usage was 15%
> higher (150% CPU usage vs. 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement in the insert
> run time, and CPU usage was 33% higher (600% CPU usage vs. 450%). I'm not sure
> WAL compression accounts for all the additional CPU usage; it might just be
> that we're able to insert faster and so spend more time in the MemStore per
> second (because our MemStores are bad when they contain tens of thousands of
> values).
> Those are two extremes, but they show that for the price of some CPU we can
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle
> CPUs.
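The CPU figures quoted in the HBASE-5778 description are relative increases over the uncompressed run, not percentage-point differences. A minimal sketch of that arithmetic, using only the numbers from the report (the class and method names are hypothetical and exist only for this illustration):

```java
// Hypothetical helper: recomputes the relative CPU overhead quoted in the report.
class CpuOverheadSketch {
    // Relative increase of 'measured' over 'baseline', in percent.
    static double relativeIncreasePercent(double baseline, double measured) {
        return (measured - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Large values: 130% CPU without WAL compression, 150% with it.
        System.out.printf("large values: %.1f%%%n", relativeIncreasePercent(130, 150));
        // Small values: 450% CPU without WAL compression, 600% with it.
        System.out.printf("small values: %.1f%%%n", relativeIncreasePercent(450, 600));
    }
}
```

Rounded to whole percents, the two cases come out at the 15% and 33% stated in the description.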
[jira] [Reopened] (HBASE-5569) Do not collect deleted KVs when they are still in use by a scanner.
[ https://issues.apache.org/jira/browse/HBASE-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl reopened HBASE-5569:
----------------------------------

> Do not collect deleted KVs when they are still in use by a scanner.
> --------------------------------------------------------------------
>
> Key: HBASE-5569
> URL: https://issues.apache.org/jira/browse/HBASE-5569
> Project: HBase
> Issue Type: Bug
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5569-v2.txt, 5569.txt, TestAtomicOperation-output.trunk_120313.rar
>
>
> I noticed this because TestAtomicOperation.testMultiRowMutationMultiThreads
> fails rarely.
> The solution is similar to HBASE-2856, where expired KVs are not collected
> when in use by a scanner.
> ---
> What I have pieced together so far is that it is the *scanning* side that
> sometimes has problems.
> Every time I see an assertion failure in the log, I see this before it:
> {quote}
> 2012-03-12 21:48:49,523 DEBUG [Thread-211] regionserver.StoreScanner(499):
> Storescanner.peek() is changed where before =
> rowB/colfamily11:qual1/75366/Put/vlen=6,and after =
> rowB/colfamily11:qual1/75203/DeleteColumn/vlen=0
> {quote}
> The order of the Put and Delete is sometimes reversed.
> The test threads should always see exactly one KV. If the "before" was the
> Put, the thread sees 0 KVs; if the "before" was the Delete, the thread sees 2
> KVs.
> This debug message comes from StoreScanner's checkReseek. It seems we still
> have some consistency issue with scanning sometimes :(
[jira] [Reopened] (HBASE-4542) add filter info to slow query logging
[ https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl reopened HBASE-4542:
----------------------------------

Reopening so I won't forget about the 0.94 part.

> add filter info to slow query logging
> -------------------------------------
>
> Key: HBASE-4542
> URL: https://issues.apache.org/jira/browse/HBASE-4542
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 0.89.20100924
> Reporter: Kannan Muthukkaruppan
> Assignee: Madhuwanti Vaidya
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, D1263.2.patch, D1539.1.patch
>
>
> The slow query log doesn't report filters in effect.
> For example:
> {code}
> (operationTooSlow): \
> {"processingtimems":3468,"client":"10.138.43.206:40035","timeRange": [0,9223372036854775807],\
> "starttimems":1317772005821,"responsesize":42411, \
> "class":"HRegionServer","table":"myTable","families":{"CF1":["ALL"]},\
> "row":"6c3b8efa132f0219b7621ed1e5c8c70b","queuetimems":0,\
> "method":"get","totalColumns":1,"maxVersions":1,"storeLimit":-1}
> {code}
> The above would suggest that all columns of myTable:CF1 are being requested
> for the given row. But in reality there could be filters in effect (such as
> ColumnPrefixFilter, ColumnRangeFilter, TimestampsFilter() etc.). We should
> enhance the slow query log to capture and report this information.
[jira] [Reopened] (HBASE-5480) Fixups to MultithreadedTableMapper for Hadoop 0.23.2+
[ https://issues.apache.org/jira/browse/HBASE-5480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl reopened HBASE-5480:
----------------------------------

Reopening, so I won't forget about it.

> Fixups to MultithreadedTableMapper for Hadoop 0.23.2+
> -----------------------------------------------------
>
> Key: HBASE-5480
> URL: https://issues.apache.org/jira/browse/HBASE-5480
> Project: HBase
> Issue Type: Bug
> Components: mapreduce
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Critical
> Fix For: 0.94.0
>
> Attachments: HBASE-5480.patch
>
>
> There are two issues:
> - StatusReporter has a new method getProgress()
> - Mapper and reducer context objects can no longer be directly instantiated.
> See attached patch. I'm not thrilled with the added reflection but it was the
> minimally intrusive change.
> Raised the priority to critical because compilation fails.
[jira] [Reopened] (HBASE-5229) Explore building blocks for "multi-row" local transactions.
[ https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl reopened HBASE-5229:
----------------------------------

Now that HBASE-5304 is committed, I would like to rekindle the discussion here.

We need to be able to co-locate some data for our tenant (for example all data of a specific user within a specific tenant space), in order to provide fast atomic operations.
With HBASE-5304 it is now possible to provide a split policy that (within the HBase constraints, of course) guides the split process to co-locate parts of the data.

What is missing is a small change on top of HBASE-3584 to allow atomic cross-row operations.

See the first patch I have attached here, 5229.txt. I would change the RegionMutation to MultiRowMutation, but in principle it would be the same. A very small, safe, incremental change.

> Explore building blocks for "multi-row" local transactions.
> ------------------------------------------------------------
>
> Key: HBASE-5229
> URL: https://issues.apache.org/jira/browse/HBASE-5229
> Project: HBase
> Issue Type: New Feature
> Components: client, regionserver
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5229-seekto-v2.txt, 5229-seekto.txt, 5229.txt
>
>
> HBase should provide basic building blocks for multi-row local transactions.
> Local means that we do this by co-locating the data. Global (cross region)
> transactions are not discussed here.
> After a bit of discussion two solutions have emerged:
> 1. Keep the row-key for determining grouping and location and allow efficient
> intra-row scanning. A client application would then model tables as
> HBase-rows.
> 2. Define a prefix-length in HTableDescriptor that defines a grouping of
> rows. Regions will then never be split inside a grouping prefix.
> #1 is true to the current storage paradigm of HBase.
> #2 is true to the current client side API.
> I will explore these two with sample patches here.
>
> Was:
> As discussed (at length) on the dev mailing list, with HBASE-3584 and
> HBASE-5203 committed, supporting atomic cross-row transactions within a
> region becomes simple.
> I am aware of the hesitation about the usefulness of this feature, but we
> have to start somewhere.
> Let's use this jira for discussion; I'll attach a patch (with tests)
> momentarily to make this concrete.
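Option 2 in the HBASE-5229 description (never split a region inside a grouping prefix) can be sketched as follows. This is only an illustration under stated assumptions: a proposed split key is truncated to the configured prefix length, so every row sharing that prefix sorts at or after the split point and stays in the same region. `PrefixSplitSketch` and all of its methods are hypothetical names, not HBase classes, and the attached patches may do this differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: align a proposed split key to a row-grouping prefix.
class PrefixSplitSketch {
    // Truncate the proposed split key to the grouping prefix so the split
    // falls on a group boundary rather than inside a group.
    static byte[] alignSplitPoint(byte[] proposed, int prefixLength) {
        if (prefixLength <= 0) {
            return proposed; // no grouping configured; keep the proposed key
        }
        return Arrays.copyOf(proposed, Math.min(prefixLength, proposed.length));
    }

    // Lexicographic comparison of unsigned bytes; a shorter key sorts first
    // when it is a prefix of the longer one.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // True if the row would land in the region that starts at splitPoint.
    static boolean inUpperRegion(byte[] row, byte[] splitPoint) {
        return compare(row, splitPoint) >= 0;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String text(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] proposed = bytes("tenant42-user0007");
        byte[] aligned = alignSplitPoint(proposed, 8);
        byte[] first = bytes("tenant42-user0001");
        byte[] last = bytes("tenant42-user0009");
        System.out.println("aligned split point: " + text(aligned));
        System.out.println("raw split separates the group: "
            + (inUpperRegion(first, proposed) != inUpperRegion(last, proposed)));
        System.out.println("aligned split separates the group: "
            + (inUpperRegion(first, aligned) != inUpperRegion(last, aligned)));
    }
}
```

With the made-up keys above, splitting at the raw midpoint key would put `tenant42-user0001` and `tenant42-user0009` in different regions, while the truncated split point keeps both on the same side.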
[jira] [Reopened] (HBASE-5058) Allow HBaseAmin to use an existing connection
[ https://issues.apache.org/jira/browse/HBASE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl reopened HBASE-5058:
----------------------------------

Actually, never mind. I have a simple patch now, that bandaids the problem.

> Allow HBaseAmin to use an existing connection
> ---------------------------------------------
>
> Key: HBASE-5058
> URL: https://issues.apache.org/jira/browse/HBASE-5058
> Project: HBase
> Issue Type: Sub-task
> Components: client
> Affects Versions: 0.94.0
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Priority: Minor
> Fix For: 0.94.0
>
> Attachments: 5058.txt
>
>
> What HBASE-4805 does for HTables, this should do for HBaseAdmin.
> Along with this the shared error handling and retrying between HBaseAdmin and
> HConnectionManager can also be improved. I'll attach a first pass patch soon.
[jira] [Reopened] (HBASE-4673) NPE in HFileReaderV2.close during major compaction when hfile.block.cache.size is set to 0
[ https://issues.apache.org/jira/browse/HBASE-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl reopened HBASE-4673:
----------------------------------

You are right. Didn't realize that jgray had checked the cache config code in 0.92 as well.

> NPE in HFileReaderV2.close during major compaction when hfile.block.cache.size is set to 0
> -------------------------------------------------------------------------------------------
>
> Key: HBASE-4673
> URL: https://issues.apache.org/jira/browse/HBASE-4673
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.94.0
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Priority: Minor
> Fix For: 0.94.0
>
> Attachments: 4673.txt
>
>
> On a test system I got this exception when hfile.block.cache.size is set to 0:
> java.lang.NullPointerException
>   at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.close(HFileReaderV2.java:321)
>   at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.close(StoreFile.java:1065)
>   at org.apache.hadoop.hbase.regionserver.StoreFile.closeReader(StoreFile.java:539)
>   at org.apache.hadoop.hbase.regionserver.StoreFile.deleteReader(StoreFile.java:549)
>   at org.apache.hadoop.hbase.regionserver.Store.completeCompaction(Store.java:1314)
>   at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:686)
>   at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1016)
>   at org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest.run(CompactionRequest.java:178)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:619)
> Minor issue, as nobody in their right mind will have hfile.block.cache.size=0.
> Looks like this is due to HBASE-4422.
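The stack trace in HBASE-4673 does not show the offending statement, so the following is only a sketch of the general pattern, assuming the NPE comes from `close()` touching a block cache that does not exist when the cache size is 0. `SketchBlockCache` and `SketchReader` are hypothetical stand-ins, not the HBase classes, and the attached patch may take a different approach.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a block cache keyed by "<fileName>#<blockIndex>".
class SketchBlockCache {
    private final Map<String, byte[]> blocks = new HashMap<>();

    void put(String key, byte[] block) {
        blocks.put(key, block);
    }

    // Removes every cached block of the given file; returns how many were removed.
    int evictByFile(String fileName) {
        int before = blocks.size();
        blocks.keySet().removeIf(key -> key.startsWith(fileName + "#"));
        return before - blocks.size();
    }

    int size() {
        return blocks.size();
    }
}

// Hypothetical reader whose cache reference is null when caching is disabled.
class SketchReader {
    private final String fileName;
    private final SketchBlockCache cache;

    SketchReader(String fileName, SketchBlockCache cache) {
        this.fileName = fileName;
        this.cache = cache;
    }

    // Evicts this file's blocks on close, but only if a cache exists at all.
    // Without the null check, a disabled cache would cause an NPE here.
    int close() {
        if (cache == null) {
            return 0;
        }
        return cache.evictByFile(fileName);
    }
}
```

Closing a reader built with a `null` cache returns 0 instead of throwing, and closing one with a populated cache evicts only that file's blocks.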
[jira] [Reopened] (HBASE-4488) Store could miss rows during flush
[ https://issues.apache.org/jira/browse/HBASE-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl reopened HBASE-4488:
----------------------------------

Reopening for the related change to Store.compactStore

> Store could miss rows during flush
> ----------------------------------
>
> Key: HBASE-4488
> URL: https://issues.apache.org/jira/browse/HBASE-4488
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver
> Affects Versions: 0.92.0, 0.94.0
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Fix For: 0.92.0
>
> Attachments: 4488.txt
>
>
> While looking at HBASE-4344 I found that my change HBASE-4241 contains a
> critical mistake:
> The while(scanner.next(kvs)) loop is incorrect and might miss the last edits.
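The loop problem named in HBASE-4488 can be illustrated with a minimal, self-contained sketch. It assumes a scanner whose `next(list)` fills the list with the current row's values and returns `false` once no further rows remain, so the final call can both deliver data and return `false`. `BatchScanner`, `ListScanner`, and `FlushLoopSketch` are hypothetical stand-ins rather than HBase classes, and the row data is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical scanner contract: next() appends the next row's values to
// 'out' and returns true only if more rows remain AFTER this one.
interface BatchScanner {
    boolean next(List<String> out);
}

class ListScanner implements BatchScanner {
    private final Iterator<List<String>> rows;

    ListScanner(List<List<String>> rows) {
        this.rows = rows.iterator();
    }

    @Override
    public boolean next(List<String> out) {
        if (rows.hasNext()) {
            out.addAll(rows.next());
        }
        return rows.hasNext();
    }
}

class FlushLoopSketch {
    // Buggy pattern: the loop body is skipped for the call that returns
    // false, so the values delivered by that final call are never written.
    static List<String> drainBuggy(BatchScanner scanner) {
        List<String> written = new ArrayList<>();
        List<String> kvs = new ArrayList<>();
        while (scanner.next(kvs)) {
            written.addAll(kvs);
            kvs.clear();
        }
        return written;
    }

    // Corrected pattern: handle whatever each call delivered first, then
    // decide whether to continue.
    static List<String> drainCorrect(BatchScanner scanner) {
        List<String> written = new ArrayList<>();
        List<String> kvs = new ArrayList<>();
        boolean hasMore;
        do {
            hasMore = scanner.next(kvs);
            if (!kvs.isEmpty()) {
                written.addAll(kvs);
                kvs.clear();
            }
        } while (hasMore);
        return written;
    }

    // Three made-up rows holding four values in total.
    static List<List<String>> sampleRows() {
        return Arrays.asList(
            Arrays.asList("row1/a", "row1/b"),
            Arrays.asList("row2/a"),
            Arrays.asList("row3/a"));
    }

    public static void main(String[] args) {
        System.out.println("buggy:   " + drainBuggy(new ListScanner(sampleRows())));
        System.out.println("correct: " + drainCorrect(new ListScanner(sampleRows())));
    }
}
```

Under this contract the `while (scanner.next(kvs))` form drops `row3/a`, the values from the last row, which matches the "might miss the last edits" description; the `do`/`while` form writes all four values.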