[jira] [Reopened] (HBASE-5792) HLog Performance Evaluation Tool

2012-04-18 Thread Lars Hofhansl (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-5792:
--


> HLog Performance Evaluation Tool
> 
>
> Key: HBASE-5792
> URL: https://issues.apache.org/jira/browse/HBASE-5792
> Project: HBase
>  Issue Type: Test
>  Components: wal
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>Priority: Minor
>  Labels: performance, wal
> Fix For: 0.94.0, 0.96.0
>
> Attachments: HBASE-5792-v0.patch, HBASE-5792-v1.patch, 
> HBASE-5792-v2.patch, verify.txt, verify.txt
>
>
> Related to HDFS-3280 and the HBase WAL slowdown on 0.23+
> It would be nice to have a simple tool like HFilePerformanceEvaluation, ...
> to be able to easily check the HLog performance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Reopened] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-5778:
--

  Assignee: Lars Hofhansl  (was: Jean-Daniel Cryans)

> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 
> 15% higher (150% CPU usage VS 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage; it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.
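
The overhead percentages quoted above can be rechecked from the raw CPU figures. The snippet below is only a small arithmetic sketch; the class and method names are made up for illustration and are not part of HBase.

```java
// Sketch: recompute the relative CPU overhead from the figures quoted above.
// WalCpuOverhead and relativeIncreasePercent are hypothetical names.
final class WalCpuOverhead {
    /** Percentage by which `after` exceeds `before`. */
    static double relativeIncreasePercent(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Large values: 130% CPU without WAL compression, 150% with it.
        System.out.println("large values: " + relativeIncreasePercent(130, 150));
        // Small values: 450% CPU without WAL compression, 600% with it.
        System.out.println("small values: " + relativeIncreasePercent(450, 600));
    }
}
```

The two results come out at roughly 15.4 and 33.3, which matches the rounded 15% and 33% in the description.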





[jira] [Reopened] (HBASE-5569) Do not collect deleted KVs when they are still in use by a scanner.

2012-03-16 Thread Lars Hofhansl (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-5569:
--


> Do not collect deleted KVs when they are still in use by a scanner.
> ---
>
> Key: HBASE-5569
> URL: https://issues.apache.org/jira/browse/HBASE-5569
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5569-v2.txt, 5569.txt, 
> TestAtomicOperation-output.trunk_120313.rar
>
>
> I noticed this because TestAtomicOperation.testMultiRowMutationMultiThreads 
> fails rarely.
> The solution is similar to HBASE-2856, where expired KVs are not collected 
> when in use by a scanner.
> ---
> What I pieced together so far is that it is the *scanning* side that has 
> problems sometimes.
> Every time I see an assertion failure in the log I see this before:
> {quote}
> 2012-03-12 21:48:49,523 DEBUG [Thread-211] regionserver.StoreScanner(499): 
> Storescanner.peek() is changed where before = 
> rowB/colfamily11:qual1/75366/Put/vlen=6,and after = 
> rowB/colfamily11:qual1/75203/DeleteColumn/vlen=0
> {quote}
> The order of the Put and Delete is sometimes reversed.
> The test threads should always see exactly one KV; if the "before" was the 
> Put the threads see 0 KVs, if the "before" was the Delete the threads see 2 
> KVs.
> This debug message comes from StoreScanner.checkReseek. It seems we still 
> have some consistency issue with scanning sometimes :(





[jira] [Reopened] (HBASE-4542) add filter info to slow query logging

2012-03-12 Thread Lars Hofhansl (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-4542:
--


Reopening so I won't forget about the 0.94 part.

> add filter info to slow query logging
> -
>
> Key: HBASE-4542
> URL: https://issues.apache.org/jira/browse/HBASE-4542
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.89.20100924
>Reporter: Kannan Muthukkaruppan
>Assignee: Madhuwanti Vaidya
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 
> 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, 
> Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, 
> D1263.2.patch, D1539.1.patch
>
>
> Slow query log doesn't report filters in effect.
> For example:
> {code}
> (operationTooSlow): \
> {"processingtimems":3468,"client":"10.138.43.206:40035","timeRange": 
> [0,9223372036854775807],\
> "starttimems":1317772005821,"responsesize":42411, \
> "class":"HRegionServer","table":"myTable","families":{"CF1":["ALL"]},\
> "row":"6c3b8efa132f0219b7621ed1e5c8c70b","queuetimems":0,\
> "method":"get","totalColumns":1,"maxVersions":1,"storeLimit":-1}
> {code}
> the above would suggest that all columns of myTable:CF1 are being requested 
> for the given row. But in reality there could be filters in effect (such as 
> ColumnPrefixFilter, ColumnRangeFilter, TimestampsFilter() etc.). We should 
> enhance the slow query log to capture & report this information.





[jira] [Reopened] (HBASE-5480) Fixups to MultithreadedTableMapper for Hadoop 0.23.2+

2012-03-09 Thread Lars Hofhansl (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-5480:
--


Reopening, so I won't forget about it.

> Fixups to MultithreadedTableMapper for Hadoop 0.23.2+
> -
>
> Key: HBASE-5480
> URL: https://issues.apache.org/jira/browse/HBASE-5480
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: HBASE-5480.patch
>
>
> There are two issues:
> - StatusReporter has a new method getProgress()
> - Mapper and reducer context objects can no longer be directly instantiated.
> See attached patch. I'm not thrilled with the added reflection but it was the 
> minimally intrusive change.
> Raised the priority to critical because compilation fails.
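
For readers unfamiliar with the technique mentioned above, the general idea of instantiating a context class reflectively, so that the same code works against releases with different class layouts, can be sketched as follows. This is not the attached patch: OldStyleContext, NewStyleContextImpl and newContext are hypothetical stand-ins, and the real Hadoop classes and constructor signatures differ.

```java
import java.lang.reflect.Constructor;

// Sketch only: the nested classes are hypothetical stand-ins, not Hadoop classes.
final class ReflectiveContextSketch {
    /** Stand-in for a context type that callers used to construct directly. */
    static class OldStyleContext {
        final String taskId;
        public OldStyleContext(String taskId) { this.taskId = taskId; }
    }

    /** Stand-in for the implementation class used by a newer release. */
    static class NewStyleContextImpl {
        final String taskId;
        public NewStyleContextImpl(String taskId) { this.taskId = taskId; }
    }

    /**
     * Tries each candidate class name in order and instantiates the first one
     * that can be loaded and exposes a public (String) constructor.
     */
    static Object newContext(String taskId, String... candidateClassNames) {
        for (String name : candidateClassNames) {
            try {
                Class<?> clazz = Class.forName(name);
                Constructor<?> ctor = clazz.getConstructor(String.class);
                return ctor.newInstance(taskId);
            } catch (ReflectiveOperationException e) {
                // Not available in this environment; try the next candidate.
            }
        }
        throw new IllegalStateException("no usable context class found");
    }

    public static void main(String[] args) {
        Object ctx = newContext("task-1",
            "example.DoesNotExist",               // simulates a class missing at runtime
            NewStyleContextImpl.class.getName());
        System.out.println(ctx.getClass().getSimpleName()); // prints "NewStyleContextImpl"
    }
}
```

The cost is the one noted in the description: the compiler can no longer check the constructor call, so a signature mismatch only shows up at runtime.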





[jira] [Reopened] (HBASE-5229) Explore building blocks for "multi-row" local transactions.

2012-02-02 Thread Lars Hofhansl (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-5229:
--


Now that HBASE-5304 is committed, I would like to rekindle the discussion here.

We need to be able to co-locate some data for our tenants (for example all 
data of a specific user within a specific tenant space), in order to provide 
fast atomic operations.

With HBASE-5304 it is now possible to provide a split policy that (within the 
HBase constraints, of course) allows us to guide the split process to co-locate 
parts of the data. What is missing is a small change on top of HBASE-3584 to 
allow atomic cross-row operations.
See the first patch I have attached here - 5229.txt.
I would change the RegionMutation to MultiRowMutation, but in principle it 
would be the same. A very small, safe, incremental change.


> Explore building blocks for "multi-row" local transactions.
> ---
>
> Key: HBASE-5229
> URL: https://issues.apache.org/jira/browse/HBASE-5229
> Project: HBase
>  Issue Type: New Feature
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5229-seekto-v2.txt, 5229-seekto.txt, 5229.txt
>
>
> HBase should provide basic building blocks for multi-row local transactions. 
> Local means that we do this by co-locating the data. Global (cross region) 
> transactions are not discussed here.
> After a bit of discussion two solutions have emerged:
> 1. Keep the row-key for determining grouping and location and allow efficient 
> intra-row scanning. A client application would then model tables as 
> HBase-rows.
> 2. Define a prefix-length in HTableDescriptor that defines a grouping of 
> rows. Regions will then never be split inside a grouping prefix.
> #1 is true to the current storage paradigm of HBase.
> #2 is true to the current client side API.
> I will explore these two with sample patches here.
> 
> Was:
> As discussed (at length) on the dev mailing list, with HBASE-3584 and 
> HBASE-5203 committed, supporting atomic cross-row transactions within a 
> region becomes simple.
> I am aware of the hesitation about the usefulness of this feature, but we 
> have to start somewhere.
> Let's use this jira for discussion; I'll attach a patch (with tests) 
> momentarily to make this concrete.
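
Option 2 in the description (a grouping prefix inside which regions are never split) can be illustrated with a small sketch. It assumes a fixed prefix length and that a split sends every row sorting at or after the split key to the upper daughter region; adjustSplitPoint and the other names are hypothetical and not an HBase API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch only: adjustSplitPoint is a hypothetical helper, not an HBase API.
final class PrefixSplitSketch {
    /** Truncates a proposed split key to the grouping prefix so the split falls on a group boundary. */
    static byte[] adjustSplitPoint(byte[] proposedSplitKey, int prefixLength) {
        return Arrays.copyOf(proposedSplitKey, Math.min(prefixLength, proposedSplitKey.length));
    }

    /** Unsigned lexicographic byte comparison, the order assumed for row keys. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) return diff;
        }
        return a.length - b.length;
    }

    /** A row lands in the upper daughter region when it sorts at or after the split key. */
    static boolean inUpperDaughter(byte[] row, byte[] splitKey) {
        return compare(row, splitKey) >= 0;
    }

    public static void main(String[] args) {
        byte[] proposed = "user42-row5".getBytes(StandardCharsets.UTF_8);
        byte[] rowA = "user42-row1".getBytes(StandardCharsets.UTF_8);
        byte[] rowB = "user42-row9".getBytes(StandardCharsets.UTF_8);

        // Unadjusted split key: the two rows of the group land in different daughters.
        System.out.println(inUpperDaughter(rowA, proposed) == inUpperDaughter(rowB, proposed)); // false

        // Split key truncated to the 6-byte prefix "user42": the group stays together.
        byte[] adjusted = adjustSplitPoint(proposed, 6);
        System.out.println(inUpperDaughter(rowA, adjusted) == inUpperDaughter(rowB, adjusted)); // true
    }
}
```

Because every row that starts with a given prefix sorts at or after the bare prefix, truncating the split key is enough to keep a whole group on one side of the split.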





[jira] [Reopened] (HBASE-5058) Allow HBaseAdmin to use an existing connection

2011-12-17 Thread Lars Hofhansl (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-5058:
--


Actually, never mind. I have a simple patch now that band-aids the problem.

> Allow HBaseAdmin to use an existing connection
> -
>
> Key: HBASE-5058
> URL: https://issues.apache.org/jira/browse/HBASE-5058
> Project: HBase
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 0.94.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: 5058.txt
>
>
> What HBASE-4805 does for HTables, this should do for HBaseAdmin.
> Along with this the shared error handling and retrying between HBaseAdmin and 
> HConnectionManager can also be improved. I'll attach a first pass patch soon.





[jira] [Reopened] (HBASE-4673) NPE in HFileReaderV2.close during major compaction when hfile.block.cache.size is set to 0

2011-10-26 Thread Lars Hofhansl (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-4673:
--


You are right. Didn't realize that jgray had checked the cache config code in 
0.92 as well.

> NPE in HFileReaderV2.close during major compaction when 
> hfile.block.cache.size is set to 0 
> ---
>
> Key: HBASE-4673
> URL: https://issues.apache.org/jira/browse/HBASE-4673
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: 4673.txt
>
>
> On a test system got this exception when hfile.block.cache.size is set to 0:
> java.lang.NullPointerException
>     at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.close(HFileReaderV2.java:321)
>     at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.close(StoreFile.java:1065)
>     at org.apache.hadoop.hbase.regionserver.StoreFile.closeReader(StoreFile.java:539)
>     at org.apache.hadoop.hbase.regionserver.StoreFile.deleteReader(StoreFile.java:549)
>     at org.apache.hadoop.hbase.regionserver.Store.completeCompaction(Store.java:1314)
>     at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:686)
>     at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1016)
>     at org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest.run(CompactionRequest.java:178)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:619)
> Minor issue as nobody in their right mind would have hfile.block.cache.size=0
> Looks like this is due to HBASE-4422





[jira] [Reopened] (HBASE-4488) Store could miss rows during flush

2011-10-07 Thread Lars Hofhansl (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-4488:
--


Reopening for the related change to Store.compactStore

> Store could miss rows during flush
> --
>
> Key: HBASE-4488
> URL: https://issues.apache.org/jira/browse/HBASE-4488
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.92.0
>
> Attachments: 4488.txt
>
>
> While looking at HBASE-4344 I found that my change HBASE-4241 contains a 
> critical mistake:
> The while(scanner.next(kvs)) loop is incorrect and might miss the last edits.
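
To make the failure mode concrete, here is a minimal sketch of why such a loop can drop the last edits. It assumes a scanner whose next() fills the supplied list and returns false once nothing further follows, so the final batch arrives together with a false return value. BatchScanner and the helper methods are hypothetical stand-ins, not the HBase scanner interface or the actual fix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch only: BatchScanner is a hypothetical stand-in for the scanner contract assumed above.
final class ScannerLoopSketch {
    interface BatchScanner {
        /** Appends the next batch to `out`; returns true only if more batches follow. */
        boolean next(List<String> out);
    }

    static BatchScanner scannerOver(List<List<String>> batches) {
        Iterator<List<String>> it = batches.iterator();
        return out -> {
            if (it.hasNext()) out.addAll(it.next());
            return it.hasNext();
        };
    }

    /** Buggy pattern: the final batch is delivered with a false return value and is dropped. */
    static List<String> drainBuggy(BatchScanner scanner) {
        List<String> seen = new ArrayList<>();
        List<String> kvs = new ArrayList<>();
        while (scanner.next(kvs)) {
            seen.addAll(kvs);
            kvs.clear();
        }
        return seen;
    }

    /** Corrected pattern: process whatever was returned first, then test the flag. */
    static List<String> drainFixed(BatchScanner scanner) {
        List<String> seen = new ArrayList<>();
        List<String> kvs = new ArrayList<>();
        boolean hasMore;
        do {
            hasMore = scanner.next(kvs);
            seen.addAll(kvs);
            kvs.clear();
        } while (hasMore);
        return seen;
    }

    public static void main(String[] args) {
        List<List<String>> batches =
            Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c"), Arrays.asList("d"));
        System.out.println(drainBuggy(scannerOver(batches))); // [a, b, c]
        System.out.println(drainFixed(scannerOver(batches))); // [a, b, c, d]
    }
}
```

The do/while form handles an empty scanner as well, since the first call simply returns false with nothing added.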
