[jira] [Commented] (HBASE-6874) Implement prefetching for scanners

2013-04-24 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640748#comment-13640748
 ] 

Karthik Ranganathan commented on HBASE-6874:


This has been implemented and checked in to the 0.89-fb branch.

 Implement prefetching for scanners
 --

 Key: HBASE-6874
 URL: https://issues.apache.org/jira/browse/HBASE-6874
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 I did some quick experiments by scanning data that should be completely in 
 memory and found that adding pre-fetching increases the throughput by about 
 50%, from 26MB/s to 39MB/s.
 The idea is to perform the next() call in a background thread, and keep the result 
 ready. When the scanner's next() comes in, return the pre-computed result and 
 issue another background read.
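The idea described above could be sketched as a thin wrapper: while the caller consumes the current batch, a single background thread computes the next one. All names here are hypothetical illustrations of the approach, not the actual 0.89-fb implementation.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of scanner prefetching: the wrapped scanner's next()
// runs on a background thread, so results are (usually) ready when asked for.
public class PrefetchingScanner {
    public interface Scanner { List<String> next(int numRows); }

    private final ExecutorService pool = Executors.newSingleThreadExecutor();
    private final Scanner inner;            // the real scanner being wrapped
    private Future<List<String>> pending;   // the prefetched next() result

    public PrefetchingScanner(Scanner inner, int numRows) {
        this.inner = inner;
        this.pending = pool.submit(() -> inner.next(numRows)); // warm up
    }

    public List<String> next(int numRows) throws Exception {
        List<String> result = pending.get();                  // ready, or nearly so
        pending = pool.submit(() -> inner.next(numRows));     // issue next read
        return result;
    }

    public void close() { pool.shutdownNow(); }
}
```

Because a single-thread executor serializes the submitted tasks, the wrapped scanner itself never sees concurrent next() calls.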

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6770) Allow scanner setCaching to specify size instead of number of rows

2013-01-25 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562811#comment-13562811
 ] 

Karthik Ranganathan commented on HBASE-6770:


Hey Terry, we're not working actively on the trunk port... 
[~saint@gmail.com] would be able to tell you if someone is. If you are 
interested in trying to port the patch, I can definitely help out with reviews.

 Allow scanner setCaching to specify size instead of number of rows
 --

 Key: HBASE-6770
 URL: https://issues.apache.org/jira/browse/HBASE-6770
 Project: HBase
  Issue Type: Sub-task
  Components: Client, regionserver
Reporter: Karthik Ranganathan
Assignee: Chen Jin

 Currently, we have the following APIs to customize the behavior of scans:
 setCaching() - how many rows to cache on the client to speed up scans
 setBatch() - max columns to return per row, to prevent a very large 
 response.
 Ideally, we should be able to specify a memory buffer size because:
 1. that would take care of both of these use cases.
 2. it does not need any knowledge of the size of the rows or cells, as the 
 final thing we are worried about is the available memory.
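A size-based limit could be sketched as below: accumulate rows until either the row cap or the byte cap is hit, whichever comes first. The names are hypothetical; the real change would live in the server-side scan loop.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: respect the more restrictive of a row-count cap and a
// byte-size cap when assembling a scan batch.
public class SizeBoundedBatcher {
    public static List<byte[]> nextBatch(Iterator<byte[]> rows,
                                         int maxRows, long maxBytes) {
        List<byte[]> batch = new ArrayList<>();
        long bytes = 0;
        while (rows.hasNext() && batch.size() < maxRows && bytes < maxBytes) {
            byte[] row = rows.next();
            batch.add(row);
            bytes += row.length; // stand-in for a real heap-size estimate
        }
        return batch;
    }
}
```

Note that with this check a batch can overshoot maxBytes by at most one row, which matches the usual "stop after the row that crosses the limit" behavior.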



[jira] [Commented] (HBASE-7478) Create a multi-threaded responder

2013-01-25 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562817#comment-13562817
 ] 

Karthik Ranganathan commented on HBASE-7478:


Interesting... I thought processResponse(..., false) does not write to the 
channel when there are a lot of writes; only the processResponse(..., true) 
variant does. So in effect we are only single-threaded when we are pumping out 
a lot of data over multiple connections.

 Create a multi-threaded responder
 -

 Key: HBASE-7478
 URL: https://issues.apache.org/jira/browse/HBASE-7478
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan

 Currently, we have multi-threaded readers and handlers, but a single-threaded 
 responder, which is a bottleneck.
 ipc.server.reader.count  : number of reader threads to read data off the wire
 ipc.server.handler.count : number of handler threads that process the request
 We need the ability to specify an ipc.server.responder.count to set the 
 number of responder threads.



[jira] [Commented] (HBASE-7477) Remove Proxy instance from HBase RPC

2013-01-11 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551205#comment-13551205
 ] 

Karthik Ranganathan commented on HBASE-7477:


[~saint@gmail.com] Totally, feel free to open a new one for trunk.

Will definitely check out HBASE-7460. One other change that has happened in the 
past (which makes this easier) is that we have done away with proxy objects per 
conf on the client side - it used to be a singleton. Now, with this patch, it's 
just a straight-up object instance.

 Remove Proxy instance from HBase RPC
 

 Key: HBASE-7477
 URL: https://issues.apache.org/jira/browse/HBASE-7477
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
 Attachments: 7477experiment.txt, HBASE-7477.patch


 Currently, we use HBaseRPC.getProxy() to get an Invoker object to serialize 
 the RPC parameters. This is pretty inefficient as it uses reflection to 
 lookup the current method name.
 The aim is to break up the proxy into an actual proxy implementation so that:
 1. we can make it more efficient by eliminating reflection
 2. can re-write some parts of the protocol to make it even better



[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-09 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548688#comment-13548688
 ] 

Karthik Ranganathan commented on HBASE-5416:


I think the specific description (of making filters apply to only some CFs) is 
a good idea. But if we continue down this path of generalizing filters, it could 
lead to an explosion of ad-hoc filters. In that case, it might be better to 
expose more co-processor hooks. Overall, +1 (only skimmed the changes though).

 Improve performance of scans with some kind of filters.
 ---

 Key: HBASE-5416
 URL: https://issues.apache.org/jira/browse/HBASE-5416
 Project: HBase
  Issue Type: Improvement
  Components: Filters, Performance, regionserver
Affects Versions: 0.90.4
Reporter: Max Lapan
Assignee: Sergey Shelukhin
 Fix For: 0.96.0

 Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 
 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
 Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
 Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
 HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
 HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
 HBASE-5416-v9.patch


 When a scan is performed, the whole row is loaded into the result list, and 
 after that the filter (if one exists) is applied to decide whether the row is 
 needed.
 But when the scan is performed on several CFs and the filter checks only data 
 from a subset of these CFs, the data from the CFs not checked by the filter is 
 not needed at the filter stage - only once we have decided to include the 
 current row. In such cases we can significantly reduce the amount of IO 
 performed by a scan by loading only the values actually checked by the filter.
 For example, we have two CFs: flags and snap. Flags is quite small (a bunch of 
 megabytes) and is used to filter large entries from snap. Snap is very large 
 (10s of GB) and quite costly to scan. If we need only rows with some flag 
 specified, we use SingleColumnValueFilter to limit the result to only a 
 small subset of the region. But the current implementation loads both CFs to 
 perform the scan, when only a small subset is needed.
 The attached patch adds one routine to the Filter interface to allow a filter 
 to specify which CFs it needs for its operation. In HRegion, we separate all 
 scanners into two groups: those needed by the filter, and the rest (joined). 
 When a new row is considered, only the needed data is loaded and the filter 
 applied; only if the filter accepts the row is the rest of the data loaded. 
 On our data, this speeds up such scans 30-50 times. It also gives us a way to 
 better normalize the data into separate columns by optimizing the scans 
 performed.
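A toy sketch of the essential/joined split described above (hypothetical names and simplified types; the real patch operates on HRegion's scanner lists, not plain lists of strings):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical two-phase row load: cells from the CFs the filter needs
// ("essential") are read first; cells from the remaining ("joined") CFs are
// only read for rows the filter accepts.
public class TwoPhaseRowScan {
    public static List<String> scanRow(List<String> essentialCells,
                                       List<String> joinedCells,
                                       Predicate<List<String>> filter) {
        List<String> row = new ArrayList<>(essentialCells);
        if (!filter.test(row)) {
            return List.of();        // filtered out: joined CFs never touched
        }
        row.addAll(joinedCells);     // accepted: now load the rest of the row
        return row;
    }
}
```

The IO saving comes from the early return: for rejected rows, the large joined CF (snap, in the example) is never read at all.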



[jira] [Commented] (HBASE-7477) Remove Proxy instance from HBase RPC

2013-01-09 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548711#comment-13548711
 ] 

Karthik Ranganathan commented on HBASE-7477:



 The pb Service 'fit' is not perfect though – it drags along some other stuff 
we do not want and it is missing a means of passing extra stuff unless we do 
some hackery – so reluctant to take it on though it does away with reflection. 

Couldn't agree more. My thought was that HBase only exposes simple APIs like 
get, put, delete and scan. Each of these in turn takes in one object 
(Get/Put/Delete/Scan) and a couple of filters. The serialization of the latter 
objects already seems to be versioned. So protobufs might be expensive for just 
eliminating reflection, but they might help with automatic versioning for 
future enhancements. I think you said the same thing here: "Given the above, 
protobuf Service starts to look better. It has kinks but would enforce a strong 
pattern - and we are most of the way there already with our use of the 
Service#BlockingInterface."

I can do better than just explaining - I can put up an initial patch that works 
for gets only. Will upload it next; the changes are actually not very 
invasive. Here is an outline of the steps:

- Replace the proxy with an HRegionInterfaceSerializerV1. It implements the RPC 
serialization when the method calls are made.
- On the server side, you would have the HRegionInterfaceDeserializerV1 object. 
You would use the method name to call the right function in this object, which 
deserializes the params. In the current incarnation, every method would do the 
same thing (read the params count, param classes, etc).
- Change the ser and deser objects to v2, bump the RPC version, substitute 
byte codes for the method names, and make the serialization/deserialization of 
the params specific to the method being called.

IMO, if you look at my next diff (where I reconstructed the HBase RPC 
protocol), it's pretty verbose and inefficient. It roughly does the following:
* Get the class name, method name, num params and param classes by reflection
* write the class name (twice, most of the time)
* write the num params
* write the types of each param
* then serialize each param
This makes coding nice, but hurts runtime perf.
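The third step of the outline (byte codes in place of method-name strings) could look roughly like the sketch below. The codes, names, and class are hypothetical illustrations, not the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a fixed one-byte code per RPC method, so neither side
// needs reflection or string parsing to dispatch a call.
public class MethodCodes {
    public static final byte GET = 1, PUT = 2, DELETE = 3, SCAN = 4;

    private static final Map<String, Byte> BY_NAME = new HashMap<>();
    private static final Map<Byte, String> BY_CODE = new HashMap<>();
    static {
        register("get", GET);
        register("put", PUT);
        register("delete", DELETE);
        register("openScanner", SCAN);
    }
    private static void register(String name, byte code) {
        BY_NAME.put(name, code);
        BY_CODE.put(code, name);
    }

    public static byte encode(String method) { return BY_NAME.get(method); }
    public static String decode(byte code)   { return BY_CODE.get(code); }
}
```

The table must be identical on client and server, which is why the outline ties it to an RPC version bump.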

 Remove Proxy instance from HBase RPC
 

 Key: HBASE-7477
 URL: https://issues.apache.org/jira/browse/HBASE-7477
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
 Attachments: 7477experiment.txt


 Currently, we use HBaseRPC.getProxy() to get an Invoker object to serialize 
 the RPC parameters. This is pretty inefficient as it uses reflection to 
 lookup the current method name.
 The aim is to break up the proxy into an actual proxy implementation so that:
 1. we can make it more efficient by eliminating reflection
 2. can re-write some parts of the protocol to make it even better



[jira] [Updated] (HBASE-7477) Remove Proxy instance from HBase RPC

2013-01-09 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan updated HBASE-7477:
---

Attachment: HBASE-7477.patch

In this patch, HRegionInterfaceProxyImpl is the serializer that eliminates the 
proxy. This eliminates a decent chunk of CPU on the HBase client and entirely 
shifts the bottleneck to the server side. I was able to push the max get ops/sec 
to around 196K with this on the client plus other server-side changes. Will 
write this up in detail sometime.

 Remove Proxy instance from HBase RPC
 

 Key: HBASE-7477
 URL: https://issues.apache.org/jira/browse/HBASE-7477
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
 Attachments: 7477experiment.txt, HBASE-7477.patch


 Currently, we use HBaseRPC.getProxy() to get an Invoker object to serialize 
 the RPC parameters. This is pretty inefficient as it uses reflection to 
 lookup the current method name.
 The aim is to break up the proxy into an actual proxy implementation so that:
 1. we can make it more efficient by eliminating reflection
 2. can re-write some parts of the protocol to make it even better



[jira] [Commented] (HBASE-7477) Remove Proxy instance from HBase RPC

2013-01-09 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548839#comment-13548839
 ] 

Karthik Ranganathan commented on HBASE-7477:


Yes, I was able to get it to 170-185K with HBASE-7100 and HBASE-7163 without this, 
and the client was the bottleneck. Now it's at 196K and the server seems to be 
the bottleneck.

 Remove Proxy instance from HBase RPC
 

 Key: HBASE-7477
 URL: https://issues.apache.org/jira/browse/HBASE-7477
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
 Attachments: 7477experiment.txt, HBASE-7477.patch


 Currently, we use HBaseRPC.getProxy() to get an Invoker object to serialize 
 the RPC parameters. This is pretty inefficient as it uses reflection to 
 lookup the current method name.
 The aim is to break up the proxy into an actual proxy implementation so that:
 1. we can make it more efficient by eliminating reflection
 2. can re-write some parts of the protocol to make it even better



[jira] [Created] (HBASE-7477) Remove Proxy instance from HBase RPC

2013-01-02 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-7477:
--

 Summary: Remove Proxy instance from HBase RPC
 Key: HBASE-7477
 URL: https://issues.apache.org/jira/browse/HBASE-7477
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan


Currently, we use HBaseRPC.getProxy() to get an Invoker object to serialize the 
RPC parameters. This is pretty inefficient as it uses reflection to lookup the 
current method name.

The aim is to break up the proxy into an actual proxy implementation so that:
1. we can make it more efficient by eliminating reflection
2. can re-write some parts of the protocol to make it even better



[jira] [Created] (HBASE-7478) Create a multi-threaded responder

2013-01-02 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-7478:
--

 Summary: Create a multi-threaded responder
 Key: HBASE-7478
 URL: https://issues.apache.org/jira/browse/HBASE-7478
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan


Currently, we have multi-threaded readers and handlers, but a single-threaded 
responder, which is a bottleneck.

ipc.server.reader.count  : number of reader threads to read data off the wire
ipc.server.handler.count : number of handler threads that process the request

We need the ability to specify an ipc.server.responder.count to set the 
number of responder threads.
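One way the new setting could be used is sketched below: pin each connection to one of N responders by hashing, mirroring how reader threads are assigned. The class and method names are hypothetical, not part of the actual change.

```java
// Hypothetical sketch: with ipc.server.responder.count responder threads,
// each connection is owned by exactly one responder, so each responder drains
// a disjoint set of response queues and responses per connection stay ordered.
public class ResponderPool {
    private final int count; // would come from ipc.server.responder.count

    public ResponderPool(int count) { this.count = count; }

    // Pick which responder thread owns a given connection.
    public int responderFor(int connectionId) {
        return Math.floorMod(connectionId, count);
    }
}
```

Pinning by connection (rather than handing any response to any idle responder) keeps per-connection write ordering without extra locking.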



[jira] [Updated] (HBASE-7163) Low-hanging perf improvements in HBase client

2012-11-15 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan updated HBASE-7163:
---

Summary: Low-hanging perf improvements in HBase client  (was: Change 
cachedRegionsLocations in HConnectionManager from SoftValueSortedMap to 
ConcurrentSkipListMap)

 Low-hanging perf  improvements in HBase client
 --

 Key: HBASE-7163
 URL: https://issues.apache.org/jira/browse/HBASE-7163
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 This change saves 15% CPU on the client side per profiling. In using the 
 ConcurrentSkipListMap, we can do:
 tableLocations.floorEntry(row).getValue()
 instead of doing:
 SortedMap<byte[], HRegionLocation> matchingRegions =
 tableLocations.headMap(row);
 if (!matchingRegions.isEmpty()) {
   HRegionLocation possibleRegion = 
 matchingRegions.get(matchingRegions.lastKey());
 }



[jira] [Updated] (HBASE-7163) Low-hanging perf improvements in HBase client

2012-11-15 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan updated HBASE-7163:
---

Description: 

1. Change cachedRegionsLocations in HConnectionManager from SoftValueSortedMap 
to ConcurrentSkipListMap:
This change saves 15% CPU on the client side per profiling. In using the 
ConcurrentSkipListMap, we can do:
tableLocations.floorEntry(row).getValue()

instead of doing:
SortedMap<byte[], HRegionLocation> matchingRegions =
tableLocations.headMap(row);
if (!matchingRegions.isEmpty()) {
  HRegionLocation possibleRegion = 
matchingRegions.get(matchingRegions.lastKey());
}


2. NetUtils.getDefaultSocketFactory is very inefficient, use 


  was:
This change saves 15% CPU on the client side per profiling. In using the 
ConcurrentSkipListMap, we can do:
tableLocations.floorEntry(row).getValue()

instead of doing:
SortedMap<byte[], HRegionLocation> matchingRegions =
tableLocations.headMap(row);
if (!matchingRegions.isEmpty()) {
  HRegionLocation possibleRegion = 
matchingRegions.get(matchingRegions.lastKey());
}


 Low-hanging perf  improvements in HBase client
 --

 Key: HBASE-7163
 URL: https://issues.apache.org/jira/browse/HBASE-7163
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 1. Change cachedRegionsLocations in HConnectionManager from 
 SoftValueSortedMap to ConcurrentSkipListMap:
 This change saves 15% CPU on the client side per profiling. In using the 
 ConcurrentSkipListMap, we can do:
 tableLocations.floorEntry(row).getValue()
 instead of doing:
 SortedMap<byte[], HRegionLocation> matchingRegions =
 tableLocations.headMap(row);
 if (!matchingRegions.isEmpty()) {
   HRegionLocation possibleRegion = 
 matchingRegions.get(matchingRegions.lastKey());
 }
 2. NetUtils.getDefaultSocketFactory is very inefficient, use 



[jira] [Commented] (HBASE-7163) Low-hanging perf improvements in HBase client

2012-11-15 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498416#comment-13498416
 ] 

Karthik Ranganathan commented on HBASE-7163:


@Ted - yes, that's the part, though the fix has another component to it. Also, 
I changed this task to add one more perf improvement.

 Low-hanging perf  improvements in HBase client
 --

 Key: HBASE-7163
 URL: https://issues.apache.org/jira/browse/HBASE-7163
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 1. Change cachedRegionsLocations in HConnectionManager from 
 SoftValueSortedMap to ConcurrentSkipListMap:
 This change saves 15% CPU on the client side per profiling. In using the 
 ConcurrentSkipListMap, we can do:
 tableLocations.floorEntry(row).getValue()
 instead of doing:
 SortedMap<byte[], HRegionLocation> matchingRegions =
 tableLocations.headMap(row);
 if (!matchingRegions.isEmpty()) {
   HRegionLocation possibleRegion = 
 matchingRegions.get(matchingRegions.lastKey());
 }
 2. NetUtils.getDefaultSocketFactory is very inefficient, use 



[jira] [Commented] (HBASE-7163) Low-hanging perf improvements in HBase client

2012-11-15 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498501#comment-13498501
 ] 

Karthik Ranganathan commented on HBASE-7163:


Yes, thanks for explicitly mentioning it - I forgot to mention that point. The 
thought was that the overhead of caching all regions would not be too large.

 Low-hanging perf  improvements in HBase client
 --

 Key: HBASE-7163
 URL: https://issues.apache.org/jira/browse/HBASE-7163
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 1. Change cachedRegionsLocations in HConnectionManager from 
 SoftValueSortedMap to ConcurrentSkipListMap:
 This change saves 15% CPU on the client side per profiling. In using the 
 ConcurrentSkipListMap, we can do:
 tableLocations.floorEntry(row).getValue()
 instead of doing:
 SortedMap<byte[], HRegionLocation> matchingRegions =
 tableLocations.headMap(row);
 if (!matchingRegions.isEmpty()) {
   HRegionLocation possibleRegion = 
 matchingRegions.get(matchingRegions.lastKey());
 }
 2. NetUtils.getDefaultSocketFactory is very inefficient, use 



[jira] [Created] (HBASE-7163) Change cachedRegionsLocations in HConnectionManager from SoftValueSortedMap to ConcurrentSkipListMap

2012-11-14 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-7163:
--

 Summary: Change cachedRegionsLocations in HConnectionManager from 
SoftValueSortedMap to ConcurrentSkipListMap
 Key: HBASE-7163
 URL: https://issues.apache.org/jira/browse/HBASE-7163
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan


This change saves 15% CPU on the client side per profiling. In using the 
ConcurrentSkipListMap, we can do:
tableLocations.floorEntry(row).getValue()

instead of doing:
SortedMap<byte[], HRegionLocation> matchingRegions =
tableLocations.headMap(row);
if (!matchingRegions.isEmpty()) {
  HRegionLocation possibleRegion = 
matchingRegions.get(matchingRegions.lastKey());
}
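The one-call lookup reads as below. This is a self-contained sketch: a plain lexicographic byte[] comparator stands in for HBase's Bytes.BYTES_COMPARATOR, and the map values are strings rather than HRegionLocation objects.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentSkipListMap;

// Sketch of the region-location lookup on a ConcurrentSkipListMap:
// floorEntry(row) returns the entry with the greatest start key <= row,
// replacing the headMap() + lastKey() dance in one call.
public class RegionCacheDemo {
    static String regionFor(ConcurrentSkipListMap<byte[], String> locations,
                            byte[] row) {
        return locations.floorEntry(row).getValue();
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<byte[], String> tableLocations =
            new ConcurrentSkipListMap<>(Arrays::compare); // lexicographic byte[] order
        tableLocations.put("a".getBytes(), "region-a");
        tableLocations.put("m".getBytes(), "region-m");
        System.out.println(regionFor(tableLocations, "k".getBytes()));
    }
}
```

Besides being shorter, the skip-list map is lock-free for readers, which is where the profiled CPU saving would come from.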



[jira] [Commented] (HBASE-6874) Implement prefetching for scanners

2012-11-06 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491695#comment-13491695
 ] 

Karthik Ranganathan commented on HBASE-6874:


Lars - the dependency on HBASE-6770 is more to make the code simpler. 
Currently, the HRegionServer loops over numRows, and the RegionScanner loops 
over the columns in the various CFs for one row at a time. HBASE-6770 will move 
the looping over numRows into the RegionScanner itself, because we need to 
track both memory size and number of rows - in order to respect the more 
restrictive of the two. Once that happens, we can implement prefetching in the 
RegionScanner itself, instead of spreading the logic into HRegionServer as 
well. So it's more about code simplicity and not having to resolve conflicts.

 Implement prefetching for scanners
 --

 Key: HBASE-6874
 URL: https://issues.apache.org/jira/browse/HBASE-6874
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 I did some quick experiments by scanning data that should be completely in 
 memory and found that adding pre-fetching increases the throughput by about 
 50%, from 26MB/s to 39MB/s.
 The idea is to perform the next() call in a background thread, and keep the result 
 ready. When the scanner's next() comes in, return the pre-computed result and 
 issue another background read.



[jira] [Created] (HBASE-7105) RS throws NPE on forcing compaction from HBase shell on a single bulk imported file.

2012-11-06 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-7105:
--

 Summary: RS throws NPE on forcing compaction from HBase shell on a 
single bulk imported file.
 Key: HBASE-7105
 URL: https://issues.apache.org/jira/browse/HBASE-7105
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan


In StoreFile, we have:
private AtomicBoolean majorCompaction = null;

In StoreFile.open(), we do:
b = metadataMap.get(MAJOR_COMPACTION_KEY);
if (b != null) {
  // init majorCompaction variable
}

Because the file was bulk imported, this is never initialized, and any 
subsequent call to isMajorCompaction() NPEs.

The fix is to initialize it to false.
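A minimal reproduction of the bug and the fix, simplified from the real StoreFile (which keys the metadata map by byte[] and tracks more state; names here are shortened for illustration):

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of the NPE: majorCompaction stays null when the metadata key is
// absent, which is exactly the bulk-imported-file case.
public class StoreFileSketch {
    private AtomicBoolean majorCompaction = null;

    void open(Map<String, byte[]> metadataMap) {
        byte[] b = metadataMap.get("MAJOR_COMPACTION_KEY");
        if (b != null) {
            majorCompaction = new AtomicBoolean(b[0] != 0);
        } else {
            // the fix: default to false instead of leaving the field null
            majorCompaction = new AtomicBoolean(false);
        }
    }

    boolean isMajorCompaction() { return majorCompaction.get(); }
}
```

Without the else branch, isMajorCompaction() dereferences a null field for any file opened without that metadata entry.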



[jira] [Commented] (HBASE-6874) Implement prefetching for scanners

2012-11-05 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490898#comment-13490898
 ] 

Karthik Ranganathan commented on HBASE-6874:


Awesome, then layering in multi-pre-fetch should be very easy!

 Implement prefetching for scanners
 --

 Key: HBASE-6874
 URL: https://issues.apache.org/jira/browse/HBASE-6874
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 I did some quick experiments by scanning data that should be completely in 
 memory and found that adding pre-fetching increases the throughput by about 
 50%, from 26MB/s to 39MB/s.
 The idea is to perform the next() call in a background thread, and keep the result 
 ready. When the scanner's next() comes in, return the pre-computed result and 
 issue another background read.



[jira] [Created] (HBASE-7100) Allow multiple connections from HBaseClient to each remote endpoint

2012-11-05 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-7100:
--

 Summary: Allow multiple connections from HBaseClient to each 
remote endpoint
 Key: HBASE-7100
 URL: https://issues.apache.org/jira/browse/HBASE-7100
 Project: HBase
  Issue Type: Sub-task
  Components: Client
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan


Allowing multiple connections gives a *huge* boost while benchmarking 
performance. In a production setup, many nodes query a single regionserver, but 
one connection is not enough for a single HBase client to push a single 
regionserver to its limits.



[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-11-01 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488683#comment-13488683
 ] 

Karthik Ranganathan commented on HBASE-6925:


Where is the chunking (that JIRA had a lot of stuff to parse)? Right now, in 
89-fb, the client's nio send buffers are at 128K, and the input stream that 
reads from the nio buffer is only 8K. This change is on the server side. I 
would hypothesize that scans (which return a lot of data) will benefit from it.

 Change socket write size from 8K to 64K for HBaseServer
 ---

 Key: HBASE-6925
 URL: https://issues.apache.org/jira/browse/HBASE-6925
 Project: HBase
  Issue Type: Sub-task
  Components: Performance
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan
Priority: Critical
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6925.patch


 Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
 from 8K to 64K in HBaseServer. This seems to increase scan throughput.
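The constant caps how much of a large response is handed to a single socket write, so the direct buffer NIO allocates per write stays bounded. A sketch of that chunked-write pattern is below; it is simplified, and the real HBaseServer helper may differ in details.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

// Sketch: write a large buffer to a channel in NIO_BUFFER_LIMIT-sized slices.
// The JIRA's change is simply bumping this constant from 8K to 64K.
public class ChunkedWriter {
    static final int NIO_BUFFER_LIMIT = 64 * 1024; // was 8 * 1024

    static long channelWrite(WritableByteChannel ch, ByteBuffer buf)
            throws IOException {
        long written = 0;
        while (buf.hasRemaining()) {
            int chunk = Math.min(buf.remaining(), NIO_BUFFER_LIMIT);
            ByteBuffer slice = buf.duplicate();
            slice.limit(slice.position() + chunk);  // expose one chunk at a time
            int n = ch.write(slice);
            buf.position(buf.position() + n);       // advance past what was sent
            written += n;
            if (n < chunk) break;                   // socket buffer full; retry later
        }
        return written;
    }
}
```

A larger chunk means fewer write() syscalls per response, which is why scan-heavy workloads (large responses) would see the benefit.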



[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-11-01 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488687#comment-13488687
 ] 

Karthik Ranganathan commented on HBASE-6925:


Also, I was able to get the throughput of a single-threaded client scan of a 
block in the block cache to break 100MB/s (on whatever SKU I am using) - it 
started around 20MB/s. Will write a blog post about the various changes in 
detail if anyone is interested. I think there is scope to do even better. 
Prefetching is huge, of course - it helps even more when the block has to be 
read from disk.

 Change socket write size from 8K to 64K for HBaseServer
 ---

 Key: HBASE-6925
 URL: https://issues.apache.org/jira/browse/HBASE-6925
 Project: HBase
  Issue Type: Sub-task
  Components: Performance
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan
Priority: Critical
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6925.patch


 Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
 from 8K to 64K in HBaseServer. This seems to increase scan throughput.



[jira] [Commented] (HBASE-6874) Implement prefetching for scanners

2012-11-01 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488689#comment-13488689
 ] 

Karthik Ranganathan commented on HBASE-6874:


Actually did this analysis and enhancement for an online analytics use-case as 
well (and search indexing), and most of what you say maps one to one. The only 
difference, I guess, is that so far we are not relying heavily on server-side 
filtering, so we decided to punt on the prefetching=n case for now (we 
actually discussed this).

 Implement prefetching for scanners
 --

 Key: HBASE-6874
 URL: https://issues.apache.org/jira/browse/HBASE-6874
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 I did some quick experiments by scanning data that should be completely in 
 memory and found that adding pre-fetching increases the throughput by about 
 50% from 26MB/s to 39MB/s.
 The idea is to perform the next in a background thread, and keep the result 
 ready. When the scanner's next comes in, return the pre-computed result and 
 issue another background read.
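The idea in the description can be sketched as a thin wrapper that always keeps one result in flight: a toy model assuming a single-threaded scanner, where the class and method names are illustrative, not the 0.89-fb patch:

```java
import java.util.Iterator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PrefetchingScanner<T> implements AutoCloseable {
    private final Iterator<T> underlying;
    private final ExecutorService pool = Executors.newSingleThreadExecutor();
    private Future<T> pending;

    public PrefetchingScanner(Iterator<T> underlying) {
        this.underlying = underlying;
        this.pending = pool.submit(this::readOne);  // issue the first background read
    }

    // Runs on the background thread; null stands for "end of scan" here.
    private T readOne() {
        return underlying.hasNext() ? underlying.next() : null;
    }

    // Return the precomputed result and immediately kick off the next read.
    public T next() throws Exception {
        T result = pending.get();
        pending = pool.submit(this::readOne);
        return result;
    }

    @Override
    public void close() {
        pool.shutdownNow();
    }
}
```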



[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-11-01 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488991#comment-13488991
 ] 

Karthik Ranganathan commented on HBASE-6925:


No, I don't think that would matter; this is more about the socket transfer size 
into an underlying buffer.

 Change socket write size from 8K to 64K for HBaseServer
 ---

 Key: HBASE-6925
 URL: https://issues.apache.org/jira/browse/HBASE-6925
 Project: HBase
  Issue Type: Sub-task
  Components: Performance
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan
Priority: Critical
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6925.patch


 Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
 from 8K to 64K in HBaseServer. This seems to increase scan throughput.



[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-11-01 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13489042#comment-13489042
 ] 

Karthik Ranganathan commented on HBASE-6925:


Go for the commit Lars!

 Change socket write size from 8K to 64K for HBaseServer
 ---

 Key: HBASE-6925
 URL: https://issues.apache.org/jira/browse/HBASE-6925
 Project: HBase
  Issue Type: Sub-task
  Components: Performance
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan
Priority: Critical
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6925.patch


 Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
 from 8K to 64K in HBaseServer. This seems to increase scan throughput.



[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-10-31 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488459#comment-13488459
 ] 

Karthik Ranganathan commented on HBASE-6925:


Missed Stack's question - this complex change alone gave a 25% best case 
improvement in scan throughput.

 Change socket write size from 8K to 64K for HBaseServer
 ---

 Key: HBASE-6925
 URL: https://issues.apache.org/jira/browse/HBASE-6925
 Project: HBase
  Issue Type: Sub-task
  Components: Performance
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan
Priority: Critical
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6925.patch


 Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
 from 8K to 64K in HBaseServer. This seems to increase scan throughput.



[jira] [Commented] (HBASE-6874) Implement prefetching for scanners

2012-10-31 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488463#comment-13488463
 ] 

Karthik Ranganathan commented on HBASE-6874:


Thought about the N scanners - it's a complicated change: you would have to 
change the entire scan protocol. The next calls in scans are not 
numbered, and so you could go out of whack if prefetching N (and throw in 
exceptions). There is also the basic issue right now that scans do retries, 
which is wrong. Also, reasoning about it another way, if your in-memory scan 
throughput is > the time to read from disk, you're probably good. I found that 
there are other unrelated bottlenecks preventing this from being the case. Of 
course, if the filtering is very heavy then this will break down... you probably 
want to implement prefetching based on the number of filtered rows, which should not 
be too hard.

I have a patch I have tested with, but it's waiting on HBASE-6770 - that is 
going to refactor scans quite a bit. Will put a patch out once that is done.

 Implement prefetching for scanners
 --

 Key: HBASE-6874
 URL: https://issues.apache.org/jira/browse/HBASE-6874
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 I did some quick experiments by scanning data that should be completely in 
 memory and found that adding pre-fetching increases the throughput by about 
 50% from 26MB/s to 39MB/s.
 The idea is to perform the next in a background thread, and keep the result 
 ready. When the scanner's next comes in, return the pre-computed result and 
 issue another background read.



[jira] [Created] (HBASE-7068) Create a Get benchmark

2012-10-29 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-7068:
--

 Summary: Create a Get benchmark
 Key: HBASE-7068
 URL: https://issues.apache.org/jira/browse/HBASE-7068
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan






[jira] [Created] (HBASE-7067) HBase Get perf improvements

2012-10-29 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-7067:
--

 Summary: HBase Get perf improvements
 Key: HBASE-7067
 URL: https://issues.apache.org/jira/browse/HBASE-7067
 Project: HBase
  Issue Type: Umbrella
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan


Umbrella task for improving Get performance.



[jira] [Created] (HBASE-7026) Make metrics collection in StoreScanner.java more efficient

2012-10-22 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-7026:
--

 Summary: Make metrics collection in StoreScanner.java more 
efficient
 Key: HBASE-7026
 URL: https://issues.apache.org/jira/browse/HBASE-7026
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan


Per the benchmarks I ran, the following block of code seems to be inefficient:
StoreScanner.java:
public synchronized boolean next(List<KeyValue> outResult, int limit,
    String metric) throws IOException {
  // ...
  // update the counter
  if (addedResultsSize > 0 && metric != null) {
    HRegion.incrNumericMetric(this.metricNamePrefix + metric,
        addedResultsSize);
  }
  // ...

Removing this block increased throughput by 10%. We should move this to the 
outer layer.
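A sketch of the proposed restructuring, using hypothetical names: the inner scan path only computes the batch size, and the outer layer performs a single shared-counter update per batch rather than one deep inside every next() call:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class BatchedMetric {
    // Stand-in for the shared metric behind HRegion.incrNumericMetric.
    static final AtomicLong nextSizeMetric = new AtomicLong();

    // Inner scan layer: pure computation, no shared-state updates.
    static long batchSize(List<byte[]> cells) {
        long total = 0;
        for (byte[] cell : cells) {
            total += cell.length;
        }
        return total;
    }

    // Outer layer: one atomic update per batch instead of one per call.
    static void recordBatch(List<byte[]> cells) {
        nextSizeMetric.addAndGet(batchSize(cells));
    }
}
```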



[jira] [Created] (HBASE-7029) Result array serialization improvements

2012-10-22 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-7029:
--

 Summary: Result array serialization improvements
 Key: HBASE-7029
 URL: https://issues.apache.org/jira/browse/HBASE-7029
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan


The Result[] is very inefficiently serialized - there are 2 for loops over each 
result and we instantiate every object.

A better way is to make it a data block, and use delta block encoding to make 
it more efficient.
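A toy illustration of the direction suggested here, under the assumption of sorted row keys: encode all keys as one contiguous block with common-prefix deltas instead of serializing each Result object separately (names and wire format are made up for the sketch, not HBase's actual data block encoding):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.List;

public class PrefixDeltaBlock {
    // Serialize sorted keys as one block: a count, then for each key the
    // number of bytes shared with the previous key plus the remaining suffix.
    static byte[] encode(List<byte[]> sortedKeys) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(sortedKeys.size());
        byte[] prev = new byte[0];
        for (byte[] key : sortedKeys) {
            int common = 0;
            while (common < prev.length && common < key.length
                    && prev[common] == key[common]) {
                common++;
            }
            out.writeInt(common);              // shared prefix length
            out.writeInt(key.length - common); // suffix length
            out.write(key, common, key.length - common);
            prev = key;
        }
        return buf.toByteArray();
    }
}
```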



[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-10-19 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13480344#comment-13480344
 ] 

Karthik Ranganathan commented on HBASE-6925:


Yes, this is committed into 0.89.fb already; it's a super-trivial change but 
does improve perf quite a bit.

 Change socket write size from 8K to 64K for HBaseServer
 ---

 Key: HBASE-6925
 URL: https://issues.apache.org/jira/browse/HBASE-6925
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan
 Attachments: HBASE-6925.patch


 Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
 from 8K to 64K in HBaseServer. This seems to increase scan throughput.



[jira] [Commented] (HBASE-6923) Create scanner benchmark

2012-10-19 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13480347#comment-13480347
 ] 

Karthik Ranganathan commented on HBASE-6923:


I am still in the process of improving scan performance, so this is still work in 
progress (though I have made an initial commit). I am planning more 
modifications to this.

 Create scanner benchmark
 

 Key: HBASE-6923
 URL: https://issues.apache.org/jira/browse/HBASE-6923
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan
 Attachments: TestStorePerformance.java


 Create a simple program to benchmark performance/throughput of scanners, and 
 print some results at the end.



[jira] [Commented] (HBASE-5783) Faster HBase bulk loader

2012-10-16 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477137#comment-13477137
 ] 

Karthik Ranganathan commented on HBASE-5783:


No, we track only the last (highest) one per region. Also, in the actual 
implementation, we did it with just timestamps from the RS. So, after doing all 
the puts, the loader gets the time on the RS (t1). The server tracks the start 
time of the last successfully completed flush (t2). Querying that and making 
sure t2 > t1 is enough. Of course - if the region has moved gracefully, that's 
considered a success too, as an optimization.

We used the term MR Bulk Loader simply to say that the load of the data 
should be repeatable in case of failure (as opposed to an online use case).

 Faster HBase bulk loader
 

 Key: HBASE-5783
 URL: https://issues.apache.org/jira/browse/HBASE-5783
 Project: HBase
  Issue Type: New Feature
  Components: Client, IPC/RPC, Performance, regionserver
Reporter: Karthik Ranganathan
Assignee: Amitanand Aiyer

 We can get a 3x to 4x gain based on a prototype demonstrating this approach 
 in effect (hackily) over the MR bulk loader for very large data sets by doing 
 the following:
 1. Do direct multi-puts from HBase client using GZIP compressed RPC's
 2. Turn off WAL (we will ensure no data loss in another way)
 3. For each bulk load client, we need to:
 3.1 do a put
 3.2 get back a tracking cookie (memstoreTs or HLogSequenceId) per put
 3.3 be able to ask the RS if the tracking cookie has been flushed to disk
 4. For each client, we can succeed it if the tracking cookie for the last put 
 it did (for every RS) makes it to disk. Otherwise the map task fails and is 
 retried.
 5. If the last put did not make it to disk for a timeout (say a second or so) 
 we issue a manual flush.
 Enhancements:
 - Increase the memstore size so that we flush larger files
 - Decrease the compaction ratios (say increase the number of files to compact)
 Quick background:
 The bottlenecks in the multiput approach are that the data is transferred 
 *uncompressed* twice over the top-of-rack: once from the client to the RS (on 
 the multi put call) and again because of WAL (HDFS replication). We reduced 
 the former with RPC compression and eliminated the latter above while still 
 guaranteeing that data wont be lost.
 This is better than the MR bulk loader at a high level because we don't need 
 to merge-sort all the files for a given region and then make it an HFile - 
 that's the equivalent of bulk loading AND major-compacting in one shot. Also 
 there is much more disk involved in the MR method (sort/spill).
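Steps 3.1-3.3 and 4-5 above can be modeled with a toy in-memory region server; every name here is illustrative, not an HBase API:

```java
public class CookieProtocol {
    private long nextCookie = 0;    // stand-in for memstoreTs / HLog sequence id
    private long flushedUpTo = -1;  // highest cookie persisted by a flush

    // 3.1/3.2: a put returns its tracking cookie (WAL is off, nothing durable yet).
    long put(byte[] row) {
        return nextCookie++;
    }

    // A flush persists everything the memstore holds right now.
    void flush() {
        flushedUpTo = nextCookie - 1;
    }

    // 3.3: ask whether a given cookie has been flushed to disk.
    boolean isFlushed(long cookie) {
        return cookie <= flushedUpTo;
    }

    // 4/5: succeed once the last cookie is durable, issuing a manual flush if needed.
    boolean awaitDurable(long lastCookie) {
        if (!isFlushed(lastCookie)) {
            flush();  // the "manual flush" after the timeout in step 5
        }
        return isFlushed(lastCookie);
    }
}
```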



[jira] [Commented] (HBASE-6980) Parallel Flushing Of Memstores

2012-10-16 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477194#comment-13477194
 ] 

Karthik Ranganathan commented on HBASE-6980:


@ramakrishna - this should not be necessary for ensuring no data loss, right? 
Once we have a snapshot memstore, we should automatically know the max seq id 
to which it has data - that would never change.

1. From what I remember of the code (when I was looking into something 
unrelated), we track the *min* seq id from the current memstore, instead of the 
max seq id from the snapshot memstore, to put into the HLog when it is rolled 
after a flush. That is why this synchronization becomes necessary - if we store 
the max seq id along with the memstore that is flushed, we should be able to 
eliminate the locks.

2. Also, it's arguable whether we need the absolutely correct max-seq-id 
flushed. In a very small % of cases, we would end up rolling logs a bit slower. 
As long as we are conservative with updating the max seq id in the HLog we 
should be good, right?
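Point 1 can be sketched as a small bookkeeping model: freeze the snapshot's max seq id at snapshot time, so the WAL truncation point never needs to consult (or lock) the live memstore. Names are illustrative:

```java
public class SnapshotSeqId {
    private long liveMaxSeqId = -1;      // advances with every appended edit
    private long snapshotMaxSeqId = -1;  // frozen when the memstore is snapshotted

    void append(long seqId) {
        liveMaxSeqId = seqId;
    }

    // Capture the max seq id of the snapshot memstore; it never changes afterwards.
    void snapshotForFlush() {
        snapshotMaxSeqId = liveMaxSeqId;
    }

    // After a successful flush, WAL entries up to this id are obsolete,
    // with no lock on the live memstore required.
    long safeWalTruncationPoint() {
        return snapshotMaxSeqId;
    }
}
```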

 Parallel Flushing Of Memstores
 --

 Key: HBASE-6980
 URL: https://issues.apache.org/jira/browse/HBASE-6980
 Project: HBase
  Issue Type: New Feature
Reporter: Kannan Muthukkaruppan
Assignee: Kannan Muthukkaruppan

 For write dominated workloads, single threaded memstore flushing is an 
 unnecessary bottleneck. With a single flusher thread, we are basically not 
 setup to take advantage of the aggregate throughput that multi-disk nodes 
 provide.
 * For puts with WAL enabled, the bottleneck is more likely the single WAL 
 per region server. So this particular fix may not buy as much unless we 
 unlock that bottleneck with multiple commit logs per region server. (Topic 
 for a separate JIRA-- HBASE-6981).
 * But for puts with WAL disabled (e.g., when using HBASE-5783 style fast bulk 
 imports), we should be able to support much better ingest rates with parallel 
 flushing of memstores.



[jira] [Commented] (HBASE-6619) Do no unregister and re-register interest ops in RPC

2012-10-04 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13469802#comment-13469802
 ] 

Karthik Ranganathan commented on HBASE-6619:


No, this is fine; it's already committed into 89-fb.

 Do no unregister and re-register interest ops in RPC
 

 Key: HBASE-6619
 URL: https://issues.apache.org/jira/browse/HBASE-6619
 Project: HBase
  Issue Type: Bug
  Components: IPC/RPC, Performance
Reporter: Karthik Ranganathan
Assignee: Michal Gregorczyk
Priority: Critical
 Attachments: 
 0001-jira-HBASE-6619-89-fb-Do-no-unregister-and-re-regist.patch


 While investigating perf of HBase, Michal noticed that we could cut about 
 5-40% (depending on number of threads) from the total get time in the RPC on 
 the server side if we eliminated re-registering for interest ops.



[jira] [Created] (HBASE-6923) Create scanner benchmark

2012-10-02 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-6923:
--

 Summary: Create scanner benchmark
 Key: HBASE-6923
 URL: https://issues.apache.org/jira/browse/HBASE-6923
 Project: HBase
  Issue Type: Improvement
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan


Create a simple program to benchmark performance/throughput of scanners, and 
print some results at the end.



[jira] [Created] (HBASE-6922) HBase scanner performance improvements

2012-10-02 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-6922:
--

 Summary: HBase scanner performance improvements
 Key: HBASE-6922
 URL: https://issues.apache.org/jira/browse/HBASE-6922
 Project: HBase
  Issue Type: Umbrella
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan


Umbrella task for improving throughput in HBase scanners.



[jira] [Updated] (HBASE-6923) Create scanner benchmark

2012-10-02 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan updated HBASE-6923:
---

Issue Type: Sub-task  (was: Improvement)
Parent: HBASE-6922

 Create scanner benchmark
 

 Key: HBASE-6923
 URL: https://issues.apache.org/jira/browse/HBASE-6923
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 Create a simple program to benchmark performance/throughput of scanners, and 
 print some results at the end.



[jira] [Updated] (HBASE-6874) Implement prefetching for scanners

2012-10-02 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan updated HBASE-6874:
---

Issue Type: Sub-task  (was: Improvement)
Parent: HBASE-6922

 Implement prefetching for scanners
 --

 Key: HBASE-6874
 URL: https://issues.apache.org/jira/browse/HBASE-6874
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 I did some quick experiments by scanning data that should be completely in 
 memory and found that adding pre-fetching increases the throughput by about 
 50% from 26MB/s to 39MB/s.
 The idea is to perform the next in a background thread, and keep the result 
 ready. When the scanner's next comes in, return the pre-computed result and 
 issue another background read.



[jira] [Updated] (HBASE-6770) Allow scanner setCaching to specify size instead of number of rows

2012-10-02 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan updated HBASE-6770:
---

Issue Type: Sub-task  (was: Bug)
Parent: HBASE-6922

 Allow scanner setCaching to specify size instead of number of rows
 --

 Key: HBASE-6770
 URL: https://issues.apache.org/jira/browse/HBASE-6770
 Project: HBase
  Issue Type: Sub-task
  Components: Client, regionserver
Reporter: Karthik Ranganathan
Assignee: Michal Gregorczyk

 Currently, we have the following api's to customize the behavior of scans:
 setCaching() - how many rows to cache on client to speed up scans
 setBatch() - max columns to return per row, to prevent a very large 
 response.
 Ideally, we should be able to specify a memory buffer size because:
 1. that would take care of both of these use cases.
 2. it does not need any knowledge of the size of the rows or cells, as the 
 final thing we are worried about is the available memory.
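A sketch of what a size-based limit could look like on the server side, assuming a hypothetical byte-budget parameter (none of this is HBase API):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class SizeBoundedBatcher {
    // Accumulate rows until the byte budget is used up, rather than counting
    // a fixed number of rows; at least one row is always returned so a single
    // oversized row cannot stall the scan.
    static List<byte[]> nextBatch(Iterator<byte[]> rows, long maxBytes) {
        List<byte[]> batch = new ArrayList<>();
        long used = 0;
        while (rows.hasNext() && used < maxBytes) {
            byte[] row = rows.next();
            batch.add(row);
            used += row.length;
        }
        return batch;
    }
}
```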



[jira] [Updated] (HBASE-6066) some low hanging read path improvement ideas

2012-10-02 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan updated HBASE-6066:
---

Issue Type: Sub-task  (was: Improvement)
Parent: HBASE-6922

 some low hanging read path improvement ideas 
 -

 Key: HBASE-6066
 URL: https://issues.apache.org/jira/browse/HBASE-6066
 Project: HBase
  Issue Type: Sub-task
  Components: Performance
Reporter: Kannan Muthukkaruppan
Assignee: Michal Gregorczyk
Priority: Critical
  Labels: noob
 Fix For: 0.96.0

 Attachments: 
 0001-jira-HBASE-6066-89-fb-Some-read-performance-improvem.patch, 
 metric-stringbuilder-fix.patch


 I was running some single threaded scan performance tests for a table with 
 small sized rows that is fully cached. Some observations...
 We seem to be doing several wasteful iterations over and/or building of 
 temporary lists.
 1) One such is the following code in HRegionServer.next():
 {code}
boolean moreRows = s.next(values, HRegion.METRIC_NEXTSIZE);
if (!values.isEmpty()) {
    for (KeyValue kv : values) {   <-- wasteful in most cases
currentScanResultSize += kv.heapSize();
}
results.add(new Result(values));
 {code}
 By default the maxScannerResultSize is Long.MAX_VALUE. In those cases,
 we can avoid the unnecessary iteration to compute currentScanResultSize.
 2) An example of a wasteful temporary array, is results in
 RegionScanner.next().
 {code}
   results.clear();
   boolean returnResult = nextInternal(limit, metric);
   outResults.addAll(results);
 {code}
 results then gets copied over to outResults via an addAll(). Not sure why we 
 can not directly collect the results in outResults.
 3) Another almost similar example of a wasteful array is results in 
 StoreScanner.next(), which eventually also copies its results into 
 outResults.
 4) Reduce overhead of size metric maintained in StoreScanner.next().
 {code}
   if (metric != null) {
  HRegion.incrNumericMetric(this.metricNamePrefix + metric,
copyKv.getLength());
   }
   results.add(copyKv);
 {code}
 A single call to next() might fetch a lot of KVs. We can first add up the 
 size of those KVs in a local variable and then in a finally clause increment 
 the metric one shot, rather than updating AtomicLongs for each KV.
 5) RegionScanner.next() calls a helper RegionScanner.next() on the same 
 object. Both are synchronized methods. Synchronized methods calling nested 
 synchronized methods on the same object are probably adding some small 
 overhead. The inner next() calls isFilterDone() which is a also a 
 synchronized method. We should factor the code to avoid these nested 
 synchronized methods.
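Fix (1) above amounts to guarding the heapSize() walk; a minimal stand-in with illustrative types, not the HBase classes:

```java
import java.util.List;

public class ScanSizeCheck {
    // Only walk the KVs to sum their sizes when a result-size cap is actually
    // set; with the default Long.MAX_VALUE cap the sum is never consulted.
    static long accumulate(long maxScannerResultSize, long current, List<byte[]> kvs) {
        if (maxScannerResultSize == Long.MAX_VALUE) {
            return current;
        }
        for (byte[] kv : kvs) {
            current += kv.length;
        }
        return current;
    }
}
```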



[jira] [Commented] (HBASE-6923) Create scanner benchmark

2012-10-02 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13468039#comment-13468039
 ] 

Karthik Ranganathan commented on HBASE-6923:


Hey Todd, nice! I too have written a benchmark with interesting results. Would 
be interesting to compare :)

 Create scanner benchmark
 

 Key: HBASE-6923
 URL: https://issues.apache.org/jira/browse/HBASE-6923
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan
 Attachments: TestStorePerformance.java


 Create a simple program to benchmark performance/throughput of scanners, and 
 print some results at the end.



[jira] [Created] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-10-02 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-6925:
--

 Summary: Change socket write size from 8K to 64K for HBaseServer
 Key: HBASE-6925
 URL: https://issues.apache.org/jira/browse/HBASE-6925
 Project: HBase
  Issue Type: Improvement
Reporter: Karthik Ranganathan


Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
from 8K to 64K in HBaseServer. This seems to increase scan throughput.



[jira] [Updated] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-10-02 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan updated HBASE-6925:
---

Issue Type: Sub-task  (was: Improvement)
Parent: HBASE-6922

 Change socket write size from 8K to 64K for HBaseServer
 ---

 Key: HBASE-6925
 URL: https://issues.apache.org/jira/browse/HBASE-6925
 Project: HBase
  Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
 from 8K to 64K in HBaseServer. This seems to increase scan throughput.



[jira] [Assigned] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-10-02 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan reassigned HBASE-6925:
--

Assignee: Karthik Ranganathan

 Change socket write size from 8K to 64K for HBaseServer
 ---

 Key: HBASE-6925
 URL: https://issues.apache.org/jira/browse/HBASE-6925
 Project: HBase
  Issue Type: Improvement
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
 from 8K to 64K in HBaseServer. This seems to increase scan throughput.



[jira] [Assigned] (HBASE-6770) Allow scanner setCaching to specify size instead of number of rows

2012-10-02 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan reassigned HBASE-6770:
--

Assignee: Chen Jin  (was: Michal Gregorczyk)

 Allow scanner setCaching to specify size instead of number of rows
 --

 Key: HBASE-6770
 URL: https://issues.apache.org/jira/browse/HBASE-6770
 Project: HBase
  Issue Type: Sub-task
  Components: Client, regionserver
Reporter: Karthik Ranganathan
Assignee: Chen Jin

 Currently, we have the following api's to customize the behavior of scans:
 setCaching() - how many rows to cache on client to speed up scans
 setBatch() - max columns to return per row, to prevent a very large 
 response.
 Ideally, we should be able to specify a memory buffer size because:
 1. that would take care of both of these use cases.
 2. it does not need any knowledge of the size of the rows or cells, as the 
 final thing we are worried about is the available memory.



[jira] [Created] (HBASE-6874) Implement prefetching for scanners

2012-09-24 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-6874:
--

 Summary: Implement prefetching for scanners
 Key: HBASE-6874
 URL: https://issues.apache.org/jira/browse/HBASE-6874
 Project: HBase
  Issue Type: Improvement
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan


I did some quick experiments by scanning data that should be completely in 
memory and found that adding pre-fetching increases the throughput by about 50% 
from 26MB/s to 39MB/s.

The idea is to perform the next in a background thread, and keep the result 
ready. When the scanner's next comes in, return the pre-computed result and 
issue another background read.



[jira] [Commented] (HBASE-6770) Allow scanner setCaching to specify size instead of number of rows

2012-09-18 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13457823#comment-13457823
 ] 

Karthik Ranganathan commented on HBASE-6770:


Yes, a good estimate is the intention. Across different use cases (or sometimes 
different column families in the same table), the KV sizes are so different that 
it gets hard to come up with good estimates that would not OOM the client in all 
cases.

 Allow scanner setCaching to specify size instead of number of rows
 --

 Key: HBASE-6770
 URL: https://issues.apache.org/jira/browse/HBASE-6770
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Reporter: Karthik Ranganathan
Assignee: Michal Gregorczyk

 Currently, we have the following APIs to customize the behavior of scans:
 setCaching() - how many rows to cache on the client to speed up scans
 setBatch() - max columns to return per row, to prevent a very large 
 response.
 Ideally, we should be able to specify a memory buffer size because:
 1. that would take care of both of these use cases.
 2. it does not need any knowledge of the size of the rows or cells, as the 
 final thing we are worried about is the available memory.



[jira] [Updated] (HBASE-5783) Faster HBase bulk loader

2012-09-18 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan updated HBASE-5783:
---

Assignee: Amitanand Aiyer  (was: Nicolas Spiegelberg)

 Faster HBase bulk loader
 

 Key: HBASE-5783
 URL: https://issues.apache.org/jira/browse/HBASE-5783
 Project: HBase
  Issue Type: New Feature
  Components: client, ipc, performance, regionserver
Reporter: Karthik Ranganathan
Assignee: Amitanand Aiyer

 We can get a 3x to 4x gain over the MR bulk loader for very large data sets, 
 based on a prototype that (hackily) demonstrates this approach, by doing 
 the following:
 1. Do direct multi-puts from the HBase client using GZIP-compressed RPCs
 2. Turn off WAL (we will ensure no data loss in another way)
 3. For each bulk load client, we need to:
 3.1 do a put
 3.2 get back a tracking cookie (memstoreTs or HLogSequenceId) per put
 3.3 be able to ask the RS if the tracking cookie has been flushed to disk
 4. For each client, we can succeed it if the tracking cookie for the last put 
 it did (for every RS) makes it to disk. Otherwise the map task fails and is 
 retried.
 5. If the last put did not make it to disk for a timeout (say a second or so) 
 we issue a manual flush.
 Enhancements:
 - Increase the memstore size so that we flush larger files
 - Decrease the compaction ratios (say increase the number of files to compact)
 Quick background:
 The bottlenecks in the multiput approach are that the data is transferred 
 *uncompressed* twice over the top-of-rack: once from the client to the RS (on 
 the multi put call) and again because of WAL (HDFS replication). We reduced 
 the former with RPC compression and eliminated the latter above while still 
 guaranteeing that data won't be lost.
 This is better than the MR bulk loader at a high level because we don't need 
 to merge-sort all the files for a given region and then turn them into an HFile - 
 that's the equivalent of bulk loading AND major-compacting in one shot. Also, 
 there is much more disk I/O involved in the MR method (sort/spill).
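Steps 3-5 can be sketched as a monotonically increasing cookie plus a flushed-up-to watermark; the class below illustrates the protocol and is not code from the patch:

```java
// Illustrative sketch of the tracking-cookie protocol in steps 3-5:
// each put hands back an increasing cookie (think memstoreTs or HLog
// sequence id), and a bulk load client succeeds once the cookie from
// its last put is known to be flushed to disk.
class FlushTracker {
    private long lastIssued = 0;   // cookie handed out on the latest put
    private long flushedUpTo = 0;  // highest cookie known to be on disk

    synchronized long recordPut() {
        return ++lastIssued;       // step 3.2: return a tracking cookie
    }

    synchronized void onFlush(long upTo) {
        flushedUpTo = Math.max(flushedUpTo, upTo);
    }

    // Step 4: durable once the client's last cookie is flushed;
    // a caller that times out here would issue a manual flush (step 5).
    synchronized boolean isDurable(long cookie) {
        return cookie <= flushedUpTo;
    }
}
```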



[jira] [Commented] (HBASE-6770) Allow scanner setCaching to specify size instead of number of rows

2012-09-17 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13457179#comment-13457179
 ] 

Karthik Ranganathan commented on HBASE-6770:


Agreed. If that's the only issue, then passing a hint makes it easier to use - 
do something like setPartialRowScanning(true) if we want to respect that. But 
in any case, I am not suggesting removing the existing API, just adding the new 
ones.

 Allow scanner setCaching to specify size instead of number of rows
 --

 Key: HBASE-6770
 URL: https://issues.apache.org/jira/browse/HBASE-6770
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Reporter: Karthik Ranganathan

 Currently, we have the following APIs to customize the behavior of scans:
 setCaching() - how many rows to cache on the client to speed up scans
 setBatch() - max columns to return per row, to prevent a very large 
 response.
 Ideally, we should be able to specify a memory buffer size because:
 1. that would take care of both of these use cases.
 2. it does not need any knowledge of the size of the rows or cells, as the 
 final thing we are worried about is the available memory.



[jira] [Updated] (HBASE-6770) Allow scanner setCaching to specify size instead of number of rows

2012-09-17 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan updated HBASE-6770:
---

Assignee: Michal Gregorczyk

 Allow scanner setCaching to specify size instead of number of rows
 --

 Key: HBASE-6770
 URL: https://issues.apache.org/jira/browse/HBASE-6770
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Reporter: Karthik Ranganathan
Assignee: Michal Gregorczyk

 Currently, we have the following APIs to customize the behavior of scans:
 setCaching() - how many rows to cache on the client to speed up scans
 setBatch() - max columns to return per row, to prevent a very large 
 response.
 Ideally, we should be able to specify a memory buffer size because:
 1. that would take care of both of these use cases.
 2. it does not need any knowledge of the size of the rows or cells, as the 
 final thing we are worried about is the available memory.



[jira] [Created] (HBASE-6770) Allow scanner setCaching to specify size instead of number of rows

2012-09-12 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-6770:
--

 Summary: Allow scanner setCaching to specify size instead of 
number of rows
 Key: HBASE-6770
 URL: https://issues.apache.org/jira/browse/HBASE-6770
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Reporter: Karthik Ranganathan


Currently, we have the following APIs to customize the behavior of scans:
setCaching() - how many rows to cache on the client to speed up scans
setBatch() - max columns to return per row, to prevent a very large 
response.

Ideally, we should be able to specify a memory buffer size because:
1. that would take care of both of these use cases.
2. it does not need any knowledge of the size of the rows or cells, as the 
final thing we are worried about is the available memory.



[jira] [Created] (HBASE-6619) Do not unregister and re-register interest ops in RPC

2012-08-20 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-6619:
--

 Summary: Do not unregister and re-register interest ops in RPC
 Key: HBASE-6619
 URL: https://issues.apache.org/jira/browse/HBASE-6619
 Project: HBase
  Issue Type: Bug
  Components: ipc
Reporter: Karthik Ranganathan
Assignee: Michal Gregorczyk


While investigating HBase performance, Michal noticed that we could cut about 
5-40% (depending on the number of threads) from the total get time in the 
server-side RPC if we eliminated re-registering for interest ops.





[jira] [Created] (HBASE-6583) Enhance Hbase load test tool to automatically create cf's if not present

2012-08-14 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-6583:
--

 Summary: Enhance Hbase load test tool to automatically create cf's 
if not present
 Key: HBASE-6583
 URL: https://issues.apache.org/jira/browse/HBASE-6583
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan


The load test tool currently disables the table and applies any changes to the 
cf descriptor, if any, but does not create the cf if it is not present.





[jira] [Updated] (HBASE-6583) Enhance Hbase load test tool to automatically create cf's if not present

2012-08-14 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan updated HBASE-6583:
---

Assignee: (was: Karthik Ranganathan)

 Enhance Hbase load test tool to automatically create cf's if not present
 

 Key: HBASE-6583
 URL: https://issues.apache.org/jira/browse/HBASE-6583
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: Karthik Ranganathan

 The load test tool currently disables the table and applies any changes to 
 the cf descriptor, if any, but does not create the cf if it is not present.





[jira] [Created] (HBASE-6578) Make HDFS block size configurable for HBase WAL

2012-08-13 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-6578:
--

 Summary: Make HDFS block size configurable for HBase WAL
 Key: HBASE-6578
 URL: https://issues.apache.org/jira/browse/HBASE-6578
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Karthik Ranganathan


Right now, because sync-on-block-close is enabled, HLog causes the disk to 
stall out on large writes (especially when we cross a block boundary).

We currently use 256MB blocks. The idea is that if we use smaller block sizes, 
we should be able to spray the data across more disks (because of round-robin 
scheduling), and this would cause more uniform disk usage.
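Assuming the WAL block size ends up exposed as a regionserver property (the property name below is an assumption for illustration, not a confirmed configuration key), the override would be a standard hbase-site.xml entry:

```xml
<!-- hbase-site.xml: property name assumed for illustration -->
<property>
  <name>hbase.regionserver.hlog.blocksize</name>
  <!-- 64MB instead of the 256MB mentioned above, to spread WAL
       writes across more disks via round-robin block placement -->
  <value>67108864</value>
</property>
```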





[jira] [Assigned] (HBASE-6578) Make HDFS block size configurable for HBase WAL

2012-08-13 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan reassigned HBASE-6578:
--

Assignee: Li Pi

 Make HDFS block size configurable for HBase WAL
 ---

 Key: HBASE-6578
 URL: https://issues.apache.org/jira/browse/HBASE-6578
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Karthik Ranganathan
Assignee: Li Pi

 Right now, because sync-on-block-close is enabled, HLog causes the disk to 
 stall out on large writes (especially when we cross a block boundary).
 We currently use 256MB blocks. The idea is that if we use smaller block 
 sizes, we should be able to spray the data across more disks (because of 
 round-robin scheduling), and this would cause more uniform disk usage.





[jira] [Created] (HBASE-6486) Enhance load test to print throughput measurements

2012-07-31 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-6486:
--

 Summary: Enhance load test to print throughput measurements
 Key: HBASE-6486
 URL: https://issues.apache.org/jira/browse/HBASE-6486
 Project: HBase
  Issue Type: Bug
Reporter: Karthik Ranganathan
Assignee: Aurick Qiao


The idea is to know how many MB/sec of throughput we are able to get by writing 
into HBase using a simple tool.





[jira] [Created] (HBASE-6423) Writes should not block reads on blocking updates to memstores

2012-07-18 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-6423:
--

 Summary: Writes should not block reads on blocking updates to 
memstores
 Key: HBASE-6423
 URL: https://issues.apache.org/jira/browse/HBASE-6423
 Project: HBase
  Issue Type: Bug
Reporter: Karthik Ranganathan
Assignee: Amitanand Aiyer


We have a big data use case where we turn off WAL and have a ton of reads and 
writes. We found that:

1. flushing a memstore takes a while (GZIP compression)
2. incoming writes cause the new memstore to grow in an unbounded fashion
3. this triggers blocking memstore updates
4. in turn, this causes all the RPC handler threads to block on writes to that 
memstore
5. we are not able to read during this time as RPC handlers are blocked

At a higher level, we should not hold up the RPC threads while blocking 
updates, and we should build in some sort of rate control.
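One way to avoid parking the RPC handler threads is a non-blocking admission check that fails fast with a retryable signal instead of blocking; the sketch below is illustrative only, not HBase's internal flow:

```java
import java.util.concurrent.Semaphore;

// Illustrative sketch of the "don't block RPC handlers" idea above:
// instead of blocking the handler thread while the memstore is over
// its limit, fail fast so the handler stays free to serve reads and
// the client retries the write later.
class MemstoreAdmission {
    private final Semaphore budget;

    MemstoreAdmission(int maxPendingWrites) {
        this.budget = new Semaphore(maxPendingWrites);
    }

    // Returns false (caller should retry later) instead of blocking.
    boolean tryAdmitWrite() {
        return budget.tryAcquire();
    }

    // Called when a write is flushed/applied, freeing budget.
    void writeFinished() {
        budget.release();
    }
}
```

Rejected writes surface as a retryable error to the client, which gives a crude form of the rate control mentioned above.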





[jira] [Assigned] (HBASE-6066) some low hanging read path improvement ideas

2012-07-11 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan reassigned HBASE-6066:
--

Assignee: Michal Gregorczyk  (was: Aurick Qiao)

 some low hanging read path improvement ideas 
 -

 Key: HBASE-6066
 URL: https://issues.apache.org/jira/browse/HBASE-6066
 Project: HBase
  Issue Type: Improvement
Reporter: Kannan Muthukkaruppan
Assignee: Michal Gregorczyk
Priority: Critical
  Labels: noob
 Attachments: metric-stringbuilder-fix.patch


 I was running some single threaded scan performance tests for a table with 
 small sized rows that is fully cached. Some observations...
 We seem to be doing several wasteful iterations over and/or building of 
 temporary lists.
 1) One such is the following code in HRegionServer.next():
 {code}
boolean moreRows = s.next(values, HRegion.METRIC_NEXTSIZE);
if (!values.isEmpty()) {
    for (KeyValue kv : values) {  <-- wasteful in most cases
currentScanResultSize += kv.heapSize();
}
results.add(new Result(values));
 {code}
 By default the maxScannerResultSize is Long.MAX_VALUE. In those cases,
 we can avoid the unnecessary iteration to compute currentScanResultSize.
 2) An example of a wasteful temporary array, is results in
 RegionScanner.next().
 {code}
   results.clear();
   boolean returnResult = nextInternal(limit, metric);
   outResults.addAll(results);
 {code}
 results then gets copied over to outResults via an addAll(). Not sure why we 
 cannot directly collect the results in outResults.
 3) Another almost identical example of a wasteful array is results in 
 StoreScanner.next(), which eventually also copies its results into 
 outResults.
 4) Reduce overhead of size metric maintained in StoreScanner.next().
 {code}
   if (metric != null) {
  HRegion.incrNumericMetric(this.metricNamePrefix + metric,
copyKv.getLength());
   }
   results.add(copyKv);
 {code}
 A single call to next() might fetch a lot of KVs. We can first add up the 
 size of those KVs in a local variable and then, in a finally clause, increment 
 the metric in one shot, rather than updating AtomicLongs for each KV.
 5) RegionScanner.next() calls a helper RegionScanner.next() on the same 
 object. Both are synchronized methods. Synchronized methods calling nested 
 synchronized methods on the same object are probably adding some small 
 overhead. The inner next() calls isFilterDone() which is a also a 
 synchronized method. We should factor the code to avoid these nested 
 synchronized methods.
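Suggestion 4 above - accumulate locally, update the shared counter once - can be sketched like this (the types are simplified stand-ins for KeyValue and the HRegion metric):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch of suggestion 4: sum the KV sizes in a local
// variable and bump the shared AtomicLong once per next() call,
// instead of once per KeyValue.
class BatchedScanMetric {
    static final AtomicLong scanBytes = new AtomicLong();

    static long addBatch(List<byte[]> kvs) {
        long batchSize = 0;
        try {
            for (byte[] kv : kvs) {
                batchSize += kv.length;  // cheap local accumulation
            }
        } finally {
            // one shared-counter update per batch, as proposed above
            scanBytes.addAndGet(batchSize);
        }
        return batchSize;
    }
}
```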





[jira] [Created] (HBASE-6360) Thrift proxy does not emit runtime metrics

2012-07-09 Thread Karthik Ranganathan (JIRA)
Karthik Ranganathan created HBASE-6360:
--

 Summary: Thrift proxy does not emit runtime metrics
 Key: HBASE-6360
 URL: https://issues.apache.org/jira/browse/HBASE-6360
 Project: HBase
  Issue Type: Bug
  Components: thrift
Reporter: Karthik Ranganathan
Assignee: Michal Gregorczyk


Open jconsole against a Thrift proxy, and you will not find the runtime stats 
that it should be exporting.





[jira] [Commented] (HBASE-5509) MR based copier for copying HFiles (trunk version)

2012-06-19 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13397163#comment-13397163
 ] 

Karthik Ranganathan commented on HBASE-5509:


I know :) but I don't get the reason, though. Going to put in a couple more 
comments, but if it's a no-go - then oh well.

 MR based copier for copying HFiles (trunk version)
 --

 Key: HBASE-5509
 URL: https://issues.apache.org/jira/browse/HBASE-5509
 Project: HBase
  Issue Type: Sub-task
  Components: documentation, regionserver
Reporter: Karthik Ranganathan
Assignee: Lars Hofhansl
 Attachments: 5509-v2.txt, 5509.txt


 This copier is a modification of the distcp tool in HDFS. It does the 
 following:
 1. List out all the regions in the HBase cluster for the required table
 2. Write the above out to a file
 3. Each mapper 
3.1 lists all the HFiles for a given region by querying the regionserver
3.2 copies all the HFiles
3.3 outputs success if the copy succeeded, failure otherwise. Failed 
 regions are retried in another loop
 4. Mappers are placed on nodes which have maximum locality for a given region 
 to speed up copying





[jira] [Commented] (HBASE-5509) MR based copier for copying HFiles (trunk version)

2012-06-18 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13396510#comment-13396510
 ] 

Karthik Ranganathan commented on HBASE-5509:


@Lars - I ripped out some code which used hardlinking - we have implemented it 
internally. I believe we are planning on open-sourcing this; otherwise you'd 
have to wait for native hardlinks. The current copy approach still works, 
though, for a few tens of TBs.

 MR based copier for copying HFiles (trunk version)
 --

 Key: HBASE-5509
 URL: https://issues.apache.org/jira/browse/HBASE-5509
 Project: HBase
  Issue Type: Sub-task
  Components: documentation, regionserver
Reporter: Karthik Ranganathan
Assignee: Lars Hofhansl
 Attachments: 5509-v2.txt, 5509.txt


 This copier is a modification of the distcp tool in HDFS. It does the 
 following:
 1. List out all the regions in the HBase cluster for the required table
 2. Write the above out to a file
 3. Each mapper 
3.1 lists all the HFiles for a given region by querying the regionserver
3.2 copies all the HFiles
3.3 outputs success if the copy succeeded, failure otherwise. Failed 
 regions are retried in another loop
 4. Mappers are placed on nodes which have maximum locality for a given region 
 to speed up copying





[jira] [Resolved] (HBASE-4667) Importer for exported tables

2012-05-31 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan resolved HBASE-4667.


Resolution: Duplicate
  Assignee: Karthik Ranganathan

This is already covered by HBASE-5509 (trunk version) and HBASE-4663 (89-fb 
version)

 Importer for exported tables
 

 Key: HBASE-4667
 URL: https://issues.apache.org/jira/browse/HBASE-4667
 Project: HBase
  Issue Type: Sub-task
  Components: documentation, regionserver
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 Once HBase tables are backed up to a well known location, we need to be able 
 to import them. A few flavors need to be supported here:
 1. A running cluster, or a cluster that is not up and running
 2. The same table name, or a different one





[jira] [Commented] (HBASE-4655) Document architecture of backups

2012-05-31 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286809#comment-13286809
 ] 

Karthik Ranganathan commented on HBASE-4655:


I think we should add this doc to the HBase book. The code parts of this HBase 
backups feature are already done. I think the next step is to implement a simple 
wrapper script, and document that as well.

The tasks are already created, see HBASE-4618 for a list of sub-tasks (tasks 1, 
2, 4 and 6 are done, 4 needs to be checked in and closed out).

The next one to look at would be HBASE-4664. Let me add some comments in there 
about what we came up with internally, and then we can go ahead from there.

 Document architecture of backups
 

 Key: HBASE-4655
 URL: https://issues.apache.org/jira/browse/HBASE-4655
 Project: HBase
  Issue Type: Sub-task
  Components: documentation, regionserver
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan
 Attachments: HBase Backups Architecture v2.docx, HBase Backups 
 Architecture.docx


 Basic idea behind the backup architecture for HBase





[jira] [Commented] (HBASE-4655) Document architecture of backups

2012-05-23 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281684#comment-13281684
 ] 

Karthik Ranganathan commented on HBASE-4655:


Marking as resolved, feel free to send more comments my way in case something 
is not clear.

 Document architecture of backups
 

 Key: HBASE-4655
 URL: https://issues.apache.org/jira/browse/HBASE-4655
 Project: HBase
  Issue Type: Sub-task
  Components: documentation, regionserver
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan
 Attachments: HBase Backups Architecture v2.docx, HBase Backups 
 Architecture.docx


 Basic idea behind the backup architecture for HBase





[jira] [Resolved] (HBASE-4655) Document architecture of backups

2012-05-23 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan resolved HBASE-4655.


Resolution: Fixed

 Document architecture of backups
 

 Key: HBASE-4655
 URL: https://issues.apache.org/jira/browse/HBASE-4655
 Project: HBase
  Issue Type: Sub-task
  Components: documentation, regionserver
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan
 Attachments: HBase Backups Architecture v2.docx, HBase Backups 
 Architecture.docx


 Basic idea behind the backup architecture for HBase





[jira] [Commented] (HBASE-4663) MR based copier for copying HFiles

2012-05-23 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281685#comment-13281685
 ] 

Karthik Ranganathan commented on HBASE-4663:


See https://reviews.facebook.net/D1965 for the diff. Also, see HBASE-5509 for 
the trunk version.

 MR based copier for copying HFiles
 --

 Key: HBASE-4663
 URL: https://issues.apache.org/jira/browse/HBASE-4663
 Project: HBase
  Issue Type: Sub-task
  Components: documentation, regionserver
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 This copier is a modification of the distcp tool in HDFS. It does the 
 following:
 1. List out all the regions in the HBase cluster for the required table
 2. Write the above out to a file
 3. Each mapper 
3.1 lists all the HFiles for a given region by querying the regionserver
3.2 copies all the HFiles
3.3 outputs success if the copy succeeded, failure otherwise. Failed 
 regions are retried in another loop
 4. Mappers are placed on nodes which have maximum locality for a given region 
 to speed up copying





[jira] [Commented] (HBASE-4463) Run more aggressive compactions during off peak hours

2011-09-23 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13113847#comment-13113847
 ] 

Karthik Ranganathan commented on HBASE-4463:


@Stack - we can find the exact amount of data we are writing to the dfs (only 
hfile blocks will contribute to this during compactions). So adding a threshold 
like this is not too hard... but there could be disk iops pressure (instead of 
network bandwidth) and detecting that would be hard. So we would still need to 
set off-peak time.

I was trying to come up with a more generic solution but that involves setting 
up a feedback loop inside the regionserver - keep track of max, min and average 
latencies over the last k days (would have to store this in META or some other 
location, as it needs to persist across restarts). We would need to remove any 
spikes in the values. When we run an aggressive compaction, we need to make sure 
the latencies are still acceptable; otherwise, don't run aggressive compactions. 
This is much harder to get right, though.

 Run more aggressive compactions during off peak hours
 -

 Key: HBASE-4463
 URL: https://issues.apache.org/jira/browse/HBASE-4463
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 The number of iops on the disk and the top of the rack bandwidth utilization 
 at off peak hours is much lower than at peak hours depending on the 
 application usage pattern. We can utilize this knowledge to improve the 
 performance of the HBase cluster by increasing the compact selection ratio to 
 a much larger value during off-peak hours than otherwise - increasing 
 hbase.hstore.compaction.ratio (1.2 default) to 
 hbase.hstore.compaction.ratio.offpeak (5 default). This will help reduce the 
 average number of files per store.





[jira] [Created] (HBASE-4463) Run more aggressive compactions during off peak hours

2011-09-22 Thread Karthik Ranganathan (JIRA)
Run more aggressive compactions during off peak hours
-

 Key: HBASE-4463
 URL: https://issues.apache.org/jira/browse/HBASE-4463
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan


The number of iops on the disk and the top of the rack bandwidth utilization at 
off peak hours is much lower than at peak hours depending on the application 
usage pattern. We can utilize this knowledge to improve the performance of the 
HBase cluster by increasing the compact selection ratio to a much larger value 
during off-peak hours than otherwise - increasing hbase.hstore.compaction.ratio 
(1.3 default) to hbase.hstore.compaction.ratio.offpeak (5 default). This will 
help reduce the average number of files per store.





[jira] [Commented] (HBASE-4463) Run more aggressive compactions during off peak hours

2011-09-22 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13113017#comment-13113017
 ] 

Karthik Ranganathan commented on HBASE-4463:


Initially we are going to specify a start and stop for off peak hours... a more 
automatic detection based on response latencies and data read/transferred could 
be done, but is much harder to get right.
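The start/stop window mentioned here can be sketched as a simple hour check; the property names in the comments mirror the description above, but the class itself is illustrative, not the committed implementation:

```java
// Illustrative sketch: pick the aggressive compaction ratio only
// inside a configured [startHour, endHour) off-peak window.
class CompactionRatioPicker {
    private final double peakRatio;     // hbase.hstore.compaction.ratio
    private final double offPeakRatio;  // hbase.hstore.compaction.ratio.offpeak
    private final int startHour;        // off-peak window start (0-23)
    private final int endHour;          // off-peak window end, exclusive

    CompactionRatioPicker(double peakRatio, double offPeakRatio,
                          int startHour, int endHour) {
        this.peakRatio = peakRatio;
        this.offPeakRatio = offPeakRatio;
        this.startHour = startHour;
        this.endHour = endHour;
    }

    double ratioFor(int hourOfDay) {
        boolean offPeak = startHour <= endHour
            ? hourOfDay >= startHour && hourOfDay < endHour
            : hourOfDay >= startHour || hourOfDay < endHour; // wraps midnight
        return offPeak ? offPeakRatio : peakRatio;
    }
}
```

The wrap-around branch matters because off-peak windows typically span midnight (e.g. 23:00-06:00).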

 Run more aggressive compactions during off peak hours
 -

 Key: HBASE-4463
 URL: https://issues.apache.org/jira/browse/HBASE-4463
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 The number of iops on the disk and the top of the rack bandwidth utilization 
 at off peak hours is much lower than at peak hours depending on the 
 application usage pattern. We can utilize this knowledge to improve the 
 performance of the HBase cluster by increasing the compact selection ratio to 
 a much larger value during off-peak hours than otherwise - increasing 
 hbase.hstore.compaction.ratio (1.3 default) to 
 hbase.hstore.compaction.ratio.offpeak (5 default). This will help reduce the 
 average number of files per store.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HBASE-3375) Move away from jruby; build our shell elsewise either on another foundation or build up our own

2010-12-22 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12974406#action_12974406
 ] 

Karthik Ranganathan commented on HBASE-3375:


Awesome discussion. I think the only way to make scripting for HBase take off 
is to allow scripting in any language. Language lock-in for scripting takes 
away the real advantage of scripting - all the time is spent in looking up the 
syntax (unless the person writing is committed to learning the language). So in 
that sense, REST + JSON is awesome.

On a tangential note, REST+JSON also allows us to easily write HBase clients 
(that have ZK integration) in languages other than Java (aka C++). This would 
allow efficiently interacting with HBase from non Java services.

If we are agreed on the REST+JSON approach - now it's only a matter of how to 
write the shell the fastest in any language. I am not familiar with where the 
REST gateway stands today, and how much work it is to move all the structures 
to JSON. If these are easy to get out the door, then we should only think about 
the fastest way to write the shell.



 Move away from jruby; build our shell elsewise either on another foundation 
 or build up our own
 ---

 Key: HBASE-3375
 URL: https://issues.apache.org/jira/browse/HBASE-3375
 Project: HBase
  Issue Type: Task
  Components: shell
Reporter: stack
 Fix For: 0.92.0


 JRuby has been sullied; it's been shipping *GPL jars with it for a while now.  A hack 
 up to remove these jars is being done elsewhere (HBASE-3374).  This issue is 
 about casting our shell anew atop a foundation other than JRuby, or 
 writing a shell of our own from scratch.
 JRuby has gotten us this far.  It provides a shell and it also was used 
 scripting HBase.  It would be nice if we could get scripting and shell in the 
 redo.
 Apart from the licensing issue above, and that the fix will mean reverting our 
 JRuby to a version that is old and no longer supported, other reasons 
 to move off JRuby are that while it's nice having Ruby to hand when scripting, 
 the JRuby complete jar is 10 or more MB in size.  It's bloated, at least from 
 our small shell perspective.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3375) Move away from jruby; build our shell elsewise either on another foundation or build up our own

2010-12-21 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12973922#action_12973922
 ] 

Karthik Ranganathan commented on HBASE-3375:


Hey guys, like Jonathan said, I think ANTLR would be good... and once the 
framework is set, the changes are relatively easy to get in. Also, the core 
grammar does not change that much - usually only enhancements to some commands 
here and there.

Also if the META entries are all JSON (as we are increasingly moving towards) 
and we are able to expose REST API's for most of the operations, then building 
a shell in any language/framework will become trivial.





 Move away from jruby; build our shell elsewise either on another foundation 
 or build up our own
 ---

 Key: HBASE-3375
 URL: https://issues.apache.org/jira/browse/HBASE-3375
 Project: HBase
  Issue Type: Task
  Components: shell
Reporter: stack
 Fix For: 0.92.0


 JRuby has been sullied; it's been shipping *GPL jars with it for a while now.  A hack 
 up to remove these jars is being done elsewhere (HBASE-3374).  This issue is 
 about casting our shell anew atop a foundation other than JRuby, or 
 writing a shell of our own from scratch.
 JRuby has gotten us this far.  It provides a shell and it also was used 
 scripting HBase.  It would be nice if we could get scripting and shell in the 
 redo.
 Apart from the licensing issue above, and that the fix will mean reverting our 
 JRuby to a version that is old and no longer supported, other reasons 
 to move off JRuby are that while it's nice having Ruby to hand when scripting, 
 the JRuby complete jar is 10 or more MB in size.  It's bloated, at least from 
 our small shell perspective.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3329) HLog splitting after RS/cluster death should directly create HFiles

2010-12-10 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12970288#action_12970288
 ] 

Karthik Ranganathan commented on HBASE-3329:


Yes, true. This is much more useful in the distributed log splitting context. 
My fault - forgot to add that...

 HLog splitting after RS/cluster death should directly create HFiles
 ---

 Key: HBASE-3329
 URL: https://issues.apache.org/jira/browse/HBASE-3329
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Karthik Ranganathan

 After a RS dies or the cluster goes down and we are recovering, we first 
 split HLogs into the logs for the regions. Then the region servers that host 
 the regions replay the logs and open the regions.
 This can be made more efficient by directly creating HFiles from the HLogs 
 (instead of producing a split HLogs file).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3329) HLog splitting after RS/cluster death should directly create HFiles

2010-12-10 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12970325#action_12970325
 ] 

Karthik Ranganathan commented on HBASE-3329:


@Ryan - didn't get that... At a higher level, I was thinking that the current 
steps are:
1. Open and read hlogs
2. Split them and create edits per region files
3. RS that opens the regions reads the split edits files and then dumps them 
into hfiles

I was thinking we could change this sequence to something like:
1. Open hlogs
2. Create hfiles for the regions

And that would give us a big gain in not writing and reading the HLogs once 
each.
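The proposed two-step sequence can be modeled in miniature: group WAL edits by region and emit one sorted, deduplicated key/value list per region, standing in for an HFile (which stores sorted KVs). This is a toy sketch of the idea, not HBase's real HLog/HFile formats; the tuple layout is an assumption.

```python
def hlogs_to_hfiles(hlog_entries):
    """Turn WAL entries directly into per-region sorted outputs.

    Each entry is (region, seq_id, key, value); for a given key, the
    edit with the highest sequence id wins. The result maps each
    region to a sorted list of (key, value) pairs, i.e. the shape of
    data an HFile would hold.
    """
    per_region = {}
    for region, seq_id, key, value in hlog_entries:
        edits = per_region.setdefault(region, {})
        # Keep only the newest edit per key.
        if key not in edits or seq_id > edits[key][0]:
            edits[key] = (seq_id, value)
    return {
        region: [(k, v) for k, (_, v) in sorted(edits.items())]
        for region, edits in per_region.items()
    }
```

The gain described above falls out of the model: the intermediate "split edits" files never exist, so the WAL contents are read once instead of being written and re-read.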

 HLog splitting after RS/cluster death should directly create HFiles
 ---

 Key: HBASE-3329
 URL: https://issues.apache.org/jira/browse/HBASE-3329
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Karthik Ranganathan

 After a RS dies or the cluster goes down and we are recovering, we first 
 split HLogs into the logs for the regions. Then the region servers that host 
 the regions replay the logs and open the regions.
 This can be made more efficient by directly creating HFiles from the HLogs 
 (instead of producing a split HLogs file).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3150) Allow some column to not write WALs

2010-12-10 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12970339#action_12970339
 ] 

Karthik Ranganathan commented on HBASE-3150:


Yes, a co-processor-based implementation would totally work.

 Allow some column to not write WALs
 ---

 Key: HBASE-3150
 URL: https://issues.apache.org/jira/browse/HBASE-3150
 Project: HBase
  Issue Type: Improvement
Reporter: Karthik Ranganathan
Priority: Minor

 We have this unique requirement where some column families hold data that is 
 indexed from other existing column families. The index data is very large, 
 and we end up writing these inserts into the WAL and then into the store 
 files. In addition to taking more iops, this also slows down splitting files 
 for recovery, etc.
 Creating this task to have an option to suppress WAL logging on a per CF 
 basis.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3325) Optimize log splitter to not output obsolete edits

2010-12-09 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969863#action_12969863
 ] 

Karthik Ranganathan commented on HBASE-3325:


Yes +1 indeed!!

 Optimize log splitter to not output obsolete edits
 --

 Key: HBASE-3325
 URL: https://issues.apache.org/jira/browse/HBASE-3325
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver
Affects Versions: 0.92.0
Reporter: Todd Lipcon

 Currently when the master splits logs, it outputs all edits it finds, even 
 those that have already been obsoleted by flushes. At replay time on the RS 
 we discard the edits that have already been flushed.
 We could do a pretty simple optimization here - basically the RS should 
 replicate a map of region id -> last flushed seq id into ZooKeeper (this can 
 be asynchronous by some seconds without any problems). Then when doing log 
 splitting, if we have this map available, we can discard any edits found in 
 the logs that were already flushed, and thus output a much smaller amount of 
 data.
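The optimization described above amounts to a filter over the WAL edits. A minimal sketch, assuming edits carry (region, seq_id, payload) and the ZooKeeper-published map is a plain dict (both illustrative, not HBase's actual structures):

```python
def filter_obsolete_edits(edits, last_flushed):
    """Drop WAL edits already persisted by a flush.

    Keep an edit only if its sequence id is greater than the region's
    last flushed sequence id; regions absent from the map keep all
    their edits.
    """
    return [
        (region, seq_id, payload)
        for region, seq_id, payload in edits
        if seq_id > last_flushed.get(region, -1)
    ]
```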

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HBASE-3327) For increment workloads, retain memstores in memory after flushing them

2010-12-09 Thread Karthik Ranganathan (JIRA)
For increment workloads, retain memstores in memory after flushing them
---

 Key: HBASE-3327
 URL: https://issues.apache.org/jira/browse/HBASE-3327
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Karthik Ranganathan


This is an improvement based on our observation of what happens in an increment 
workload. The working set is typically small and is contained in the memstores. 
1. The reason the memstores get flushed is because the number of wal logs limit 
gets hit. 
2. This in turn triggers compactions, which evicts the block cache. 
3. Flushing of memstore and eviction of the block cache causes disk reads for 
increments coming in after this because the data is no longer in memory.

We could solve this elegantly by retaining the memstores AFTER they are flushed 
into files. This would mean we can quickly populate the new memstore with the 
working set of data from memory itself without having to hit disk. We can 
throttle the number of such memstores we retain, or the memory allocated to it. 
In fact, allocating a percentage of the block cache to this would give us a 
huge boost.
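A toy model of the retention idea above: after a flush, keep the flushed snapshot in a bounded in-memory list so reads of the hot working set avoid disk. The class name, the eviction policy (drop the oldest snapshot), and the bound are illustrative assumptions.

```python
from collections import deque

class RetainingMemstore:
    """Memstore that keeps recent flushed snapshots in memory."""

    def __init__(self, max_retained=2):
        self.active = {}
        # Bounded: appending a new snapshot evicts the oldest one.
        self.retained = deque(maxlen=max_retained)

    def put(self, key, value):
        self.active[key] = value

    def flush(self):
        # In real HBase the snapshot becomes a store file; here we
        # also retain it in memory before starting a fresh memstore.
        self.retained.appendleft(self.active)
        self.active = {}

    def get(self, key):
        if key in self.active:
            return self.active[key]
        for snap in self.retained:  # newest snapshot first
            if key in snap:
                return snap[key]
        return None  # would fall through to block cache / disk
```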


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3327) For increment workloads, retain memstores in memory after flushing them

2010-12-09 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969892#action_12969892
 ] 

Karthik Ranganathan commented on HBASE-3327:


True - I mentioned the HLog limit because that is how we observed it, but this 
would address the underlying issue for any of the reasons to flush. 
Additionally, this also makes it resilient in the face of compactions, which 
HLog compactions would not help with. 

HLog compactions would also be most effective for the ICV kind of workload 
(frequent updates to existing data) right?


 For increment workloads, retain memstores in memory after flushing them
 ---

 Key: HBASE-3327
 URL: https://issues.apache.org/jira/browse/HBASE-3327
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Karthik Ranganathan

 This is an improvement based on our observation of what happens in an 
 increment workload. The working set is typically small and is contained in 
 the memstores. 
 1. The reason the memstores get flushed is because the number of wal logs 
 limit gets hit. 
 2. This in turn triggers compactions, which evicts the block cache. 
 3. Flushing of memstore and eviction of the block cache causes disk reads for 
 increments coming in after this because the data is no longer in memory.
 We could solve this elegantly by retaining the memstores AFTER they are 
 flushed into files. This would mean we can quickly populate the new memstore 
 with the working set of data from memory itself without having to hit disk. 
 We can throttle the number of such memstores we retain, or the memory 
 allocated to it. In fact, allocating a percentage of the block cache to this 
 would give us a huge boost.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3327) For increment workloads, retain memstores in memory after flushing them

2010-12-09 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12970047#action_12970047
 ] 

Karthik Ranganathan commented on HBASE-3327:


Ryan: was talking to Kannan as well about this. The only catch is that writing 
into the block cache works for flushes. But for compactions, it gets a bit 
complicated - and any algorithm will become a little dependent on the 
compaction policy.

 For increment workloads, retain memstores in memory after flushing them
 ---

 Key: HBASE-3327
 URL: https://issues.apache.org/jira/browse/HBASE-3327
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Karthik Ranganathan

 This is an improvement based on our observation of what happens in an 
 increment workload. The working set is typically small and is contained in 
 the memstores. 
 1. The reason the memstores get flushed is because the number of wal logs 
 limit gets hit. 
 2. This in turn triggers compactions, which evicts the block cache. 
 3. Flushing of memstore and eviction of the block cache causes disk reads for 
 increments coming in after this because the data is no longer in memory.
 We could solve this elegantly by retaining the memstores AFTER they are 
 flushed into files. This would mean we can quickly populate the new memstore 
 with the working set of data from memory itself without having to hit disk. 
 We can throttle the number of such memstores we retain, or the memory 
 allocated to it. In fact, allocating a percentage of the block cache to this 
 would give us a huge boost.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HBASE-3329) HLog splitting after RS/cluster death should directly create HFiles

2010-12-09 Thread Karthik Ranganathan (JIRA)
HLog splitting after RS/cluster death should directly create HFiles
---

 Key: HBASE-3329
 URL: https://issues.apache.org/jira/browse/HBASE-3329
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Karthik Ranganathan


After a RS dies or the cluster goes down and we are recovering, we first split 
HLogs into the logs for the regions. Then the region servers that host the 
regions replay the logs and open the regions.

This can be made more efficient by directly creating HFiles from the HLogs 
(instead of producing a split HLogs file).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HBASE-3156) Special case distributed log splitting on fresh cluster startup

2010-10-26 Thread Karthik Ranganathan (JIRA)
Special case distributed log splitting on fresh cluster startup
---

 Key: HBASE-3156
 URL: https://issues.apache.org/jira/browse/HBASE-3156
 Project: HBase
  Issue Type: New Feature
Reporter: Karthik Ranganathan


If the entire HBase goes down (not a graceful stop - example namenode dies) 
then on a subsequent restart, the HMaster can hand off the hlog splitting to 
the respective region servers. This would parallelize the log splitting and 
maintain region server hfile locality.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (HBASE-3156) Special case distributed log splitting on fresh cluster startup

2010-10-26 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan reassigned HBASE-3156:
--

Assignee: Karthik Ranganathan

 Special case distributed log splitting on fresh cluster startup
 ---

 Key: HBASE-3156
 URL: https://issues.apache.org/jira/browse/HBASE-3156
 Project: HBase
  Issue Type: New Feature
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 If the entire HBase goes down (not a graceful stop - example namenode dies) 
 then on a subsequent restart, the HMaster can hand off the hlog splitting to 
 the respective region servers. This would parallelize the log splitting and 
 maintain region server hfile locality.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3149) Make flush decisions per column family

2010-10-26 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12925090#action_12925090
 ] 

Karthik Ranganathan commented on HBASE-3149:


Yes, agreed that the memory implication is different. 

Eventually, is it not better to enforce the memory limit by using a combination 
of flush sizes and restricting the number of regions we create? Because ideally 
we should allow different flush sizes for the different CF's as the KV sizes 
could be way different...

Shall I just make this an option in the conf for now with the default the way 
it is?

 Make flush decisions per column family
 --

 Key: HBASE-3149
 URL: https://issues.apache.org/jira/browse/HBASE-3149
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Karthik Ranganathan

 Today, the flush decision is made using the aggregate size of all column 
 families. When large and small column families co-exist, this causes many 
 small flushes of the smaller CF. We need to make per-CF flush decisions.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3156) Special case distributed log splitting on fresh cluster startup

2010-10-26 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12925150#action_12925150
 ] 

Karthik Ranganathan commented on HBASE-3156:


Awesome! I was thinking of doing this without MR if possible - since each RS 
would replay all the HLogs in a directory, there is no need to split the files 
and then replay the logs...

 Special case distributed log splitting on fresh cluster startup
 ---

 Key: HBASE-3156
 URL: https://issues.apache.org/jira/browse/HBASE-3156
 Project: HBase
  Issue Type: New Feature
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

 If the entire HBase goes down (not a graceful stop - example namenode dies) 
 then on a subsequent restart, the HMaster can hand off the hlog splitting to 
 the respective region servers. This would parallelize the log splitting and 
 maintain region server hfile locality.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HBASE-3149) Make flush decisions per column family

2010-10-25 Thread Karthik Ranganathan (JIRA)
Make flush decisions per column family
--

 Key: HBASE-3149
 URL: https://issues.apache.org/jira/browse/HBASE-3149
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Karthik Ranganathan


Today, the flush decision is made using the aggregate size of all column 
families. When large and small column families co-exist, this causes many small 
flushes of the smaller CF. We need to make per-CF flush decisions.
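The per-CF decision can be sketched as selecting only the families whose own memstore crosses a per-family threshold, instead of flushing everything when the aggregate crosses one. The function name and the single-threshold scheme are illustrative assumptions, not the actual HBase policy.

```python
def families_to_flush(memstore_sizes, per_cf_limit):
    """Pick column families to flush based on their own memstore size.

    memstore_sizes maps family name -> bytes in its memstore; only
    families at or above per_cf_limit are flushed, so a small CF is
    no longer dragged into flushes triggered by a large one.
    """
    return [cf for cf, size in memstore_sizes.items()
            if size >= per_cf_limit]
```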

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HBASE-3150) Allow some column to not write WALs

2010-10-25 Thread Karthik Ranganathan (JIRA)
Allow some column to not write WALs
---

 Key: HBASE-3150
 URL: https://issues.apache.org/jira/browse/HBASE-3150
 Project: HBase
  Issue Type: Improvement
Reporter: Karthik Ranganathan
Priority: Minor


We have this unique requirement where some column families hold data that is 
indexed from other existing column families. The index data is very large, and 
we end up writing these inserts into the WAL and then into the store files. In 
addition to taking more iops, this also slows down splitting files for 
recovery, etc.

Creating this task to have an option to suppress WAL logging on a per CF basis.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HBASE-2931) Do not throw RuntimeExceptions in RPC/HbaseObjectWritable code, ensure we log and rethrow as IOE

2010-08-22 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan updated HBASE-2931:
---

Attachment: HBASE-2931.patch

Simple patch - posting directly instead of review board.

Moves the newly added class down to the end of the object writable opcode list 
so that all subsequent op-codes do not change. Also added some logging.

 Do not throw RuntimeExceptions in RPC/HbaseObjectWritable code, ensure we log 
 and rethrow as IOE
 

 Key: HBASE-2931
 URL: https://issues.apache.org/jira/browse/HBASE-2931
 Project: HBase
  Issue Type: Bug
Reporter: Jonathan Gray
Priority: Critical
 Fix For: 0.90.0

 Attachments: HBASE-2931.patch


 When there are issues with RPC and HbaseObjectWritable, primarily when server 
 and client have different jars, the only thing that happens is the client 
 will receive an EOF exception.  The server does not log what happened at all 
 and the client does not receive a server trace, rather the server seems to 
 close the connection and the client gets an EOF because it tries to read off 
 of a closed stream.
 We need to ensure that we catch, log, and rethrow as IOE any exceptions that 
 may occur because of an issue with RPC or HbaseObjectWritable.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (HBASE-2931) Do not throw RuntimeExceptions in RPC/HbaseObjectWritable code, ensure we log and rethrow as IOE

2010-08-22 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan reassigned HBASE-2931:
--

Assignee: Karthik Ranganathan

 Do not throw RuntimeExceptions in RPC/HbaseObjectWritable code, ensure we log 
 and rethrow as IOE
 

 Key: HBASE-2931
 URL: https://issues.apache.org/jira/browse/HBASE-2931
 Project: HBase
  Issue Type: Bug
Reporter: Jonathan Gray
Assignee: Karthik Ranganathan
Priority: Critical
 Fix For: 0.90.0

 Attachments: HBASE-2931.patch


 When there are issues with RPC and HbaseObjectWritable, primarily when server 
 and client have different jars, the only thing that happens is the client 
 will receive an EOF exception.  The server does not log what happened at all 
 and the client does not receive a server trace, rather the server seems to 
 close the connection and the client gets an EOF because it tries to read off 
 of a closed stream.
 We need to ensure that we catch, log, and rethrow as IOE any exceptions that 
 may occur because of an issue with RPC or HbaseObjectWritable.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-2812) Disable 'table' fails to complete frustrating my ability to test easily

2010-08-10 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12897082#action_12897082
 ] 

Karthik Ranganathan commented on HBASE-2812:


+1 Patch looks good to me.

 Disable 'table' fails to complete frustrating my ability to test easily
 ---

 Key: HBASE-2812
 URL: https://issues.apache.org/jira/browse/HBASE-2812
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.20.6, 0.89.20100621
 Environment: 0.89, non-distributed mode
Reporter: Sam Pullara
 Fix For: 0.20.7, 0.90.0

 Attachments: HBASE-2812.patch


 I see this in the client after it gives up:
 hbase(main):006:0> disable 'test_schema'
 ERROR: org.apache.hadoop.hbase.RegionException: Retries exhausted, it took 
 too long to wait for the table test_schema to be disabled.
 Here is some help for this command:
   Disable the named table: e.g. hbase> disable 't1'
 and this in the server log, a set of about 5 reports it is closing per 
 disable call:
 2010-07-03 15:19:47,554 DEBUG 
 org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
 2010-07-03 15:19:47,554 DEBUG 
 org.apache.hadoop.hbase.master.ChangeTableState: Processing regions currently 
 being served
 2010-07-03 15:19:47,555 DEBUG 
 org.apache.hadoop.hbase.master.ChangeTableState: Adding region 
 test_schema,,1278195322074.65c77aedf2f2a08d161a188dd2dd5081. to setClosing 
 list
 2010-07-03 15:19:47,576 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE: 
 test_schema,,1278195322074.65c77aedf2f2a08d161a188dd2dd5081.
 2010-07-03 15:19:47,576 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_CLOSE: 
 test_schema,,1278195322074.65c77aedf2f2a08d161a188dd2dd5081.
 2010-07-03 15:19:48,567 DEBUG 
 org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
 2010-07-03 15:19:48,567 DEBUG 
 org.apache.hadoop.hbase.master.ChangeTableState: Processing regions currently 
 being served
 2010-07-03 15:19:48,568 DEBUG 
 org.apache.hadoop.hbase.master.ChangeTableState: Adding region 
 test_schema,,1278195322074.65c77aedf2f2a08d161a188dd2dd5081. to setClosing 
 list
 2010-07-03 15:19:48,577 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE: 
 test_schema,,1278195322074.65c77aedf2f2a08d161a188dd2dd5081.
 2010-07-03 15:19:48,578 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_CLOSE: 
 test_schema,,1278195322074.65c77aedf2f2a08d161a188dd2dd5081.
 2010-07-03 15:19:49,580 DEBUG 
 org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
 2010-07-03 15:19:49,580 DEBUG 
 org.apache.hadoop.hbase.master.ChangeTableState: Processing regions currently 
 being served
 2010-07-03 15:19:49,581 DEBUG 
 org.apache.hadoop.hbase.master.ChangeTableState: Adding region 
 test_schema,,1278195322074.65c77aedf2f2a08d161a188dd2dd5081. to setClosing 
 list
 2010-07-03 15:19:50,580 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE: 
 test_schema,,1278195322074.65c77aedf2f2a08d161a188dd2dd5081.
 2010-07-03 15:19:50,581 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_CLOSE: 
 test_schema,,1278195322074.65c77aedf2f2a08d161a188dd2dd5081.
 2010-07-03 15:19:50,592 DEBUG 
 org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
 2010-07-03 15:19:50,592 DEBUG 
 org.apache.hadoop.hbase.master.ChangeTableState: Processing regions currently 
 being served
 2010-07-03 15:19:50,593 DEBUG 
 org.apache.hadoop.hbase.master.ChangeTableState: Adding region 
 test_schema,,1278195322074.65c77aedf2f2a08d161a188dd2dd5081. to setClosing 
 list
 2010-07-03 15:19:51,581 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE: 
 test_schema,,1278195322074.65c77aedf2f2a08d161a188dd2dd5081.
 2010-07-03 15:19:51,581 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_CLOSE: 
 test_schema,,1278195322074.65c77aedf2f2a08d161a188dd2dd5081.
 2010-07-03 15:19:52,605 DEBUG 
 org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
 2010-07-03 15:19:52,605 DEBUG 
 org.apache.hadoop.hbase.master.ChangeTableState: Processing regions currently 
 being served
 2010-07-03 15:19:52,606 DEBUG 
 org.apache.hadoop.hbase.master.ChangeTableState: Adding region 
 test_schema,,1278195322074.65c77aedf2f2a08d161a188dd2dd5081. to setClosing 
 list
 2010-07-03 15:19:52,703 INFO org.apache.hadoop.hbase.master.ServerManager: 1 
 region servers, 0 dead, average load 3.0
 2010-07-03 15:19:52,863 INFO org.apache.hadoop.hbase.master.BaseScanner: 
 RegionManager.rootScanner scanning meta region {server: 192.168.2.1:54389, 
 regionname: -ROOT-,,0.70236052, startKey: }
 

[jira] Commented: (HBASE-2866) Region permanently offlined

2010-07-23 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12891649#action_12891649
 ] 

Karthik Ranganathan commented on HBASE-2866:


Hey Stack,

Have a fix ready - testing now, will put it up in a bit.

Fix is simple: we get into this situation because we update the same region in 
transition in ZK again and again, which bumps up the revision number of the 
ZNode. This causes the update to fail. So if the ZNode is already in the target 
state, do not update it again.

The above explanation is super-cryptic :), so will sync up with you on the 
issue and the fix.
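The fix can be modeled with a versioned node and a compare-and-set style update: a redundant write would bump the version and make the next versioned update fail, so the update is skipped when the node is already in the target state. This is an illustrative model of the described behavior, not the actual HBase/ZooKeeper code.

```python
class Znode:
    """Minimal stand-in for a ZooKeeper node with a version counter."""

    def __init__(self, state):
        self.state = state
        self.version = 0

def set_state(znode, target, expected_version):
    """Versioned state update that no-ops if already in target state.

    Raises on a version mismatch, playing the role of ZooKeeper's
    BadVersionException; a skipped update leaves the version alone so
    later versioned writes still succeed.
    """
    if znode.state == target:
        return znode.version  # already there: do not bump the version
    if znode.version != expected_version:
        raise ValueError("bad version")
    znode.state = target
    znode.version += 1
    return znode.version
```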

 Region permanently offlined 
 

 Key: HBASE-2866
 URL: https://issues.apache.org/jira/browse/HBASE-2866
 Project: HBase
  Issue Type: Bug
Reporter: Kannan Muthukkaruppan
Assignee: Karthik Ranganathan
Priority: Blocker
 Attachments: master.log


 After split, master attempts to reassign a region to a region server. 
 Occasionally, such a region can get permanently offlined.
 Master:
 -
 {code}
 2010-07-22 01:26:00,914 INFO org.apache.hadoop.hbase.master.ServerManager: 
 Processing MSG_REPORT_SPLIT_INCLUDES_DAUGHTERS: 
 test1,651220,1279784117114.6466481aa931f8c1fa87622735487a72.: Daughters; 
 test1,651220,1279787158624.6ead25ae677116cc88fc5420bb39d52e., 
 test1,653179,1279787\
 158624.8d5490bfc166c687657cb09203bd7d44. from 
 test024.test.xyz.com,60020,1279780567744; 1 of 1  
   
   

 2010-07-22 01:26:00,935 DEBUG 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Creating UNASSIGNED 
 region 8d5490bfc166c687657cb09203bd7d44 in state = M2ZK_REGION_OFFLINE
 2010-07-22 01:26:00,935 DEBUG 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Creating UNASSIGNED 
 region 8d5490bfc166c687657cb09203bd7d44 in state = M2ZK_REGION_OFFLINE
 2010-07-22 01:26:00,945 INFO org.apache.hadoop.hbase.master.RegionManager: 
 Assigning region 
 test1,653179,1279787158624.8d5490bfc166c687657cb09203bd7d44. to 
 test024.test.xyz.com,60020,1279780567744
 2010-07-22 01:26:00,949 DEBUG 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: While updating UNASSIGNED 
 region 8d5490bfc166c687657cb09203bd7d44 exists, state = M2ZK_REGION_OFFLINE
 2010-07-22 01:26:00,954 DEBUG org.apache.hadoop.hbase.master.RegionManager: 
 Created UNASSIGNED zNode 
 test1,653179,1279787158624.8d5490bfc166c687657cb09203bd7d44. in state 
 M2ZK_REGION_OFFLINE
 {code}
 ---
 Region Server:
 {code}
 2010-07-22 01:26:00,947 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: 
 test1,653179,1279787158624.8d5490bfc166c687657cb09203bd7d44.
 2010-07-22 01:26:00,947 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: 
 test1,651220,1279787158624.6ead25ae677116cc88fc5420bb39d52e.
 2010-07-22 01:26:00,947 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: 
 test1,653179,1279787158624.8d5490bfc166c687657cb09203bd7d44.
 2010-07-22 01:26:00,948 DEBUG 
 org.apache.hadoop.hbase.regionserver.RSZookeeperUpdater: Updating ZNode 
 /hbase/UNASSIGNED/8d5490bfc166c687657cb09203bd7d44 with 
 [RS2ZK_REGION_OPENING] expected version = 0
 2010-07-22 01:26:00,952 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event, 
 state: SyncConnected, type: NodeDataChanged, path: 
 /hbase/UNASSIGNED/8d5490bfc166c687657cb09203bd7d44
 2010-07-22 01:26:00,974 WARN 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: 
 msgstorectrl001.test.xyz.com,msgstorectrl021.test.xyz.com,msgstorectrl041.test.xyz.com,msgstorectrl061.test.xyz.com,msgstorectrl081.ash2.facebook\
 .com:/hbase,test024.test.xyz.com,60020,1279780567744Failed to write data to 
 ZooKeeper
 org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = 
 BadVersion for /hbase/UNASSIGNED/8d5490bfc166c687657cb09203bd7d44
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:106)
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
 at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1038)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.writeZNode(ZooKeeperWrapper.java:1062)
 at 
 org.apache.hadoop.hbase.regionserver.RSZookeeperUpdater.updateZKWithEventData(RSZookeeperUpdater.java:161)
 at 
 org.apache.hadoop.hbase.regionserver.RSZookeeperUpdater.startRegionOpenEvent(RSZookeeperUpdater.java:115)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1428)
 at 
 

[jira] Commented: (HBASE-2866) Region permanently offlined

2010-07-23 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891804#action_12891804
 ] 

Karthik Ranganathan commented on HBASE-2866:



Stack - just uploaded a review at http://review.hbase.org/r/380/

 Region permanently offlined 
 

 Key: HBASE-2866
 URL: https://issues.apache.org/jira/browse/HBASE-2866
 Project: HBase
  Issue Type: Bug
Reporter: Kannan Muthukkaruppan
Assignee: Karthik Ranganathan
Priority: Blocker
 Attachments: master.log


 After split, master attempts to reassign a region to a region server. 
 Occasionally, such a region can get permanently offlined.
 Master:
 -
 {code}
 2010-07-22 01:26:00,914 INFO org.apache.hadoop.hbase.master.ServerManager: 
 Processing MSG_REPORT_SPLIT_INCLUDES_DAUGHTERS: 
 test1,651220,1279784117114.6466481aa931f8c1fa87622735487a72.: Daughters; 
 test1,651220,1279787158624.6ead25ae677116cc88fc5420bb39d52e., 
 test1,653179,1279787\
 158624.8d5490bfc166c687657cb09203bd7d44. from 
 test024.test.xyz.com,60020,1279780567744; 1 of 1

 2010-07-22 01:26:00,935 DEBUG 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Creating UNASSIGNED 
 region 8d5490bfc166c687657cb09203bd7d44 in state = M2ZK_REGION_OFFLINE
 2010-07-22 01:26:00,935 DEBUG 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Creating UNASSIGNED 
 region 8d5490bfc166c687657cb09203bd7d44 in state = M2ZK_REGION_OFFLINE
 2010-07-22 01:26:00,945 INFO org.apache.hadoop.hbase.master.RegionManager: 
 Assigning region 
 test1,653179,1279787158624.8d5490bfc166c687657cb09203bd7d44. to 
 test024.test.xyz.com,60020,1279780567744
 2010-07-22 01:26:00,949 DEBUG 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: While updating UNASSIGNED 
 region 8d5490bfc166c687657cb09203bd7d44 exists, state = M2ZK_REGION_OFFLINE
 2010-07-22 01:26:00,954 DEBUG org.apache.hadoop.hbase.master.RegionManager: 
 Created UNASSIGNED zNode 
 test1,653179,1279787158624.8d5490bfc166c687657cb09203bd7d44. in state 
 M2ZK_REGION_OFFLINE
 {code}
 ---
 Region Server:
 {code}
 2010-07-22 01:26:00,947 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: 
 test1,653179,1279787158624.8d5490bfc166c687657cb09203bd7d44.
 2010-07-22 01:26:00,947 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: 
 test1,651220,1279787158624.6ead25ae677116cc88fc5420bb39d52e.
 2010-07-22 01:26:00,947 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: 
 test1,653179,1279787158624.8d5490bfc166c687657cb09203bd7d44.
 2010-07-22 01:26:00,948 DEBUG 
 org.apache.hadoop.hbase.regionserver.RSZookeeperUpdater: Updating ZNode 
 /hbase/UNASSIGNED/8d5490bfc166c687657cb09203bd7d44 with 
 [RS2ZK_REGION_OPENING] expected version = 0
 2010-07-22 01:26:00,952 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event, 
 state: SyncConnected, type: NodeDataChanged, path: 
 /hbase/UNASSIGNED/8d5490bfc166c687657cb09203bd7d44
 2010-07-22 01:26:00,974 WARN 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: 
 msgstorectrl001.test.xyz.com,msgstorectrl021.test.xyz.com,msgstorectrl041.test.xyz.com,msgstorectrl061.test.xyz.com,msgstorectrl081.ash2.facebook\
 .com:/hbase,test024.test.xyz.com,60020,1279780567744Failed to write data to 
 ZooKeeper
 org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = 
 BadVersion for /hbase/UNASSIGNED/8d5490bfc166c687657cb09203bd7d44
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:106)
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
 at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1038)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.writeZNode(ZooKeeperWrapper.java:1062)
 at 
 org.apache.hadoop.hbase.regionserver.RSZookeeperUpdater.updateZKWithEventData(RSZookeeperUpdater.java:161)
 at 
 org.apache.hadoop.hbase.regionserver.RSZookeeperUpdater.startRegionOpenEvent(RSZookeeperUpdater.java:115)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1428)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1337)
 at java.lang.Thread.run(Thread.java:619)
 2010-07-22 01:26:00,975 ERROR 
 org.apache.hadoop.hbase.regionserver.HRegionServer: Error opening 
 test1,653179,1279787158624.8d5490bfc166c687657cb09203bd7d44.
 java.io.IOException: 
 org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = 
 BadVersion for 

[jira] Created: (HBASE-2872) Investigate why regions in transition are updated to the same state multiple times

2010-07-23 Thread Karthik Ranganathan (JIRA)
Investigate why regions in transition are updated to the same state multiple 
times
--

 Key: HBASE-2872
 URL: https://issues.apache.org/jira/browse/HBASE-2872
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Karthik Ranganathan


This is related to HBASE-2866 (region permanently offlined). The fix there 
prevented duplicate updates from going to ZK, but the master still tries to 
update these regions; we should investigate why.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HBASE-2833) Write a unit test for HBASE-2781

2010-07-13 Thread Karthik Ranganathan (JIRA)
Write a unit test for HBASE-2781


 Key: HBASE-2833
 URL: https://issues.apache.org/jira/browse/HBASE-2833
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.89.20100621
Reporter: Karthik Ranganathan


Need a test case to verify the fix for HBASE-2781 (ZKW.createUnassignedRegion 
doesn't make sure an existing znode is in the right state).




[jira] Updated: (HBASE-2781) ZKW.createUnassignedRegion doesn't make sure existing znode is in the right state

2010-07-13 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan updated HBASE-2781:
---

Attachment: HBASE-2781-0.21.patch

Adding the fix here; I will open a separate JIRA for adding a test case for 
this issue.

 ZKW.createUnassignedRegion doesn't make sure existing znode is in the right 
 state
 -

 Key: HBASE-2781
 URL: https://issues.apache.org/jira/browse/HBASE-2781
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Daniel Cryans
Assignee: Karthik Ranganathan
Priority: Critical
 Fix For: 0.90.0

 Attachments: HBASE-2781-0.21.patch


 In ZKW.createUnassignedRegion I see this comment:
 {code}
   // check if this node already exists - 
   //   - it should not exist
   //   - if it does, it should be in the CLOSED state
 {code}
 And what I got is:
 {noformat}
 2010-06-23 15:42:05,823 INFO  [IPC Server handler 3 on 60362] 
 master.ServerManager(457): Processing MSG_REPORT_PROCESS_OPEN: 
 test,lll,1277332918248.13bef4950ac6827ac32d87682b8b2464. from 
 h136.sfo.stumble.net,60365,1277332849712; 1 of 4
 2010-06-23 15:42:05,867 INFO  [RegionServer:1.worker] 
 regionserver.HRegionServer$Worker(1338): Worker: MSG_REGION_OPEN: 
 test,lll,1277332918248.13bef4950ac6827ac32d87682b8b2464.
 2010-06-23 15:42:05,870 DEBUG [RegionServer:1.worker] 
 regionserver.RSZookeeperUpdater(157): Updating ZNode 
 /1/UNASSIGNED/13bef4950ac6827ac32d87682b8b2464 with [RS2ZK_REGION_OPENING] 
 expected version = 0
 2010-06-23 15:42:05,871 DEBUG [main-EventThread] master.HMaster(1158): Event 
 NodeDataChanged with state SyncConnected with path 
 /1/UNASSIGNED/13bef4950ac6827ac32d87682b8b2464
 2010-06-23 15:42:05,871 DEBUG [main-EventThread] 
 master.ZKMasterAddressWatcher(64): Got event NodeDataChanged with path 
 /1/UNASSIGNED/13bef4950ac6827ac32d87682b8b2464
 2010-06-23 15:42:05,871 DEBUG [main-EventThread] 
 master.ZKUnassignedWatcher(95): ZK-EVENT-PROCESS: Got zkEvent NodeDataChanged 
 state:SyncConnected path:/1/UNASSIGNED/13bef4950ac6827ac32d87682b8b2464
 2010-06-23 15:42:05,872 INFO  [main-EventThread] 
 regionserver.HRegionServer(379): Got ZooKeeper event, state: SyncConnected, 
 type: NodeDataChanged, path: /1/UNASSIGNED/13bef4950ac6827ac32d87682b8b2464
 2010-06-23 15:42:05,872 DEBUG [MASTER_OPENREGION-10.10.1.136:60362-1] 
 handler.MasterOpenRegionHandler(77): Event = RS2ZK_REGION_OPENING, region = 
 13bef4950ac6827ac32d87682b8b2464
 2010-06-23 15:42:05,874 DEBUG [RegionServer:1.worker] 
 regionserver.HRegion(297): Creating region 
 test,lll,1277332918248.13bef4950ac6827ac32d87682b8b2464.
 2010-06-23 15:42:06,154 INFO  [RegionServer:1.worker] 
 regionserver.HRegion(366): Onlined 
 test,lll,1277332918248.13bef4950ac6827ac32d87682b8b2464.; next sequenceid=1
 2010-06-23 15:42:06,154 DEBUG [RegionServer:1.worker] 
 regionserver.RSZookeeperUpdater(157): Updating ZNode 
 /1/UNASSIGNED/13bef4950ac6827ac32d87682b8b2464 with [RS2ZK_REGION_OPENED] 
 expected version = 1\
 org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
 = ConnectionLoss for /1/UNASSIGNED/13bef4950ac6827ac32d87682b8b2464
 2010-06-23 15:42:06,249 ERROR [RegionServer:1.worker] 
 regionserver.HRegionServer(1488): Failed to mark region 
 test,lll,1277332918248.13bef4950ac6827ac32d87682b8b2464. as opened
 java.io.IOException: 
 org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
 = ConnectionLoss for /1/UNASSIGNED/13bef4950ac6827ac32d87682b8b2464
 Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: 
 KeeperErrorCode = ConnectionLoss for 
 /1/UNASSIGNED/13bef4950ac6827ac32d87682b8b2464
 2010-06-23 15:42:06,993 DEBUG [RegionServer:1] 
 regionserver.HRegionServer(1569): closing region 
 test,lll,1277332918248.13bef4950ac6827ac32d87682b8b2464.
 2010-06-23 15:42:06,993 DEBUG [RegionServer:1] regionserver.HRegion(487): 
 Closing test,lll,1277332918248.13bef4950ac6827ac32d87682b8b2464.: disabling 
 compactions & flushes
 2010-06-23 15:42:06,993 DEBUG [RegionServer:1] regionserver.HRegion(512): 
 Updates disabled for region, no outstanding scanners on 
 test,lll,1277332918248.13bef4950ac6827ac32d87682b8b2464.
 2010-06-23 15:42:06,993 DEBUG [RegionServer:1] regionserver.HRegion(519): No 
 more row locks outstanding on region 
 test,lll,1277332918248.13bef4950ac6827ac32d87682b8b2464.
 2010-06-23 15:42:06,994 INFO  [RegionServer:1] regionserver.HRegion(531): 
 Closed test,lll,1277332918248.13bef4950ac6827ac32d87682b8b2464.
 2010-06-23 15:42:09,105 INFO  [master] master.ProcessServerShutdown(126): 
 Region test,lll,1277332918248.13bef4950ac6827ac32d87682b8b2464. was in 
 transition 
 name=test,lll,1277332918248.13bef4950ac6827ac32d87682b8b2464., 
 state=PENDING_OPEN on dead server 
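
The comment quoted above states the intended invariant: an existing UNASSIGNED 
znode is acceptable only in the CLOSED state. A minimal sketch of such a 
create-or-verify check, using a plain map as a hypothetical stand-in for the 
znode store (names are illustrative, not the real ZooKeeperWrapper API):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the check described in the quoted comment; the store
// and method names are hypothetical, not the actual ZooKeeperWrapper API.
final class UnassignedNodes {
    private final Map<String, String> store = new HashMap<>(); // region -> state

    // Create the UNASSIGNED znode for a region. If it already exists, it is
    // only acceptable in the CLOSED state; anything else is a bug upstream.
    void createUnassignedRegion(String region, String initialState) {
        String existing = store.get(region);
        if (existing == null) {
            store.put(region, initialState);          // normal create
        } else if (!"CLOSED".equals(existing)) {
            throw new IllegalStateException("znode for " + region
                + " already exists in unexpected state " + existing);
        } else {
            store.put(region, initialState);          // reuse the CLOSED node
        }
    }

    String state(String region) { return store.get(region); }
}
```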

[jira] Commented: (HBASE-2781) ZKW.createUnassignedRegion doesn't make sure existing znode is in the right state

2010-06-28 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883239#action_12883239
 ] 

Karthik Ranganathan commented on HBASE-2781:


Just wanted to update - working on the test case for this, will upload patch 
along with the JUnit test.

 ZKW.createUnassignedRegion doesn't make sure existing znode is in the right 
 state
 -

 Key: HBASE-2781
 URL: https://issues.apache.org/jira/browse/HBASE-2781
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Daniel Cryans
Assignee: Karthik Ranganathan
Priority: Critical
 Fix For: 0.21.0



[jira] Updated: (HBASE-2737) CME in ZKW introduced in HBASE-2694

2010-06-18 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan updated HBASE-2737:
---

Attachment: HBASE-2737-0.21.patch

Making the register and unregister methods synchronized. Unit tests are 
passing. This change is so simple I am not putting it up on review board.
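
The CME arises because the ZK event thread iterates the listener list while 
another thread registers or unregisters. A minimal sketch of the pattern 
(illustrative names, not the actual ZooKeeperWrapper code): synchronize the 
mutators and iterate over a snapshot.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the fix pattern, not the actual ZooKeeperWrapper code.
// Mutators are synchronized, and event dispatch walks a snapshot so concurrent
// register/unregister calls cannot invalidate the iterator mid-walk.
final class ListenerRegistry {
    private final List<Runnable> listeners = new ArrayList<>();

    synchronized void register(Runnable l) { listeners.add(l); }

    synchronized void unregister(Runnable l) { listeners.remove(l); }

    // Copy the list under the lock, then run listeners outside it.
    void fireEvent() {
        List<Runnable> snapshot;
        synchronized (this) {
            snapshot = new ArrayList<>(listeners);
        }
        for (Runnable l : snapshot) {
            l.run();
        }
    }
}
```

java.util.concurrent.CopyOnWriteArrayList gives the same snapshot-iteration 
behavior with less ceremony, at the cost of copying the list on every write.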

 CME in ZKW introduced in HBASE-2694
 ---

 Key: HBASE-2737
 URL: https://issues.apache.org/jira/browse/HBASE-2737
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Daniel Cryans
Assignee: Karthik Ranganathan
 Fix For: 0.21.0

 Attachments: HBASE-2737-0.21.patch


 Saw this while tail'ing a log for something else:
 {code}
 2010-06-15 17:30:03,769 ERROR [main-EventThread] 
 zookeeper.ClientCnxn$EventThread(490): Error while calling watcher
 java.util.ConcurrentModificationException
 at 
 java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
 at java.util.AbstractList$Itr.next(AbstractList.java:343)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.process(ZooKeeperWrapper.java:235)
 {code}
 Looks like the listeners list's iterator is used in an unprotected manner.




[jira] Assigned: (HBASE-2695) HMaster cleanup and refactor

2010-06-16 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan reassigned HBASE-2695:
--

Assignee: Karthik Ranganathan

 HMaster cleanup and refactor
 

 Key: HBASE-2695
 URL: https://issues.apache.org/jira/browse/HBASE-2695
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Jonathan Gray
Assignee: Karthik Ranganathan
Priority: Critical
 Fix For: 0.21.0


 Before doing the more significant changes to HMaster, it would benefit 
 greatly from some cleanup, commenting, and a bit of refactoring.
 One motivation is to nail down the initialization flow and comment each step. 
  Another is to add a couple new classes to break up functionality into 
 helpers to reduce HMaster size (for example, pushing all filesystem 
 operations into their own class).  And lastly to stop the practice of passing 
 around references to HMaster everywhere and instead pass along only what is 
 necessary.




[jira] Updated: (HBASE-2694) Move RS to Master region open/close messaging into ZooKeeper

2010-06-11 Thread Karthik Ranganathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Ranganathan updated HBASE-2694:
---

Attachment: HBASE-2694-OPENSOURCE-TRUNK-zk-based-messaging-v2.patch

Second pass at the patch. Incorporates changes from in-person review with Todd 
and Stack. Unit tests pass.

 Move RS to Master region open/close messaging into ZooKeeper
 

 Key: HBASE-2694
 URL: https://issues.apache.org/jira/browse/HBASE-2694
 Project: HBase
  Issue Type: Sub-task
  Components: master, regionserver
Reporter: Jonathan Gray
Priority: Critical
 Fix For: 0.21.0

 Attachments: HBASE-2694-OPENSOURCE-TRUNK-zk-based-messaging-v2.patch, 
 HBASE-2694-OPENSOURCE-TRUNK-zk-based-messaging.patch


 As a first step towards HBASE-2485, this issue is about changing the message 
 flow of opening and closing of regions without actually changing the 
 implementation of what happens on both the Master and RegionServer sides.  
 This way we can debug the messaging changes before the introduction of more 
 significant changes to the master architecture and handling of regions in 
 transition.
