date:20120107


[ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181911#comment-13181911
 ] 

Hudson commented on HBASE-5088:
---

Integrated in HBase-0.92-security #65 (See 
[https://builds.apache.org/job/HBase-0.92-security/65/])
HBASE-5088 A concurrency issue on SoftValueSortedMap (Jieshan Bean and Lars 
H)

larsh : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/SoftValueSortedMap.java


 A concurrency issue on SoftValueSortedMap
 -

 Key: HBASE-5088
 URL: https://issues.apache.org/jira/browse/HBASE-5088
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0, 0.94.0

 Attachments: 5088-final.txt, 5088-final2.txt, 5088-final3.txt, 
 5088-syncObj.txt, 5088-useMapInterfaces.txt, 5088.generics.txt, 
 HBase-5088-90.patch, HBase-5088-trunk.patch, 
 HBase5088-90-replaceSoftValueSortedMap.patch, 
 HBase5088-90-replaceTreeMap.patch, HBase5088-trunk-replaceTreeMap.patch, 
 HBase5088Reproduce.java, PerformanceTestResults.png


 SoftValueSortedMap is backed by a TreeMap. All the methods in this class are 
 synchronized, so adding or deleting elements through these methods is safe.
 But HConnectionManager#getCachedLocation uses headMap to get a view 
 of SoftValueSortedMap#internalMap. Once this view map is operated on 
 (e.g. add/delete) in other threads, a concurrency issue may occur.
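
A minimal sketch of the hazard described above, assuming a simplified stand-in class (LockedSortedMap is hypothetical and is not the HBase SoftValueSortedMap). It contrasts a live headMap view, which is used outside the wrapper's lock, with a snapshot copied while the lock is held. The snapshot is only one possible remedy and is not necessarily what the attached patches do.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for a synchronized wrapper around a TreeMap.
final class LockedSortedMap<K, V> {
  private final TreeMap<K, V> internalMap = new TreeMap<K, V>();

  public synchronized V put(K key, V value) {
    return internalMap.put(key, value);
  }

  public synchronized V remove(K key) {
    return internalMap.remove(key);
  }

  // Unsafe: the lock is released on return, but the returned view keeps
  // reading and writing internalMap directly.
  public synchronized SortedMap<K, V> liveHeadMap(K toKey) {
    return internalMap.headMap(toKey);
  }

  // Safer: copy while holding the lock, so the caller owns a private snapshot.
  public synchronized SortedMap<K, V> headMapSnapshot(K toKey) {
    return new TreeMap<K, V>(internalMap.headMap(toKey));
  }
}
```

With the live view, a put or remove from another thread changes the tree while the caller may still be iterating over it. The snapshot trades an extra copy for isolation from later changes.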

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-4357) Region stayed in transition - in closing state


[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181910#comment-13181910
 ] 

Hudson commented on HBASE-4357:
---

Integrated in HBase-0.92-security #65 (See 
[https://builds.apache.org/job/HBase-0.92-security/65/])
HBASE-4357  Region stayed in transition - in closing state (Ming Ma)

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseMetaHandler.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRootHandler.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java


 Region stayed in transition - in closing state
 --

 Key: HBASE-4357
 URL: https://issues.apache.org/jira/browse/HBASE-4357
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ming Ma
Assignee: Ming Ma
 Fix For: 0.92.0, 0.94.0

 Attachments: 4357.txt, HBASE-4357-0.92.patch


 Got the following during testing:
 1. On a given machine, kill the RS process by its process id. Then kill the HMaster process.
 2. Start the RS first via bin/hbase-daemon.sh --config ./conf start 
 regionserver. Then start the HMaster via bin/hbase-daemon.sh --config ./conf start 
 master.
 One region of a table stayed in the CLOSING state.
 According to ZooKeeper,
 794a6ff17a4de0dd0a19b984ba18eea9 
 miweng_500region,H\xB49X\x10bM\xB1,1315338786464.794a6ff17a4de0dd0a19b984ba18eea9.
  state=CLOSING, ts=Wed Sep 07 17:21:44 PDT 2011 (75701s ago), 
 server=sea-esxi-0,6,1315428682281 
 According to the .META. table, the region has been reassigned from sea-esxi-0 to 
 sea-esxi-4.
 miweng_500region,H\xB49X\x10bM\xB1,1315338786464.794a6ff17a4de0dd0a19b984ba18eea9.
  sea-esxi-4:60030  H\xB49X\x10bM\xB1 I7K\xC6\xA7\xEF\x9D\x90 0 


[jira] [Commented] (HBASE-5041) Major compaction on non existing table does not throw error

2012-01-07 Thread ramkrishna.s.vasudevan (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181909#comment-13181909
 ] 

Hudson commented on HBASE-5041:
---

Integrated in HBase-0.92-security #65 (See 
[https://builds.apache.org/job/HBase-0.92-security/65/])
HBASE-5041  Major compaction on non existing table does not throw error 
(Shrijeet)

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java


 Major compaction on non existing table does not throw error 
 

 Key: HBASE-5041
 URL: https://issues.apache.org/jira/browse/HBASE-5041
 Project: HBase
  Issue Type: Bug
  Components: regionserver, shell
Affects Versions: 0.90.3
Reporter: Shrijeet Paliwal
Assignee: Shrijeet Paliwal
 Fix For: 0.92.0, 0.94.0, 0.90.6

 Attachments: 
 0002-HBASE-5041-Throw-error-if-table-does-not-exist.patch, 
 0002-HBASE-5041-Throw-error-if-table-does-not-exist.patch, 
 0003-HBASE-5041-Throw-error-if-table-does-not-exist.0.90.patch


 The following does not complain even if the table fubar does not exist:
 {code}
 echo "major_compact 'fubar'" | $HBASE_HOME/bin/hbase shell
 {code}
 The downside of this defect is that a major compaction may be silently skipped 
 because of a typo by Ops.
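
The expected behavior can be sketched as a fail-fast existence check before the compaction request is queued. This is an illustration only: CompactionRequestSketch, its in-memory table registry, and the choice of IllegalArgumentException are hypothetical and are not the HBaseAdmin API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical admin facade; it only models the existence check.
final class CompactionRequestSketch {
  private final Set<String> existingTables = new HashSet<String>();
  private final List<String> requested = new ArrayList<String>();

  void createTable(String name) {
    existingTables.add(name);
  }

  // Reject unknown tables instead of silently doing nothing.
  void majorCompact(String tableName) {
    if (!existingTables.contains(tableName)) {
      throw new IllegalArgumentException("Table does not exist: " + tableName);
    }
    requested.add(tableName);
  }

  List<String> requestedCompactions() {
    return requested;
  }
}
```

With a check like this, a mistyped table name surfaces as an error at the call site rather than as a compaction that never ran.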


[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap


[ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181915#comment-13181915
 ] 

ramkrishna.s.vasudevan commented on HBASE-5088:
---

@Lars
I too feel we can backport to 0.90? 


[jira] [Updated] (HBASE-5137) MasterFileSystem.splitLog() should abort even if waitOnSafeMode() throws IOException

2012-01-07 Thread ramkrishna.s.vasudevan (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5137:
--

Attachment: HBASE-5137.patch

 MasterFileSystem.splitLog() should abort even if waitOnSafeMode() throws 
 IOException
 

 Key: HBASE-5137
 URL: https://issues.apache.org/jira/browse/HBASE-5137
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-5137.patch


 I am not sure if this bug was already raised in JIRA.
 In our test cluster we had a scenario where the RS had gone down and 
 ServerShutdownHandler started splitLog.
 But as HDFS was down, the waitOnSafeMode check threw an IOException.
 {code}
 try {
   // If FS is in safe mode, just wait till out of it.
   FSUtils.waitOnSafeMode(conf,
     conf.getInt(HConstants.THREAD_WAKE_FREQUENCY, 1000));
   splitter.splitLog();
 } catch (OrphanHLogAfterSplitException e) {
 {code}
 We catch the exception:
 {code}
 } catch (IOException e) {
   checkFileSystem();
   LOG.error("Failed splitting " + logDir.toString(), e);
 }
 {code}
 So the HLog split itself did not happen. We encountered about 4 regions, 
 recently split on the crashed RS, that were lost.
 Can we abort the Master in such scenarios? Please suggest.
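
A minimal sketch of the behavior being asked for, assuming hypothetical interfaces (LogSplitter, FileSystemCheck and Abortable below are illustrative and are not the HBase types). Instead of only logging the IOException, the caller aborts when the file system check fails, so the failed split is not silently dropped.

```java
import java.io.IOException;

// Hypothetical sketch; the nested interfaces stand in for the real collaborators.
final class SplitLogSketch {
  interface LogSplitter {
    void splitLog() throws IOException;
  }

  interface FileSystemCheck {
    boolean isAvailable();
  }

  interface Abortable {
    void abort(String why, Throwable cause);
  }

  // Returns true if the split succeeded. On failure, aborts the owner
  // when the file system is unavailable.
  static boolean splitOrAbort(LogSplitter splitter, FileSystemCheck fs, Abortable master) {
    try {
      splitter.splitLog();
      return true;
    } catch (IOException e) {
      if (!fs.isAvailable()) {
        master.abort("Failed splitting, file system not available", e);
      }
      return false;
    }
  }
}
```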


[jira] [Commented] (HBASE-5137) MasterFileSystem.splitLog() should abort even if waitOnSafeMode() throws IOException

2012-01-07 Thread ramkrishna.s.vasudevan (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181963#comment-13181963
 ] 

ramkrishna.s.vasudevan commented on HBASE-5137:
---

Patch for 0.90.


[jira] [Commented] (HBASE-5137) MasterFileSystem.splitLog() should abort even if waitOnSafeMode() throws IOException

2012-01-07 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182014#comment-13182014
 ] 

Zhihong Yu commented on HBASE-5137:
---

Minor comment:
{code}
+if (checkFileSystem() && retrySplitting)
+  LOG.info("Retrying failed log splitting " + logDir.toString());
+else {
{code}
Please add braces around the log statement.
I think the above check should go into TRUNK as well (aborting in the case of 
not retrying).

Should we also handle InterruptedException, as TRUNK does ?
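
The decision discussed in this comment can be sketched as a small function with braces on both branches. This is an assumption-laden illustration: RetryDecisionSketch and its Action enum are hypothetical, and the sketch assumes the two conditions are combined with a logical AND (retry only when the file system is healthy and retrying is enabled, abort otherwise).

```java
// Hypothetical sketch of the retry-or-abort decision; not the HBase code.
final class RetryDecisionSketch {
  enum Action { RETRY, ABORT }

  static Action onSplitFailure(boolean fileSystemOk, boolean retrySplitting) {
    if (fileSystemOk && retrySplitting) {
      // The file system is healthy and retries are enabled: try again.
      return Action.RETRY;
    } else {
      // Either the file system is down or retries are disabled: give up loudly.
      return Action.ABORT;
    }
  }
}
```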


[jira] [Commented] (HBASE-4224) Need a flush by regionserver rather than by table option


[ 
https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182038#comment-13182038
 ] 

jirapos...@reviews.apache.org commented on HBASE-4224:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3308/#review4236
---



/src/main/java/org/apache/hadoop/hbase/ServerName.java
https://reviews.apache.org/r/3308/#comment9579

Please enclose the assignment on line 292 in curly braces.



/src/main/java/org/apache/hadoop/hbase/ServerName.java
https://reviews.apache.org/r/3308/#comment9580

Since IPv4 support is built in, I suggest naming this method isValidHost.



/src/main/java/org/apache/hadoop/hbase/ServerName.java
https://reviews.apache.org/r/3308/#comment9581

Why the extra line here ?



/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
https://reviews.apache.org/r/3308/#comment9587

I think threadPool is a better name for this field.



/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
https://reviews.apache.org/r/3308/#comment9582

I think Async is unnecessary here - that's what threads provide.



/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
https://reviews.apache.org/r/3308/#comment9584

Why not call tableExists() directly ?



/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
https://reviews.apache.org/r/3308/#comment9583

Should include the actual name passed in exception message.



/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
https://reviews.apache.org/r/3308/#comment9585

Should read 'Exception parsing server name'



/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
https://reviews.apache.org/r/3308/#comment9586

Since serverToRegionsMap has been created, you can return 
serverToRegionsMap here.



/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
https://reviews.apache.org/r/3308/#comment9588

regions could be empty, right ?



/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
https://reviews.apache.org/r/3308/#comment9589

This and flushAllRegions() are similar.
Can we introduce just one new method which checks whether the list is empty 
to decide what to do ?
i.e. move the check @ line 1403 of HBaseAdmin to the implementation of the 
new method.



/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
https://reviews.apache.org/r/3308/#comment9590

Curly braces, please.



/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
https://reviews.apache.org/r/3308/#comment9591

Do we need to place a try/catch block around line 2795 ?
Currently the first failure would stop subsequent flushes.
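
The concern in the last comment can be sketched as a loop that catches the failure per region and keeps going. This is an illustration under assumptions: FlushAllSketch and its Flushable interface are hypothetical and do not reflect the code under review.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: flush every region, remembering the ones that failed.
final class FlushAllSketch {
  interface Flushable {
    String name();

    void flush() throws IOException;
  }

  // Returns the names of regions whose flush threw an IOException.
  static List<String> flushAll(List<? extends Flushable> regions) {
    List<String> failed = new ArrayList<String>();
    for (Flushable region : regions) {
      try {
        region.flush();
      } catch (IOException e) {
        // Record the failure and continue with the remaining regions.
        failed.add(region.name());
      }
    }
    return failed;
  }
}
```

Returning the failed names lets the caller decide whether to retry or report, while an empty input list simply yields an empty result.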


- Ted


On 2012-01-06 18:48:11, Akash  Ashok wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3308/
bq.  ---
bq.  
bq.  (Updated 2012-01-06 18:48:11)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Flush by RegionServer
bq.  
bq.  
bq.  This addresses bug HBase-4224.
bq.  https://issues.apache.org/jira/browse/HBase-4224
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.  /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1226330 
bq.  /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1226330 
bq.  /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1226330 
bq.  /src/main/java/org/apache/hadoop/hbase/ServerName.java 1226330 
bq.  
bq.  Diff: https://reviews.apache.org/r/3308/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Akash
bq.  
bq.



 Need a flush by regionserver rather than by table option
 

 Key: HBASE-4224
 URL: https://issues.apache.org/jira/browse/HBASE-4224
 Project: HBase
  Issue Type: Bug
  Components: shell
Reporter: stack
Assignee: Akash Ashok
 Attachments: HBase-4224-v2.patch, HBase-4224.patch


 This evening I needed to clean out logs on the cluster. Logs are kept per 
 regionserver. To let go of logs, we need to have all edits emptied from 
 memory. The only flush available is by table or by region. We need to be able 
 to flush the regionserver. Need to add this.


[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap

2012-01-07 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182046#comment-13182046
 ] 

Lars Hofhansl commented on HBASE-5088:
--

OK. Sorry for presuming. I'll commit to 0.90 later today.





[jira] [Updated] (HBASE-5141) Memory leak in MonitoredRPCHandlerImpl

2012-01-07 Thread Jean-Daniel Cryans (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans updated HBASE-5141:
--

Assignee: Jean-Daniel Cryans
  Status: Patch Available  (was: Open)

 Memory leak in MonitoredRPCHandlerImpl
 --

 Key: HBASE-5141
 URL: https://issues.apache.org/jira/browse/HBASE-5141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-5141-v2.patch, HBASE-5141.patch, Screen Shot 
 2012-01-06 at 3.03.09 PM.png


 I got a pretty reliable way of OOME'ing my region servers. Using a big 
 payload (64MB in my case), a default heap and the default number of handlers, 
 it doesn't take long before every MonitoredRPCHandlerImpl holds on to a 64MB 
 reference, and once a compaction kicks in it kills everything.
 The issue is that even after the RPC call is done, the packet still lives in 
 MonitoredRPCHandlerImpl.
 Will attach a screen shot of jprofiler's analysis in a moment.
 This is a blocker for 0.92.0; anyone using a high number of handlers and 
 biggish values will run their region servers out of memory.


[jira] [Updated] (HBASE-5141) Memory leak in MonitoredRPCHandlerImpl

2012-01-07 Thread Jean-Daniel Cryans (Updated) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Jean-Daniel Cryans updated HBASE-5141:
--

Attachment: HBASE-5141-v2.patch

This second patch survives my little test. What I was missing was that the
packet also contains a reference to the params, so I have to clear out both
(that was a bit confusing).
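
The idea of clearing out both references can be sketched as follows. This is a hypothetical illustration: CallStatusSketch and its method names are invented for the example and are not the MonitoredRPCHandlerImpl API or the attached patch.

```java
// Hypothetical per-handler status object that tracks the call in progress.
final class CallStatusSketch {
  private Object packet;    // raw request as received
  private Object[] params;  // decoded arguments, possibly also reachable from packet

  void beginCall(Object packet, Object[] params) {
    this.packet = packet;
    this.params = params;
  }

  // Drop both references once the call is done so the payload can be collected.
  void endCall() {
    this.packet = null;
    this.params = null;
  }

  boolean holdsPayload() {
    return packet != null || params != null;
  }
}
```

Clearing only one of the two fields would still keep the payload reachable through the other, which matches the observation that both have to be reset.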


[jira] [Commented] (HBASE-5137) MasterFileSystem.splitLog() should abort even if waitOnSafeMode() throws IOException

2012-01-07 Thread ramkrishna.s.vasudevan (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182060#comment-13182060
 ] 

ramkrishna.s.vasudevan commented on HBASE-5137:
---

@Ted
In trunk we sleep for a configured time and hence we handle 
InterruptedException. But I think that is also not needed, as in trunk once we 
know the file system is not available we do Runtime.halt(). If the file system is 
available, why do we need to sleep for some time and then retry?



[jira] [Commented] (HBASE-5141) Memory leak in MonitoredRPCHandlerImpl

2012-01-07 Thread Hadoop QA (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182070#comment-13182070
]

Hadoop QA commented on HBASE-5141:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12509797/HBASE-5141-v2.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified
tests.
Please justify why no new tests are needed for this
patch.
Also please list what manual steps were performed to
verify this patch.

-1 javadoc. The javadoc tool appears to have generated -151 warning
messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 79 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.mapreduce.TestImportTsv
org.apache.hadoop.hbase.mapred.TestTableMapReduce
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/696//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/696//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/696//console

This message is automatically generated.


[jira] [Commented] (HBASE-5141) Memory leak in MonitoredRPCHandlerImpl


[ 
https://issues.apache.org/jira/browse/HBASE-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182103#comment-13182103
 ] 

stack commented on HBASE-5141:
--

+1 on the patch.  Small.


[jira] [Commented] (HBASE-5137) MasterFileSystem.splitLog() should abort even if waitOnSafeMode() throws IOException


[ 
https://issues.apache.org/jira/browse/HBASE-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182111#comment-13182111
 ] 

Zhihong Yu commented on HBASE-5137:
---

Nicolas might know the reason for introducing 
hbase.hlog.split.failure.retry.interval parameter

Please provide a patch for 0.92 and TRUNK which adds check for retrySplitting 
in the following if statement (line 220):
{code}
if (!checkFileSystem()) {
{code}


[jira] [Commented] (HBASE-4357) Region stayed in transition - in closing state

2012-01-07 Thread Zhihong Yu (Issue Comment Edited) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182114#comment-13182114
 ] 

stack commented on HBASE-4357:
--

+1 Nice patch Ming.

Regarding the conversation above on what to do if the HRS can't close a region, 
I'd say let's go basic for now and crash out the HRS and let 
ServerShutdownHandler make sense of it all. Not the TimeoutMonitor, IMO. TM at 
the moment is way too heavy-handed. It needs to be made more of a butterfly 
than a bulldozer before we let it do closing fixups.

Good stuff.

 Region stayed in transition - in closing state
 --

 Key: HBASE-4357
 URL: https://issues.apache.org/jira/browse/HBASE-4357
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ming Ma
Assignee: Ming Ma
 Fix For: 0.92.0, 0.94.0

 Attachments: 4357.txt, HBASE-4357-0.92.patch


 Got the following during testing:
 1. On a given machine, kill the RS process. Then kill the HMaster process.
 2. Start the RS first via bin/hbase-daemon.sh --config ./conf start 
 regionserver. Then start HMaster via bin/hbase-daemon.sh --config ./conf start 
 master.
 One region of a table stayed in the CLOSING state.
 According to ZooKeeper,
 794a6ff17a4de0dd0a19b984ba18eea9 
 miweng_500region,H\xB49X\x10bM\xB1,1315338786464.794a6ff17a4de0dd0a19b984ba18eea9.
  state=CLOSING, ts=Wed Sep 07 17:21:44 PDT 2011 (75701s ago), 
 server=sea-esxi-0,6,1315428682281 
 According to the .META. table, the region has been reassigned from sea-esxi-0 
 to sea-esxi-4.
 miweng_500region,H\xB49X\x10bM\xB1,1315338786464.794a6ff17a4de0dd0a19b984ba18eea9.
  sea-esxi-4:60030  H\xB49X\x10bM\xB1 I7K\xC6\xA7\xEF\x9D\x90 0 


[jira] [Issue Comment Edited] (HBASE-5137) MasterFileSystem.splitLog() should abort even if waitOnSafeMode() throws IOException


[ 
https://issues.apache.org/jira/browse/HBASE-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182111#comment-13182111
 ] 

Zhihong Yu edited comment on HBASE-5137 at 1/7/12 10:04 PM:


Nicolas might know the reason for introducing 
hbase.hlog.split.failure.retry.interval parameter

  was (Author: zhi...@ebaysf.com):
Nicolas might know the reason for introducing 
hbase.hlog.split.failure.retry.interval parameter

Please provide a patch for 0.92 and TRUNK which adds check for retrySplitting 
in the following if statement (line 220):
{code}
if (!checkFileSystem()) {
{code}
  
 MasterFileSystem.splitLog() should abort even if waitOnSafeMode() throws 
 IOException
 

 Key: HBASE-5137
 URL: https://issues.apache.org/jira/browse/HBASE-5137
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Attachments: 5137-trunk.txt, HBASE-5137.patch



[jira] [Updated] (HBASE-5137) MasterFileSystem.splitLog() should abort even if waitOnSafeMode() throws IOException

2012-01-07 Thread Zhihong Yu (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5137:
--

Attachment: 5137-trunk.txt

Suggested patch for TRUNK.

 MasterFileSystem.splitLog() should abort even if waitOnSafeMode() throws 
 IOException
 

 Key: HBASE-5137
 URL: https://issues.apache.org/jira/browse/HBASE-5137
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Attachments: 5137-trunk.txt, HBASE-5137.patch



[jira] [Commented] (HBASE-5138) [ref manual] Add a discussion on the number of regions

[
https://issues.apache.org/jira/browse/HBASE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182117#comment-13182117
]

stack commented on HBASE-5138:
--

+1 on nice doc.

[ref manual] Add a discussion on the number of regions
--

Key: HBASE-5138
URL: https://issues.apache.org/jira/browse/HBASE-5138
Project: HBase
Issue Type: Task
Reporter: Jean-Daniel Cryans

ntelford on IRC made the good point that we say people shouldn't have too
many regions, but we don't say why. His problem currently is:
{quote}
09:21 ntelford problem is, if you're running MR jobs on a subset of that
data, you need the regions to be as small as possible otherwise tasks don't
get allocated in parallel much
09:22 ntelford so we've found we have to strike a balance between keeping
them small for MR and keeping them large for HBase to behave well
09:22 ntelford we erred on the side of smaller regions because our MR
issues were more immediate - we couldn't find any documentation or anecdotal
evidence as to why HBase doesn't like lots of regions
{quote}
The three main issues I can think of when having too many regions are:
- MSLAB requires 2MB per memstore (that's 2MB per family per region). 1000
regions that have 2 families each is 3.9GB of heap used, and it's not even
storing data yet. NB: the 2MB value is configurable.
- if you fill all the regions at roughly the same rate, the global memstore
usage forces tiny flushes when you have too many regions, which in turn
generates compactions. Rewriting the same data tens of times is the last
thing you want. As an example, fill 1000 regions (with one family) equally
and consider a lower bound for global memstore usage of 5GB (the region
server would have a big heap). Once usage reaches 5GB, it will force-flush
the biggest region; at that point almost all regions should have about 5MB
of data, so it would flush that amount. 5MB of inserts later, it would flush
another region that will now have a bit over 5MB of data, and so on.
- the new master is allergic to tons of regions, and will take a lot of time
assigning them and moving them around in batches. The reason is that it's
heavy on ZK usage, and it's not very async at the moment (could really be
improved).
Another issue is the effect of the number of regions on mapreduce jobs.
Keeping 5 regions per RS would be too low for a job, whereas 1000 will
generate too many maps. This comes back to ntelford's problem of needing to
scan portions of tables. To solve his problem, we discussed using a custom
input format that generates many splits per region.
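
The heap and flush figures quoted in the list above can be reproduced with a few lines of arithmetic. The Java sketch below is only a back-of-the-envelope calculation using the example values from this discussion (2MB MSLAB chunk, 1000 regions, 2 families, 5GB global memstore bound); the class and method names are made up for illustration and do not correspond to any HBase API.

```java
// Back-of-the-envelope numbers from the discussion above.
// All inputs are the example values quoted in the text, not measured data.
final class RegionCountMath {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /** Heap reserved by MSLAB before any data is stored: one chunk per memstore. */
    static long mslabOverheadBytes(int regions, int familiesPerRegion, long chunkBytes) {
        return (long) regions * familiesPerRegion * chunkBytes;
    }

    /** Approximate flush size when all regions fill evenly up to the global bound. */
    static long evenFlushBytes(long globalMemstoreBytes, int regions) {
        return globalMemstoreBytes / regions;
    }

    public static void main(String[] args) {
        long overhead = mslabOverheadBytes(1000, 2, 2 * MB);
        System.out.println("MSLAB overhead in GB: " + (double) overhead / GB);

        long flush = evenFlushBytes(5 * GB, 1000);
        System.out.println("Per-region flush in MB: " + (double) flush / MB);
    }
}
```

With these inputs the overhead is 4000MB, i.e. the roughly 3.9GB mentioned above, and each forced flush writes only about 5MB, which is what drives the extra compactions.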

[jira] [Updated] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-07 Thread Mikhail Bautin (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-4218:
--

Attachment: Delta-encoding.patch-2012-01-07_14_12_48.patch

Attaching a patch rebased on trunk changes.

 Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
 ---

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Fix For: 0.94.0

 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, 4218-v16.txt, 4218.txt, D447.1.patch, 
 D447.10.patch, D447.11.patch, D447.12.patch, D447.13.patch, D447.14.patch, 
 D447.15.patch, D447.16.patch, D447.17.patch, D447.18.patch, D447.19.patch, 
 D447.2.patch, D447.20.patch, D447.3.patch, D447.4.patch, D447.5.patch, 
 D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, 
 Data-block-encoding-2011-12-23.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta-encoding.patch-2012-01-05_15_16_43.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44_copy.patch, 
 Delta-encoding.patch-2012-01-05_18_50_47.patch, 
 Delta-encoding.patch-2012-01-07_14_12_48.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in an HFile and are usually very 
 similar. Because of that, it is possible to design better compression than 
 general-purpose algorithms provide.
 It is an additional step designed to be used in memory. It aims to save 
 memory in the cache as well as to speed up seeks within HFileBlocks. It should 
 improve performance a lot if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when the value is a counter.
 Initial tests on real data (key length = ~90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 while having much better performance (decompression 20-80% faster than 
 LZO). Moreover, it should allow far more efficient seeking, which should 
 improve performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to HFileBlock / HFileReader Scanner to provide 
 seeking and iterating; access to the uncompressed buffer in HFileBlock will 
 have bad performance
 - extend comparators to support comparison assuming that the first N bytes 
 are equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression
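
As a rough illustration of the prefix compression idea described above, the following Java sketch stores each sorted key as the number of leading bytes it shares with the previous key plus the remaining suffix. It is a simplified, assumed scheme for explanation only: the PrefixKeySketch class and its Entry type are hypothetical and are not the encoders from the attached patches, which also handle values, timestamps and bitfields.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified prefix encoding of sorted keys; not the HBase encoders.
final class PrefixKeySketch {
    /** One encoded key: bytes shared with the previous key, plus the differing suffix. */
    static final class Entry {
        final int sharedPrefixLength;
        final byte[] suffix;
        Entry(int sharedPrefixLength, byte[] suffix) {
            this.sharedPrefixLength = sharedPrefixLength;
            this.suffix = suffix;
        }
    }

    static int commonPrefix(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        int i = 0;
        while (i < n && a[i] == b[i]) i++;
        return i;
    }

    /** Encodes keys in order; each entry is relative to the key before it. */
    static List<Entry> encode(List<byte[]> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (byte[] key : sortedKeys) {
            int shared = commonPrefix(prev, key);
            out.add(new Entry(shared, Arrays.copyOfRange(key, shared, key.length)));
            prev = key;
        }
        return out;
    }

    /** Rebuilds the original keys by replaying the entries in order. */
    static List<byte[]> decode(List<Entry> entries) {
        List<byte[]> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (Entry e : entries) {
            byte[] key = new byte[e.sharedPrefixLength + e.suffix.length];
            System.arraycopy(prev, 0, key, 0, e.sharedPrefixLength);
            System.arraycopy(e.suffix, 0, key, e.sharedPrefixLength, e.suffix.length);
            out.add(key);
            prev = key;
        }
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> keys = new ArrayList<>();
        for (String s : new String[] {"row0001/cf:a", "row0001/cf:b", "row0002/cf:a"}) {
            keys.add(s.getBytes(StandardCharsets.UTF_8));
        }
        for (Entry e : encode(keys)) {
            System.out.println(e.sharedPrefixLength + " shared, suffix="
                + new String(e.suffix, StandardCharsets.UTF_8));
        }
    }
}
```

Because each entry already records how many leading bytes match the previous key, a comparator that knows the first N bytes are equal can skip them, which is the second design change listed above.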


[jira] [Commented] (HBASE-5141) Memory leak in MonitoredRPCHandlerImpl


[ 
https://issues.apache.org/jira/browse/HBASE-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182120#comment-13182120
 ] 

stack commented on HBASE-5141:
--

Let me commit

 Memory leak in MonitoredRPCHandlerImpl
 --

 Key: HBASE-5141
 URL: https://issues.apache.org/jira/browse/HBASE-5141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-5141-v2.patch, HBASE-5141.patch, Screen Shot 
 2012-01-06 at 3.03.09 PM.png


 I have a pretty reliable way of OOME'ing my region servers. Using a big 
 payload (64MB in my case), a default heap and the default number of handlers, 
 it doesn't take long before every MonitoredRPCHandlerImpl holds on to a 64MB 
 reference, and once a compaction kicks in it kills everything.
 The issue is that even after the RPC call is done, the packet still lives in 
 MonitoredRPCHandlerImpl.
 Will attach a screen shot of jprofiler's analysis in a moment.
 This is a blocker for 0.92.0; anyone using a high number of handlers and 
 biggish values will OOME their region servers.
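
As an illustration of the leak described above, the following Java sketch shows a status object that keeps a reference to the request packet and releases it when the call completes. The class and method names are hypothetical; this is not the MonitoredRPCHandlerImpl code or the attached patch, only one plausible shape of such a fix.

```java
// Hypothetical stand-in for a per-handler status object; not HBase code.
final class RpcStatusSketch {
    private byte[] packet;            // reference to the (possibly huge) request payload
    private String state = "WAITING";

    /** Called when a handler starts serving a call. */
    void setRpcPacket(byte[] packet) {
        this.packet = packet;
        this.state = "RUNNING";
    }

    /** Called when the call is done: drop the payload so the GC can reclaim it. */
    void markComplete() {
        this.packet = null;
        this.state = "COMPLETE";
    }

    boolean holdsPacket() { return packet != null; }

    String state() { return state; }

    public static void main(String[] args) {
        RpcStatusSketch status = new RpcStatusSketch();
        status.setRpcPacket(new byte[64]);   // 64 bytes here; 64MB in the report
        status.markComplete();
        System.out.println(status.state() + ", holds packet: " + status.holdsPacket());
    }
}
```

Without the null assignment in markComplete, every idle handler would keep its last payload reachable, which matches the behaviour reported here.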


[jira] [Updated] (HBASE-5141) Memory leak in MonitoredRPCHandlerImpl


 [ 
https://issues.apache.org/jira/browse/HBASE-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5141:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

 Memory leak in MonitoredRPCHandlerImpl
 --

 Key: HBASE-5141
 URL: https://issues.apache.org/jira/browse/HBASE-5141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-5141-v2.patch, HBASE-5141.patch, Screen Shot 
 2012-01-06 at 3.03.09 PM.png



[jira] [Updated] (HBASE-5112) TestReplication#queueFailover flaky due to potentially uninitialized Scan


 [ 
https://issues.apache.org/jira/browse/HBASE-5112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5112:
-

   Resolution: Fixed
Fix Version/s: (was: 0.94.0)
   Status: Resolved  (was: Patch Available)

This was committed a while back.

 TestReplication#queueFailover flaky due to potentially uninitialized Scan
 -

 Key: HBASE-5112
 URL: https://issues.apache.org/jira/browse/HBASE-5112
 Project: HBase
  Issue Type: Test
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.92.0

 Attachments: 5112-v2.txt, hbase-5112.patch, 
 org.apache.hadoop.hbase.replication.TestReplication-output.txt


 In TestReplication#queueFailover, the second Scan is not reset for each new 
 scan, so a following scan may not be able to scan the whole table.
 It then cannot get all the data and the test fails.


[jira] [Commented] (HBASE-5141) Memory leak in MonitoredRPCHandlerImpl


[ 
https://issues.apache.org/jira/browse/HBASE-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182122#comment-13182122
 ] 

stack commented on HBASE-5141:
--

Committed 0.92 and trunk.

 Memory leak in MonitoredRPCHandlerImpl
 --

 Key: HBASE-5141
 URL: https://issues.apache.org/jira/browse/HBASE-5141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-5141-v2.patch, HBASE-5141.patch, Screen Shot 
 2012-01-06 at 3.03.09 PM.png



[jira] [Updated] (HBASE-4357) Region stayed in transition - in closing state


 [ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4357:
-

   Resolution: Fixed
Fix Version/s: (was: 0.94.0)
   Status: Resolved  (was: Patch Available)

Was committed by Ted a day or so ago.  Resolving.

 Region stayed in transition - in closing state
 --

 Key: HBASE-4357
 URL: https://issues.apache.org/jira/browse/HBASE-4357
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ming Ma
Assignee: Ming Ma
 Fix For: 0.92.0

 Attachments: 4357.txt, HBASE-4357-0.92.patch



[jira] [Updated] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-07 Thread Phabricator (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4218:
---

Attachment: D447.21.patch

mbautin updated the revision [jira] [HBASE-4218] HFile data block encoding 
framework and delta encoding implementation.
Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

  Fixing the -encode_in_cache_only option of LoadTestTool (it is still 
encode_in_cache_only, even though we use ENCODE_ON_DISK in the column 
family), and rebasing on most recent trunk changes. Unit tests still pass.

REVISION DETAIL
  https://reviews.facebook.net/D447

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
  src/main/java/org/apache/hadoop/hbase/HConstants.java
  src/main/java/org/apache/hadoop/hbase/KeyValue.java
  src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
  src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
  src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
  src/main/ruby/hbase/admin.rb
  src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
  src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
  src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
  src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
  src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java

[jira] [Commented] (HBASE-5141) Memory leak in MonitoredRPCHandlerImpl

2012-01-07 Thread Mikhail Bautin (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182127#comment-13182127
 ] 

Mikhail Bautin commented on HBASE-5141:
---

FYI: The build is broken in trunk because of this patch.

 Memory leak in MonitoredRPCHandlerImpl
 --

 Key: HBASE-5141
 URL: https://issues.apache.org/jira/browse/HBASE-5141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-5141-v2.patch, HBASE-5141.patch, Screen Shot 
 2012-01-06 at 3.03.09 PM.png



[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-07 Thread Hadoop QA (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182129#comment-13182129
]

Hadoop QA commented on HBASE-4218:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12509806/D447.21.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 133 new or modified tests.

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/697//console

This message is automatically generated.

Data Block Encoding of KeyValues (aka delta encoding / prefix compression)
---

Key: HBASE-4218
URL: https://issues.apache.org/jira/browse/HBASE-4218
Project: HBase
Issue Type: Improvement
Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
Labels: compression
Fix For: 0.94.0

Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch,
0001-Delta-encoding.patch, 4218-v16.txt, 4218.txt, D447.1.patch,
D447.10.patch, D447.11.patch, D447.12.patch, D447.13.patch, D447.14.patch,
D447.15.patch, D447.16.patch, D447.17.patch, D447.18.patch, D447.19.patch,
D447.2.patch, D447.20.patch, D447.21.patch, D447.3.patch, D447.4.patch,
D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch,
Data-block-encoding-2011-12-23.patch,
Delta-encoding.patch-2011-12-22_11_52_07.patch,
Delta-encoding.patch-2012-01-05_15_16_43.patch,
Delta-encoding.patch-2012-01-05_16_31_44.patch,
Delta-encoding.patch-2012-01-05_16_31_44_copy.patch,
Delta-encoding.patch-2012-01-05_18_50_47.patch,
Delta-encoding.patch-2012-01-07_14_12_48.patch,
Delta_encoding_with_memstore_TS.patch, open-source.diff


[jira] [Updated] (HBASE-5052) The path where a dynamically loaded coprocessor jar is copied on the local file system depends on the region name (and implicitly, the start key)


 [ 
https://issues.apache.org/jira/browse/HBASE-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5052:
-

   Resolution: Fixed
Fix Version/s: (was: 0.94.0)
   Status: Resolved  (was: Patch Available)

Committed branch and trunk.  Thanks for the patch Andrei.

 The path where a dynamically loaded coprocessor jar is copied on the local 
 file system depends on the region name (and implicitly, the start key)
 -

 Key: HBASE-5052
 URL: https://issues.apache.org/jira/browse/HBASE-5052
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Andrei Dragomir
Assignee: Andrei Dragomir
 Fix For: 0.92.0

 Attachments: HBASE-5052.patch


 When loading a coprocessor from HDFS, the jar file gets copied to a path on 
 the local filesystem that depends on the region name and the region start 
 key. The name is cleaned, but not enough, so when it contains 
 filesystem-unfriendly characters (/?:, etc.), the coprocessor is not loaded 
 and an error is thrown.
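
To illustrate the kind of cleaning the description refers to, here is a small Java sketch that replaces every character outside a conservative whitelist before the name is used in a local path. The class name, method name and the whitelist itself are assumptions made for this example; the committed patch may solve the problem differently, for instance by not deriving the path from the region name at all.

```java
// Hypothetical helper; not the code from HBASE-5052.patch.
final class JarPathSketch {
    /**
     * Replaces every character that is not a letter, digit, '_', '.' or '-'
     * with '_' so the result is safe to use as a single path component.
     */
    static String sanitize(String regionName) {
        return regionName.replaceAll("[^A-Za-z0-9_.-]", "_");
    }

    public static void main(String[] args) {
        // A made-up region name containing filesystem-unfriendly characters.
        System.out.println(sanitize("t1,a/b?c:d,123.abc."));
    }
}
```

One caveat of this approach is that two different region names can map to the same sanitized string, so a real implementation would likely need an extra unique component in the file name.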


[jira] [Updated] (HBASE-5103) Fix improper master znode deserialization


 [ 
https://issues.apache.org/jira/browse/HBASE-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5103:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed a while back.  Resolving.

 Fix improper master znode deserialization
 -

 Key: HBASE-5103
 URL: https://issues.apache.org/jira/browse/HBASE-5103
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Minor
 Fix For: 0.92.0, 0.94.0

 Attachments: hbase-5103.patch


 In ActiveMasterManager#blockUntilBecomingActiveMaster the master znode is 
 created as a versioned serialized version of ServerName
 {code}
  if (ZKUtil.createEphemeralNodeAndWatch(this.watcher,
   this.watcher.masterAddressZNode, sn.getVersionedBytes())) {
 {code}
 There are a few user visible places where it is used but not deserialized 
 properly.
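
The mismatch described above can be sketched as follows: if a name is written with a version prefix, a reader that decodes the raw bytes directly as a string gets the prefix mixed into the name. The Java example below assumes a 2-byte version prefix purely for illustration; the VersionedNameSketch class, its methods and the prefix layout are hypothetical and are not the actual ServerName implementation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical versioned encoding; the real ServerName format may differ.
final class VersionedNameSketch {
    static final short VERSION = 0;   // assumed version value

    /** Writes a 2-byte version followed by the UTF-8 bytes of the name. */
    static byte[] toVersionedBytes(String serverName) {
        byte[] name = serverName.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(2 + name.length).putShort(VERSION).put(name).array();
    }

    /** Checks the version prefix, then decodes only the bytes after it. */
    static String parseVersionedBytes(byte[] data) {
        short version = ByteBuffer.wrap(data).getShort();
        if (version != VERSION) {
            throw new IllegalArgumentException("Unexpected version " + version);
        }
        return new String(data, 2, data.length - 2, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] znodeData = toVersionedBytes("host.example.org,60000,1325973600000");
        // Wrong: treats the version prefix as part of the name.
        String naive = new String(znodeData, StandardCharsets.UTF_8);
        // Right: strips and validates the prefix first.
        String parsed = parseVersionedBytes(znodeData);
        System.out.println("naive length=" + naive.length() + ", parsed=" + parsed);
    }
}
```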


[jira] [Commented] (HBASE-5141) Memory leak in MonitoredRPCHandlerImpl

2012-01-07 Thread Mikhail Bautin (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182132#comment-13182132
 ] 

Mikhail Bautin commented on HBASE-5141:
---

I get the error shown at http://pastebin.com/AdAp0M35 when trying to build the 
following commit: 

Author: stack stack@13f79535-47bb-0310-9956-ffa450edef68
Date:   Sat Jan 7 14:16:11 2012

HBASE-5141 Memory leak in MonitoredRPCHandlerImpl

git-svn-id: http://svn.apache.org/repos/asf/hbase/trunk@1228740


 Memory leak in MonitoredRPCHandlerImpl
 --

 Key: HBASE-5141
 URL: https://issues.apache.org/jira/browse/HBASE-5141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-5141-v2.patch, HBASE-5141.patch, Screen Shot 
 2012-01-06 at 3.03.09 PM.png


 I have a pretty reliable way of OOME'ing my region servers. Using a big 
 payload (64MB in my case), a default heap and the default number of handlers, 
 it doesn't take long before every MonitoredRPCHandlerImpl holds on to a 64MB 
 reference, and once a compaction kicks in it kills everything.
 The issue is that even after the RPC call is done, the packet still lives in 
 MonitoredRPCHandlerImpl.
 Will attach a screenshot of JProfiler's analysis in a moment.
 This is a blocker for 0.92.0: anyone using a high number of handlers and 
 biggish values will OOME their region servers.
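
 The leak pattern can be sketched as follows. This is a simplified 
 illustration, not the committed patch: HandlerStatus, startCall, markComplete 
 and holdsPacket are hypothetical names standing in for the monitoring object 
 and its lifecycle.

```java
// Hypothetical stand-in for a per-handler monitoring object.
final class HandlerStatus {
    private Object packet;           // request payload kept for status display
    private String state = "WAITING";

    synchronized void startCall(Object requestPacket) {
        this.packet = requestPacket; // the handler now pins the payload in memory
        this.state = "RUNNING";
    }

    synchronized void markComplete() {
        this.state = "COMPLETE";
        // Without this line the payload stays reachable until the next call
        // overwrites it, so N idle handlers can pin N large payloads.
        this.packet = null;
    }

    synchronized boolean holdsPacket() {
        return packet != null;
    }

    synchronized String getState() {
        return state;
    }
}
```

 With long-lived handler objects, any field that references the request has to 
 be cleared when the call finishes, otherwise the retained memory scales with 
 payload size times handler count.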


[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-07 Thread Mikhail Bautin (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182134#comment-13182134
 ] 

Mikhail Bautin commented on HBASE-4218:
---

@Ted: I was running a load test with LZO compression and PREFIX encoding and 
everything was fine, but then I switched to encoding in cache only and 
compactions started failing. I need to look into this.

 Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
 ---

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Fix For: 0.94.0

 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, 4218-v16.txt, 4218.txt, D447.1.patch, 
 D447.10.patch, D447.11.patch, D447.12.patch, D447.13.patch, D447.14.patch, 
 D447.15.patch, D447.16.patch, D447.17.patch, D447.18.patch, D447.19.patch, 
 D447.2.patch, D447.20.patch, D447.21.patch, D447.3.patch, D447.4.patch, 
 D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, 
 Data-block-encoding-2011-12-23.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta-encoding.patch-2012-01-05_15_16_43.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44_copy.patch, 
 Delta-encoding.patch-2012-01-05_18_50_47.patch, 
 Delta-encoding.patch-2012-01-07_14_12_48.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in HFile and they are usually very 
 similar. Because of that, it is possible to design better compression than 
 general-purpose algorithms.
 It is an additional step designed to be used in memory. It aims to save 
 memory in cache as well as to speed up seeks within HFileBlocks. It should 
 improve performance a lot if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when the value is a counter.
 Initial tests on real data (key length = ~90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 while having much better performance (20-80% faster decompression than 
 LZO). Moreover, it should allow far more efficient seeking, which should 
 improve performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to HFileBlock / HFileReader Scanner to provide 
 seeking and iterating; access to the uncompressed buffer in HFileBlock will 
 have bad performance
 - extend comparators to support comparison assuming that the first N bytes 
 are equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression
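
 The prefix-compression idea for sorted keys can be sketched as below. This is 
 a toy illustration under stated assumptions, not the encoder from any of the 
 attached patches: keys are modelled as Strings rather than byte arrays, and 
 PrefixCodec and Entry are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy codec: each key is stored as the number of leading
// characters it shares with the previous key, plus the remaining suffix.
final class PrefixCodec {
    static final class Entry {
        final int shared;
        final String suffix;
        Entry(int shared, String suffix) {
            this.shared = shared;
            this.suffix = suffix;
        }
    }

    static List<Entry> encode(List<String> sortedKeys) {
        List<Entry> out = new ArrayList<Entry>();
        String prev = "";
        for (String key : sortedKeys) {
            int max = Math.min(prev.length(), key.length());
            int shared = 0;
            while (shared < max && prev.charAt(shared) == key.charAt(shared)) {
                shared++;
            }
            out.add(new Entry(shared, key.substring(shared)));
            prev = key;
        }
        return out;
    }

    static List<String> decode(List<Entry> entries) {
        List<String> out = new ArrayList<String>();
        String prev = "";
        for (Entry e : entries) {
            // Rebuild the key from the previous key's prefix and the stored suffix.
            String key = prev.substring(0, e.shared) + e.suffix;
            out.add(key);
            prev = key;
        }
        return out;
    }
}
```

 Because adjacent sorted keys tend to share long prefixes, most entries store 
 only a short suffix. The stored shared length is also what makes the 
 "first N bytes are equal" comparator shortcut possible: a seek can skip the 
 bytes already known to match.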


[jira] [Commented] (HBASE-5141) Memory leak in MonitoredRPCHandlerImpl

2012-01-07 Thread Mikhail Bautin (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182136#comment-13182136
 ] 

Mikhail Bautin commented on HBASE-5141:
---

Correction: use this svn command:

svn diff http://svn.apache.org/repos/asf/hbase/trunk -r1228739


 Memory leak in MonitoredRPCHandlerImpl
 --

 Key: HBASE-5141
 URL: https://issues.apache.org/jira/browse/HBASE-5141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-5141-v2.patch, HBASE-5141.patch, Screen Shot 
 2012-01-06 at 3.03.09 PM.png



[jira] [Commented] (HBASE-5141) Memory leak in MonitoredRPCHandlerImpl

2012-01-07 Thread Mikhail Bautin (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182135#comment-13182135
 ] 

Mikhail Bautin commented on HBASE-5141:
---

Actually, the committed patch contains more stuff than the patch attached to 
the JIRA:

svn diff http://svn.apache.org/repos/asf/hbase/trunk -r1228740

Was the security version of the patch committed into trunk or something?

 Memory leak in MonitoredRPCHandlerImpl
 --

 Key: HBASE-5141
 URL: https://issues.apache.org/jira/browse/HBASE-5141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-5141-v2.patch, HBASE-5141.patch, Screen Shot 
 2012-01-06 at 3.03.09 PM.png



[jira] [Updated] (HBASE-5134) Remove getRegionServerWithoutRetries and getRegionServerWithRetries from HConnection Interface


 [ 
https://issues.apache.org/jira/browse/HBASE-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5134:
-

Status: Open  (was: Patch Available)

 Remove getRegionServerWithoutRetries and getRegionServerWithRetries from 
 HConnection Interface
 --

 Key: HBASE-5134
 URL: https://issues.apache.org/jira/browse/HBASE-5134
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: stack
 Attachments: 5134-v2.txt, 5134-v3.txt


 It's broken having these meta methods in HConnection.  They take 
 ServerCallables, which themselves inevitably have HConnections.   It makes for 
 a tangle in the model and frustrates being able to do mocked implementations 
 of HConnection.  These methods better belong in something like 
 HConnectionManager, or elsewhere altogether.


[jira] [Updated] (HBASE-5134) Remove getRegionServerWithoutRetries and getRegionServerWithRetries from HConnection Interface


 [ 
https://issues.apache.org/jira/browse/HBASE-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5134:
-

Attachment: 5134-v3.txt

v3 is the same as v2 except for a one-line change in TestAssignmentManager 
where I change the generic params on a mocked method to be specific (the 
commit of Ming's new closeRegion method broke this).   All good now.

 Remove getRegionServerWithoutRetries and getRegionServerWithRetries from 
 HConnection Interface
 --

 Key: HBASE-5134
 URL: https://issues.apache.org/jira/browse/HBASE-5134
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: stack
 Fix For: 0.94.0

 Attachments: 5134-v2.txt, 5134-v3.txt



[jira] [Updated] (HBASE-5134) Remove getRegionServerWithoutRetries and getRegionServerWithRetries from HConnection Interface


 [ 
https://issues.apache.org/jira/browse/HBASE-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5134:
-

Fix Version/s: 0.94.0
 Hadoop Flags: Reviewed
   Status: Patch Available  (was: Open)

Trying against hadoopqa to see what is broken.

 Remove getRegionServerWithoutRetries and getRegionServerWithRetries from 
 HConnection Interface
 --

 Key: HBASE-5134
 URL: https://issues.apache.org/jira/browse/HBASE-5134
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: stack
 Fix For: 0.94.0

 Attachments: 5134-v2.txt, 5134-v3.txt



[jira] [Commented] (HBASE-5141) Memory leak in MonitoredRPCHandlerImpl


[ 
https://issues.apache.org/jira/browse/HBASE-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182139#comment-13182139
 ] 

stack commented on HBASE-5141:
--

Fixing

 Memory leak in MonitoredRPCHandlerImpl
 --

 Key: HBASE-5141
 URL: https://issues.apache.org/jira/browse/HBASE-5141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-5141-v2.patch, HBASE-5141.patch, Screen Shot 
 2012-01-06 at 3.03.09 PM.png



[jira] [Commented] (HBASE-5121) MajorCompaction may affect scan's correctness

2012-01-07 Thread chunhui shen (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182166#comment-13182166
]

chunhui shen commented on HBASE-5121:
-

@Ted
ok, I will make a new patch

MajorCompaction may affect scan's correctness
-

Key: HBASE-5121
URL: https://issues.apache.org/jira/browse/HBASE-5121
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.4
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
Fix For: 0.94.0, 0.92.1, 0.90.6

Attachments: 5121-trunk-combined.txt, 5121.90,
hbase-5121-testcase.patch, hbase-5121.patch, hbase-5121v2.patch

In our test, each row has KeyValues in two families.
But we found an infrequent problem when calling the scan's next() while a
major compaction happens concurrently.
In two consecutive client calls to scan.next():
1. The first time, next() returns a result where family A is null.
2. The second time, next() returns a result where family B is null.
The two next() results have the same row.
If there are more families, I think the scenario will be even stranger...
We found the reason is that StoreScanner.peek() changes after a
major compaction if there are delete-type KeyValues.
This change means the PriorityQueue<KeyValueScanner> in the RegionScanner's
heap is no longer guaranteed to be sorted.
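
The underlying heap hazard can be sketched with plain java.util.PriorityQueue.
This is an illustration only, not HBase code: HeapOrderDemo and FakeScanner
are hypothetical, and an int stands in for the KeyValue a scanner would
return from peek().

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical demo: a heap of scanners ordered by each scanner's current head key.
final class HeapOrderDemo {
    static final class FakeScanner {
        final String name;
        int peekKey; // stands in for what peek() would return
        FakeScanner(String name, int peekKey) {
            this.name = name;
            this.peekKey = peekKey;
        }
    }

    static PriorityQueue<FakeScanner> newHeap() {
        return new PriorityQueue<FakeScanner>(
            Comparator.comparingInt((FakeScanner s) -> s.peekKey));
    }

    /** Changes a scanner's head key safely: remove, mutate, then re-insert. */
    static void updateKey(PriorityQueue<FakeScanner> heap, FakeScanner s, int newKey) {
        heap.remove(s);      // take it out while the heap is still consistent
        s.peekKey = newKey;
        heap.add(s);         // re-insert so the heap re-establishes its order
    }
}
```

A PriorityQueue only reorders on add and remove. If an element's sort key
changes while it sits in the queue, as when peek() changes underneath the
heap, the queue keeps serving elements in the stale order, which matches the
symptom of families coming back in separate next() calls for the same row.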

[jira] [Updated] (HBASE-5088) A concurrency issue on SoftValueSortedMap


 [ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5088:
-

Attachment: (was: 5088-useMapInterfaces.txt)

 A concurrency issue on SoftValueSortedMap
 -

 Key: HBASE-5088
 URL: https://issues.apache.org/jira/browse/HBASE-5088
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0, 0.94.0

 Attachments: 5088-final.txt, 5088-final2.txt, 5088-final3.txt, 
 5088.generics.txt, HBase-5088-90.patch, HBase-5088-trunk.patch, 
 HBase5088-90-replaceSoftValueSortedMap.patch, 
 HBase5088-90-replaceTreeMap.patch, HBase5088-trunk-replaceTreeMap.patch, 
 HBase5088Reproduce.java, PerformanceTestResults.png


 SoftValueSortedMap is backed by a TreeMap. All the methods in this class are 
 synchronized. If we use these methods to add/delete elements, it's OK.
 But HConnectionManager#getCachedLocation uses headMap to get a view 
 of SoftValueSortedMap#internalMap. Once we operate 
 on this view map (e.g. add/delete) in other threads, a concurrency issue may 
 occur.
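
 One way to close this gap is to make every view share the parent's lock. The 
 sketch below is an assumption-laden illustration, not the committed patch: 
 LockedSortedMap is a hypothetical class, and soft references are left out.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical wrapper: the map and every headMap view guard the same
// TreeMap with the same lock object.
final class LockedSortedMap<K, V> {
    private final SortedMap<K, V> delegate;
    private final Object lock;

    LockedSortedMap() {
        this(new TreeMap<K, V>(), new Object());
    }

    private LockedSortedMap(SortedMap<K, V> delegate, Object lock) {
        this.delegate = delegate;
        this.lock = lock;
    }

    V put(K key, V value) {
        synchronized (lock) { return delegate.put(key, value); }
    }

    V remove(K key) {
        synchronized (lock) { return delegate.remove(key); }
    }

    V get(K key) {
        synchronized (lock) { return delegate.get(key); }
    }

    int size() {
        synchronized (lock) { return delegate.size(); }
    }

    /** Throws NoSuchElementException if the map is empty, like TreeMap. */
    K lastKey() {
        synchronized (lock) { return delegate.lastKey(); }
    }

    LockedSortedMap<K, V> headMap(K toKey) {
        synchronized (lock) {
            // The view wraps TreeMap's live headMap but reuses the parent's lock,
            // so operations on the view and on the parent exclude each other.
            return new LockedSortedMap<K, V>(delegate.headMap(toKey), lock);
        }
    }
}
```

 Synchronizing the wrapper's methods on "this" is not enough because the 
 TreeMap view returned by headMap reads the same tree without taking that 
 monitor. java.util.Collections.synchronizedSortedMap follows the same 
 shared-mutex design for its headMap, subMap and tailMap views, although 
 iteration over them still has to be synchronized by the caller.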


[jira] [Updated] (HBASE-5088) A concurrency issue on SoftValueSortedMap


 [ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5088:
-

Attachment: (was: 5088-syncObj.txt)

 A concurrency issue on SoftValueSortedMap
 -

 Key: HBASE-5088
 URL: https://issues.apache.org/jira/browse/HBASE-5088
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0, 0.94.0

 Attachments: 5088-final.txt, 5088-final2.txt, 5088-final3.txt, 
 5088.generics.txt, HBase-5088-90.patch, HBase-5088-trunk.patch, 
 HBase5088-90-replaceSoftValueSortedMap.patch, 
 HBase5088-90-replaceTreeMap.patch, HBase5088-trunk-replaceTreeMap.patch, 
 HBase5088Reproduce.java, PerformanceTestResults.png



[jira] [Updated] (HBASE-5088) A concurrency issue on SoftValueSortedMap


 [ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5088:
-

Attachment: (was: 5088-final.txt)

 A concurrency issue on SoftValueSortedMap
 -

 Key: HBASE-5088
 URL: https://issues.apache.org/jira/browse/HBASE-5088
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0, 0.94.0

 Attachments: 5088-final3.txt, HBase-5088-90.patch, 
 HBase-5088-trunk.patch, HBase5088-90-replaceSoftValueSortedMap.patch, 
 HBase5088-90-replaceTreeMap.patch, HBase5088-trunk-replaceTreeMap.patch, 
 HBase5088Reproduce.java, PerformanceTestResults.png



[jira] [Updated] (HBASE-5088) A concurrency issue on SoftValueSortedMap


 [ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5088:
-

Attachment: (was: 5088-final2.txt)

 A concurrency issue on SoftValueSortedMap
 -

 Key: HBASE-5088
 URL: https://issues.apache.org/jira/browse/HBASE-5088
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0, 0.94.0

 Attachments: 5088-final3.txt, HBase-5088-90.patch, 
 HBase-5088-trunk.patch, HBase5088-90-replaceSoftValueSortedMap.patch, 
 HBase5088-90-replaceTreeMap.patch, HBase5088-trunk-replaceTreeMap.patch, 
 HBase5088Reproduce.java, PerformanceTestResults.png



[jira] [Updated] (HBASE-5088) A concurrency issue on SoftValueSortedMap


 [ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5088:
-

Attachment: (was: 5088.generics.txt)

 A concurrency issue on SoftValueSortedMap
 -

 Key: HBASE-5088
 URL: https://issues.apache.org/jira/browse/HBASE-5088
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0, 0.94.0

 Attachments: 5088-final3.txt, HBase-5088-90.patch, 
 HBase-5088-trunk.patch, HBase5088-90-replaceSoftValueSortedMap.patch, 
 HBase5088-90-replaceTreeMap.patch, HBase5088-trunk-replaceTreeMap.patch, 
 HBase5088Reproduce.java, PerformanceTestResults.png



[jira] [Updated] (HBASE-5088) A concurrency issue on SoftValueSortedMap


 [ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5088:
-

Attachment: 5088-0.90.txt

This is what I committed to 0.90.

While I worked on that I noticed that I can get rid of more concrete uses of 
SoftValueSortedMap in HConnectionManager (in fact all uses except creation, 
which is nice). I'll do an addendum for this in 0.92 and trunk.

 A concurrency issue on SoftValueSortedMap
 -

 Key: HBASE-5088
 URL: https://issues.apache.org/jira/browse/HBASE-5088
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0, 0.94.0

 Attachments: 5088-0.90.txt, 5088-final3.txt, HBase-5088-90.patch, 
 HBase-5088-trunk.patch, HBase5088-90-replaceSoftValueSortedMap.patch, 
 HBase5088-90-replaceTreeMap.patch, HBase5088-trunk-replaceTreeMap.patch, 
 HBase5088Reproduce.java, PerformanceTestResults.png



[jira] [Updated] (HBASE-5088) A concurrency issue on SoftValueSortedMap


 [ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5088:
-

Fix Version/s: 0.90.6

 A concurrency issue on SoftValueSortedMap
 -

 Key: HBASE-5088
 URL: https://issues.apache.org/jira/browse/HBASE-5088
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0, 0.94.0, 0.90.6

 Attachments: 5088-0.90.txt, 5088-final3.txt, HBase-5088-90.patch, 
 HBase-5088-trunk.patch, HBase5088-90-replaceSoftValueSortedMap.patch, 
 HBase5088-90-replaceTreeMap.patch, HBase5088-trunk-replaceTreeMap.patch, 
 HBase5088Reproduce.java, PerformanceTestResults.png



[jira] [Updated] (HBASE-5088) A concurrency issue on SoftValueSortedMap


 [ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5088:
-

Attachment: 5088-0.92-trunk-addendum.txt

This is the addendum. Now the code is pretty clean w.r.t. using Map interfaces.

 A concurrency issue on SoftValueSortedMap
 -

 Key: HBASE-5088
 URL: https://issues.apache.org/jira/browse/HBASE-5088
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0, 0.94.0, 0.90.6

 Attachments: 5088-0.90.txt, 5088-0.92-trunk-addendum.txt, 
 5088-final3.txt, HBase-5088-90.patch, HBase-5088-trunk.patch, 
 HBase5088-90-replaceSoftValueSortedMap.patch, 
 HBase5088-90-replaceTreeMap.patch, HBase5088-trunk-replaceTreeMap.patch, 
 HBase5088Reproduce.java, PerformanceTestResults.png



[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap

2012-01-07 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182175#comment-13182175
 ] 

Lars Hofhansl commented on HBASE-5088:
--

Removed some of the attachments that were just confusing. The addendum also 
fixes minor formatting inconsistencies that I had introduced. All is good now.

 A concurrency issue on SoftValueSortedMap
 -

 Key: HBASE-5088
 URL: https://issues.apache.org/jira/browse/HBASE-5088
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0, 0.94.0, 0.90.6

 Attachments: 5088-0.90.txt, 5088-0.92-trunk-addendum.txt, 
 5088-final3.txt, HBase-5088-90.patch, HBase-5088-trunk.patch, 
 HBase5088-90-replaceSoftValueSortedMap.patch, 
 HBase5088-90-replaceTreeMap.patch, HBase5088-trunk-replaceTreeMap.patch, 
 HBase5088Reproduce.java, PerformanceTestResults.png



[jira] [Commented] (HBASE-5121) MajorCompaction may affect scan's correctness

2012-01-07 Thread chunhui shen (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182178#comment-13182178
 ] 

chunhui shen commented on HBASE-5121:
-

@Ted

If we change StoreScanner.next() to return an enum, we have to change all the 
implementations of InternalScanner.next(), and therefore KeyValueHeap.next()'s 
return type must also change to an enum. The logic that calls 
KeyValueHeap.next() would need to change as well.

I think that changes too much. Is it really needed?

 MajorCompaction may affect scan's correctness
 -

 Key: HBASE-5121
 URL: https://issues.apache.org/jira/browse/HBASE-5121
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.94.0, 0.92.1, 0.90.6

 Attachments: 5121-trunk-combined.txt, 5121.90, 
 hbase-5121-testcase.patch, hbase-5121.patch, hbase-5121v2.patch


 In our test, there are two families' keyvalue for one row.
 But we could find a infrequent problem when doing scan's next if 
 majorCompaction happens concurrently.
 In the client's two continuous doing scan.next():
 1.First time, scan's next returns the result where family A is null.
 2.Second time, scan's next returns the result where family B is null.
 The two next()'s result have the same row.
 If there are more families, I think the scenario will be more strange...
 We find the reason is that storescanner.peek() is changed after 
 majorCompaction if there are delete type KeyValue.
 This change causes the PriorityQueueKeyValueScanner of RegionScanner's heap 
 is not sure to be sorted.


[jira] [Commented] (HBASE-5052) The path where a dynamically loaded coprocessor jar is copied on the local file system depends on the region name (and implicitly, the start key)


[ 
https://issues.apache.org/jira/browse/HBASE-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182182#comment-13182182
 ] 

Hudson commented on HBASE-5052:
---

Integrated in HBase-0.92 #234 (See 
[https://builds.apache.org/job/HBase-0.92/234/])
HBASE-5052 The path where a dynamically loaded coprocessor jar is copied on 
the local file system depends on the region name (and implicitly, the start key)

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java


 The path where a dynamically loaded coprocessor jar is copied on the local 
 file system depends on the region name (and implicitly, the start key)
 -

 Key: HBASE-5052
 URL: https://issues.apache.org/jira/browse/HBASE-5052
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Andrei Dragomir
Assignee: Andrei Dragomir
 Fix For: 0.92.0

 Attachments: HBASE-5052.patch


 When loading a coprocessor from hdfs, the jar file gets copied to a path on 
 the local filesystem, which depends on the region name, and the region start 
 key. The name is cleaned, but not enough, so when you have filesystem 
 unfriendly characters (/?:, etc), the coprocessor is not loaded, and an error 
 is thrown
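
One way to make such a name safe for a local path is sketched below. This is an assumption for illustration, not the change in HBASE-5052.patch: the helper name sanitize and the escaping scheme (keep letters, digits and '-', hex-escape every other byte with a leading '_') are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class SafePathName {
    // Hypothetical helper: turns an arbitrary region name into a single
    // path component. Only [A-Za-z0-9-] is kept; every other byte,
    // including '_' itself, becomes "_" plus two hex digits, so two
    // different inputs never map to the same output.
    static String sanitize(String regionName) {
        StringBuilder sb = new StringBuilder();
        for (byte b : regionName.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xff);
            boolean safe = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || c == '-';
            if (safe) {
                sb.append(c);
            } else {
                sb.append('_').append(String.format("%02x", b & 0xff));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // '/', '?' and ':' are escaped as _2f, _3f and _3a.
        System.out.println(sanitize("a/b?c:"));
    }
}
```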


[jira] [Commented] (HBASE-5121) MajorCompaction may affect scan's correctness

[
https://issues.apache.org/jira/browse/HBASE-5121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182184#comment-13182184
]

Zhihong Yu commented on HBASE-5121:
---

@Chunhui:
I agree.

That's why I think creating a new exception is acceptable.
Maybe Todd or Stack has better idea.

MajorCompaction may affect scan's correctness
-

Attachments: 5121-trunk-combined.txt, 5121.90,
hbase-5121-testcase.patch, hbase-5121.patch, hbase-5121v2.patch

[jira] [Updated] (HBASE-5121) MajorCompaction may affect scan's correctness

[
https://issues.apache.org/jira/browse/HBASE-5121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Lars Hofhansl updated HBASE-5121:
-

Attachment: 5121-suggest.txt

I was looking at the patch a bit. Maybe there is a simpler solution:
You say that when this scenario happens the KV just vanishes.
What your logic in KeyValueHeap is essentially doing is to retry with the next
KV on the heap.
So, we can just tell the KeyValueHeap that there are more KVs in this case
(returning true for mayContainMoreRows).

With this your test passes.
(it is entirely possible that my reasoning is incorrect, and it just
accidentally lets the test pass).

MajorCompaction may affect scan's correctness
-

Attachments: 5121-suggest.txt, 5121-trunk-combined.txt, 5121.90,
hbase-5121-testcase.patch, hbase-5121.patch, hbase-5121v2.patch

[jira] [Commented] (HBASE-5121) MajorCompaction may affect scan's correctness

2012-01-07 Thread Hadoop QA (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182190#comment-13182190
]

Hadoop QA commented on HBASE-5121:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12509823/5121-suggest.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified tests.

-1 javadoc. The javadoc tool appears to have generated -151 warning
messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 79 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/700//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/700//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/700//console

This message is automatically generated.

MajorCompaction may affect scan's correctness
-

Attachments: 5121-suggest.txt, 5121-trunk-combined.txt, 5121.90,
hbase-5121-testcase.patch, hbase-5121.patch, hbase-5121v2.patch

[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap

2012-01-07 Thread ramkrishna.s.vasudevan (Updated) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182192#comment-13182192
 ] 

Hudson commented on HBASE-5088:
---

Integrated in HBase-0.92 #235 (See 
[https://builds.apache.org/job/HBase-0.92/235/])
HBASE-5088  addendum

larsh : 
Files : 
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/SoftValueSortedMap.java


 A concurrency issue on SoftValueSortedMap
 -

 Key: HBASE-5088
 URL: https://issues.apache.org/jira/browse/HBASE-5088
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0, 0.94.0, 0.90.6

 Attachments: 5088-0.90.txt, 5088-0.92-trunk-addendum.txt, 
 5088-final3.txt, HBase-5088-90.patch, HBase-5088-trunk.patch, 
 HBase5088-90-replaceSoftValueSortedMap.patch, 
 HBase5088-90-replaceTreeMap.patch, HBase5088-trunk-replaceTreeMap.patch, 
 HBase5088Reproduce.java, PerformanceTestResults.png


 SoftValueSortedMap is backed by a TreeMap. All the methods in this class are 
 synchronized. If we use these methods to add/delete elements, it's ok.
 But HConnectionManager#getCachedLocation uses headMap to get a view 
 from SoftValueSortedMap#internalMap. Once we operate 
 on this view map (like add/delete) in other threads, a concurrency issue may 
 occur.
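
The point about headMap returning a view can be shown with a small sketch. The class below is a hypothetical stand-in, not SoftValueSortedMap itself, and the copy-based variant is just one possible remedy, not necessarily what the committed patch does. TreeMap.headMap returns a live view backed by the map, so a caller holding the view reads the underlying TreeMap without holding the wrapper's lock.

```java
import java.util.SortedMap;
import java.util.TreeMap;

final class HeadMapViewDemo {
    private final TreeMap<Integer, String> internalMap = new TreeMap<>();

    synchronized void put(int key, String value) {
        internalMap.put(key, value);
    }

    // Unsafe: the lock is released on return, but the returned view keeps
    // reading (and could write) internalMap directly.
    synchronized SortedMap<Integer, String> headMapView(int toKey) {
        return internalMap.headMap(toKey);
    }

    // Safer variant (an assumption for illustration): copy while holding
    // the lock, so the caller gets an independent snapshot.
    synchronized SortedMap<Integer, String> headMapCopy(int toKey) {
        return new TreeMap<>(internalMap.headMap(toKey));
    }

    public static void main(String[] args) {
        HeadMapViewDemo map = new HeadMapViewDemo();
        map.put(1, "a");
        SortedMap<Integer, String> view = map.headMapView(10);
        SortedMap<Integer, String> copy = map.headMapCopy(10);
        map.put(2, "b");
        System.out.println(view.size()); // prints 2: the view follows the map
        System.out.println(copy.size()); // prints 1: the snapshot does not
    }
}
```

The copy costs time proportional to the head map's size on every call, which may matter on a hot lookup path.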


[jira] [Updated] (HBASE-5137) MasterFileSystem.splitLog() should abort even if waitOnSafeMode() throws IOException


 [ 
https://issues.apache.org/jira/browse/HBASE-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5137:
--

Attachment: HBASE-5137.patch

Patch for 0.90 addressing Ted's comment about adding braces. It does not handle 
the interrupted exception, though.
@Ted
Pls check if it is ok.

 MasterFileSystem.splitLog() should abort even if waitOnSafeMode() throws 
 IOException
 

 Key: HBASE-5137
 URL: https://issues.apache.org/jira/browse/HBASE-5137
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Attachments: 5137-trunk.txt, HBASE-5137.patch, HBASE-5137.patch


 I am not sure if this bug was already raised in JIRA.
 In our test cluster we had a scenario where the RS had gone down and 
 ServerShutdownHandler started with splitLog.
 But as HDFS was down, the waitOnSafeMode check threw an IOException.
 {code}
 try {
   // If FS is in safe mode, just wait till out of it.
   FSUtils.waitOnSafeMode(conf,
     conf.getInt(HConstants.THREAD_WAKE_FREQUENCY, 1000));
   splitter.splitLog();
 } catch (OrphanHLogAfterSplitException e) {
 {code}
 We catch the exception
 {code}
 } catch (IOException e) {
   checkFileSystem();
   LOG.error("Failed splitting " + logDir.toString(), e);
 }
 {code}
 So the HLog split itself did not happen. We encountered about 4 regions that 
 had recently been split on the crashed RS and were lost.
 Can we abort the Master in such scenarios? Pls suggest.
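
The behaviour asked for here could look roughly like the sketch below. It is self-contained and uses hypothetical stand-ins (the local Abortable and Step interfaces, and the splitLog signature) rather than the real MasterFileSystem code; it only illustrates the control flow of aborting when either the safe-mode wait or the split throws an IOException, instead of logging and continuing.

```java
import java.io.IOException;

final class SplitLogAbortSketch {
    // Local stand-in for whatever object can abort the master (hypothetical).
    interface Abortable {
        void abort(String why, Throwable cause);
    }

    // A step that may fail with an IOException, e.g. waiting on safe mode.
    interface Step {
        void run() throws IOException;
    }

    // Returns true if the split ran; aborts and returns false otherwise.
    static boolean splitLog(Step waitOnSafeMode, Step split, Abortable master) {
        try {
            waitOnSafeMode.run();
            split.run();
            return true;
        } catch (IOException e) {
            // Do not swallow the failure: unsplit logs mean lost edits.
            master.abort("Failed splitting logs", e);
            return false;
        }
    }

    public static void main(String[] args) {
        boolean ok = splitLog(
            () -> { throw new IOException("HDFS is down"); },
            () -> { },
            (why, cause) -> System.out.println("ABORT: " + why + ": " + cause.getMessage()));
        System.out.println("split done: " + ok);
    }
}
```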


[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap

2012-01-07 Thread ramkrishna.s.vasudevan (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182199#comment-13182199
 ] 

ramkrishna.s.vasudevan commented on HBASE-5088:
---

@Lars
Good on you for committing it to 0.90 :)..



[jira] [Created] (HBASE-5143) Fix config typo in pluggable load balancer factory

2012-01-07 Thread Harsh J (Created) (JIRA)

Fix config typo in pluggable load balancer factory
--

 Key: HBASE-5143
 URL: https://issues.apache.org/jira/browse/HBASE-5143
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Harsh J
Priority: Critical


HBASE-4240 made LoadBalancer pluggable.

The configuration key it loads seems to be wrongly named and carries a typo: 
hbase.maser.loadBalancer.class

It should rather be hbase.master.loadbalancer.class

Luckily 0.92 is not out yet and we should fix it asap, before folks start using 
it. Attaching patch.
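
Why the key name matters can be seen in a small sketch of a pluggable factory. Everything here except the two key strings quoted above is hypothetical (the Balancer interface, the two implementations, and the use of java.util.Properties in place of the HBase Configuration class). A factory that looks up one key silently falls back to the default when the user sets a differently spelled key.

```java
import java.util.Properties;

final class BalancerFactorySketch {
    // Key name as proposed in this issue.
    static final String KEY = "hbase.master.loadbalancer.class";

    interface Balancer {            // hypothetical plugin interface
        String name();
    }

    static final class DefaultBalancer implements Balancer {
        public String name() { return "default"; }
    }

    static final class CustomBalancer implements Balancer {
        public String name() { return "custom"; }
    }

    // Instantiates the class named under KEY, or the default if KEY is unset.
    static Balancer create(Properties conf) {
        String className = conf.getProperty(KEY, DefaultBalancer.class.getName());
        try {
            return Class.forName(className)
                .asSubclass(Balancer.class)
                .getDeclaredConstructor()
                .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load balancer " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // A misspelled key is simply ignored, so the default is used.
        conf.setProperty("hbase.maser.loadBalancer.class", CustomBalancer.class.getName());
        System.out.println(create(conf).name()); // prints "default"
        conf.setProperty(KEY, CustomBalancer.class.getName());
        System.out.println(create(conf).name()); // prints "custom"
    }
}
```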


[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap


[ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182204#comment-13182204
 ] 

Hudson commented on HBASE-5088:
---

Integrated in HBase-TRUNK-security #67 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/67/])
HBASE-5088  addendum

larsh : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/SoftValueSortedMap.java




[jira] [Commented] (HBASE-5052) The path where a dynamically loaded coprocessor jar is copied on the local file system depends on the region name (and implicitly, the start key)


[ 
https://issues.apache.org/jira/browse/HBASE-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182203#comment-13182203
 ] 

Hudson commented on HBASE-5052:
---

Integrated in HBase-TRUNK-security #67 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/67/])
HBASE-5052 The path where a dynamically loaded coprocessor jar is copied on 
the local file system depends on the region name (and implicitly, the start key)

stack : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java




[jira] [Commented] (HBASE-5141) Memory leak in MonitoredRPCHandlerImpl

2012-01-07 Thread ramkrishna.s.vasudevan (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182202#comment-13182202
 ] 

Hudson commented on HBASE-5141:
---

Integrated in HBase-TRUNK-security #67 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/67/])
HBASE-5141 Memory leak in MonitoredRPCHandlerImpl -- REDO
HBASE-5141 Memory leak in MonitoredRPCHandlerImpl -- REVERT. OVER-COMMITTED.  
REVERTING ALL SO CAN REDO COMMIT
HBASE-5141 Memory leak in MonitoredRPCHandlerImpl

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/monitoring/MonitoredRPCHandlerImpl.java

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/monitoring/MonitoredRPCHandlerImpl.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/monitoring/MonitoredRPCHandlerImpl.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java


 Memory leak in MonitoredRPCHandlerImpl
 --

 Key: HBASE-5141
 URL: https://issues.apache.org/jira/browse/HBASE-5141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-5141-v2.patch, HBASE-5141.patch, Screen Shot 
 2012-01-06 at 3.03.09 PM.png


 I got a pretty reliable way of OOME'ing my region servers. Using a big 
 payload (64MB in my case), a default heap and default number of handlers, 
 it doesn't take long before every MonitoredRPCHandlerImpl holds on to a 64MB 
 reference, and once a compaction kicks in it kills everything.
 The issue is that even after the RPC call is done, the packet still lives in 
 MonitoredRPCHandlerImpl.
 Will attach a screen shot of jprofiler's analysis in a moment.
 This is a blocker for 0.92.0; anyone using a high number of handlers and 
 biggish values will kill themselves.
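
The leak pattern described above can be sketched in a few lines. The class and its methods are hypothetical stand-ins, not MonitoredRPCHandlerImpl or the attached patch: a long-lived per-handler status object keeps a reference to the last request payload, so the payload stays reachable until the reference is cleared when the call completes.

```java
final class HandlerStatusSketch {
    // Reference to the payload of the call currently being served.
    private byte[] packet;

    void startCall(byte[] payload) {
        this.packet = payload;
    }

    // Clearing the reference here lets the payload be garbage collected
    // even though this status object lives as long as the handler thread.
    void finishCall() {
        this.packet = null;
    }

    boolean holdsPacket() {
        return packet != null;
    }

    public static void main(String[] args) {
        HandlerStatusSketch status = new HandlerStatusSketch();
        status.startCall(new byte[1024]);
        System.out.println(status.holdsPacket()); // prints true
        status.finishCall();
        System.out.println(status.holdsPacket()); // prints false
    }
}
```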


[jira] [Commented] (HBASE-4240) Allow Loadbalancer to be pluggable.

2012-01-07 Thread Harsh J (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182205#comment-13182205
 ] 

Harsh J commented on HBASE-4240:


Hi,

This introduced a badly named config. Please see HBASE-5143 for a fix.

 Allow Loadbalancer to be pluggable.
 ---

 Key: HBASE-4240
 URL: https://issues.apache.org/jira/browse/HBASE-4240
 Project: HBase
  Issue Type: New Feature
  Components: master
Affects Versions: 0.94.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 0.92.0

 Attachments: HBASE-4240-0.patch, HBASE-4240-1.patch, 
 HBASE-4240-2.patch, HBASE-4240-3.patch


 Everyone seems to want something different from a load balancer.  People want 
 low latency, simplicity, and total control.  It seems like at some point the 
 load balancer can't be all things to all people.  Something akin to 
 hadoop JT's pluggable scheduler seems like it will enable all solutions 
 without making the code much more complex. 


[jira] [Updated] (HBASE-5143) Fix config typo in pluggable load balancer factory

2012-01-07 Thread Harsh J (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HBASE-5143:
---

Attachment: HBASE-5143.patch

Patch fixes this typo and uses a better name (in sync with other names, no 
camel casing).

Please apply to 0.92 as well.

 Fix config typo in pluggable load balancer factory
 --

 Key: HBASE-5143
 URL: https://issues.apache.org/jira/browse/HBASE-5143
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Harsh J
Priority: Critical
 Attachments: HBASE-5143.patch


 HBASE-4240 made LoadBalancer pluggable.
 Configuration it loads seems to be wrongly named and carries a typo: 
 hbase.maser.loadBalancer.class
 Could rather be hbase.master.loadbalancer.class
 Luckily 0.92 is not out yet and we should fix it asap, before folks start 
 using it. Attaching patch.


[jira] [Commented] (HBASE-5143) Fix config typo in pluggable load balancer factory


[ 
https://issues.apache.org/jira/browse/HBASE-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182208#comment-13182208
 ] 

ramkrishna.s.vasudevan commented on HBASE-5143:
---

@Harsh
Thanks for the patch. +1



[jira] [Commented] (HBASE-5137) MasterFileSystem.splitLog() should abort even if waitOnSafeMode() throws IOException