[jira] [Created] (HBASE-11600) DataInputInputStream and DoubleOutputStream are no longer being used

2014-07-28 Thread Shengzhe Yao (JIRA)
Shengzhe Yao created HBASE-11600:


 Summary: DataInputInputStream and DoubleOutputStream are no longer 
being used 
 Key: HBASE-11600
 URL: https://issues.apache.org/jira/browse/HBASE-11600
 Project: HBase
  Issue Type: Task
  Components: io
Reporter: Shengzhe Yao
Assignee: Shengzhe Yao
Priority: Trivial
 Fix For: 2.0.0


hbase-server/src/main/java/org/apache/hadoop/hbase/io/DataInputInputStream.java 
and 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/DoubleOutputStream.java 
no longer seem to be used.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11600) DataInputInputStream and DoubleOutputStream are no longer being used

2014-07-28 Thread Shengzhe Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengzhe Yao updated HBASE-11600:
-

Attachment: HBase_11600_v1.patch

 DataInputInputStream and DoubleOutputStream are no longer being used 
 ---

 Key: HBASE-11600
 URL: https://issues.apache.org/jira/browse/HBASE-11600
 Project: HBase
  Issue Type: Task
  Components: io
Reporter: Shengzhe Yao
Assignee: Shengzhe Yao
Priority: Trivial
 Fix For: 2.0.0

 Attachments: HBase_11600_v1.patch




[jira] [Updated] (HBASE-11600) DataInputInputStream and DoubleOutputStream are no longer being used

2014-07-28 Thread Shengzhe Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengzhe Yao updated HBASE-11600:
-

Status: Patch Available  (was: Open)

 DataInputInputStream and DoubleOutputStream are no longer being used 
 ---

 Key: HBASE-11600
 URL: https://issues.apache.org/jira/browse/HBASE-11600
 Project: HBase
  Issue Type: Task
  Components: io
Reporter: Shengzhe Yao
Assignee: Shengzhe Yao
Priority: Trivial
 Fix For: 2.0.0

 Attachments: HBase_11600_v1.patch




[jira] [Commented] (HBASE-11447) Proposal for a generic transaction API for HBase

2014-07-10 Thread Shengzhe Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058002#comment-14058002
 ] 

Shengzhe Yao commented on HBASE-11447:
--

1. Could you please explain TransactionStatus in more detail? It would be 
very helpful to include some concrete examples of how each status is meant to 
be used. That would give transaction implementors a guideline for using the 
statuses correctly and help avoid misinterpretation.

2. As someone mentioned earlier, do we really need to expose TransactionStatus 
in the public API? Could it be implementation specific?

3. TransactionServiceClient declares all of its methods as static. Could we 
remove that keyword?

4. TransactionInterface contains several Transaction(...) methods, which look 
like constructors. It is better not to include them in the interface and to 
let each implementation decide how to initialize the object. Even if we really 
want to require certain properties, it is better to define setters and getters 
or to provide an abstract base implementation.

5. Should TransactionInterface.commit() and TransactionInterface.rollback() 
have return values?

6. TransactionInterface.setTransactionTimeout would be better with an extra 
TimeUnit parameter, so that users are not confused about the timeout 
resolution (the value could mean milliseconds, seconds, or even hours).

7. TransactionInterface.toByteArray: rather than relying only on this, maybe 
we can add one more method that accepts an OutputStream, which might allow a 
more efficient implementation of Transaction serialization.

8. It might be better to replace TransactionTable.setTransaction with 
TransactionTable.addTransaction, because I think that most of the time there 
will be multiple transactions in flight rather than a single one.
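
To make points 6 and 7 concrete, here is a minimal sketch of what such 
signatures could look like. Everything in it is an assumption made for 
illustration: TransactionApiSketch, ToyTransactionSketch, 
getTransactionTimeoutMillis and writeTo are hypothetical names, not taken from 
the proposal document.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.TimeUnit;

// Hypothetical interface; these names are illustrative, not the proposal's.
interface TransactionApiSketch {
  // Point 6: the unit travels with the value, so the resolution is explicit.
  void setTransactionTimeout(long timeout, TimeUnit unit);

  long getTransactionTimeoutMillis();

  // Point 7: stream-based serialization does not force an intermediate array.
  void writeTo(OutputStream out) throws IOException;
}

// Toy implementation whose serialized form is just its id as UTF-8 bytes.
final class ToyTransactionSketch implements TransactionApiSketch {
  private final String id;
  private long timeoutMillis;

  ToyTransactionSketch(String id) {
    this.id = id;
  }

  @Override
  public void setTransactionTimeout(long timeout, TimeUnit unit) {
    this.timeoutMillis = unit.toMillis(timeout);
  }

  @Override
  public long getTransactionTimeoutMillis() {
    return timeoutMillis;
  }

  @Override
  public void writeTo(OutputStream out) throws IOException {
    out.write(id.getBytes(StandardCharsets.UTF_8));
  }

  // toByteArray can still be offered as a convenience built on writeTo.
  byte[] toByteArray() throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    writeTo(buffer);
    return buffer.toByteArray();
  }
}
```

With this shape, a call such as setTransactionTimeout(2, TimeUnit.SECONDS) 
leaves no doubt about the resolution, and callers that already hold a stream 
can write to it directly.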

[~cuijianwei] recently announced a transaction implementation based on HBase, 
called Themis, which is inspired by Google's Percolator. I'd like to hear 
whether you ([~cuijianwei]) have comments on this transaction API.

 Proposal for a generic transaction API for HBase
 

 Key: HBASE-11447
 URL: https://issues.apache.org/jira/browse/HBASE-11447
 Project: HBase
  Issue Type: New Feature
  Components: Client
Affects Versions: 1.0.0
 Environment: Any.
Reporter: John de Roo
Priority: Minor
  Labels: features, newbie
 Fix For: 1.0.0

 Attachments: Proposal for a common transactional API for HBase 
 v0.3_1.pdf, Proposal for a common transactional API for HBase v0.4_1.pdf, Re 
 Proposal for a generic transaction API for HBase.htm


 HBase transaction management today is provided by a number of products, each 
 implementing a different API, each having different strengths.  The lack of a 
 common API for transactional interfaces means that applications need to be 
 coded to work with a specific Transaction Manager.  This proposal outlines an 
 API which, if implemented by the different Transaction Manager vendors, would 
 provide stability and choice to HBase application developers.  





[jira] [Commented] (HBASE-11447) Proposal for a generic transaction API for HBase

2014-07-10 Thread Shengzhe Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058335#comment-14058335
 ] 

Shengzhe Yao commented on HBASE-11447:
--

Oh, I think I am more interested in running multiple transactions in a single 
TransactionTable instance. Think about a user doing multi-row transactions: if 
two transactions operate on two different sets of rows, we could execute them 
together without creating another TransactionTable instance.

 Proposal for a generic transaction API for HBase
 

 Key: HBASE-11447
 URL: https://issues.apache.org/jira/browse/HBASE-11447
 Project: HBase
  Issue Type: New Feature
  Components: Client
Affects Versions: 1.0.0
 Environment: Any.
Reporter: John de Roo
Priority: Minor
  Labels: features, newbie
 Fix For: 1.0.0

 Attachments: Proposal for a common transactional API for HBase 
 v0.3_1.pdf, Proposal for a common transactional API for HBase v0.4_1.pdf, Re 
 Proposal for a generic transaction API for HBase.htm




[jira] [Created] (HBASE-11487) ScanResponse carries non-zero cellblock for CloseScanRequest (ScanRequest with close_scanner = true)

2014-07-09 Thread Shengzhe Yao (JIRA)
Shengzhe Yao created HBASE-11487:


 Summary: ScanResponse carries non-zero cellblock for 
CloseScanRequest (ScanRequest with close_scanner = true)
 Key: HBASE-11487
 URL: https://issues.apache.org/jira/browse/HBASE-11487
 Project: HBase
  Issue Type: Bug
  Components: IPC/RPC, regionserver
Affects Versions: 0.96.2, 0.99.0, 2.0.0
Reporter: Shengzhe Yao
Assignee: Shengzhe Yao
Priority: Minor
 Fix For: 2.0.0


After upgrading HBase from 0.94 to 0.96, we found that our asynchbase client 
keeps throwing errors during normal scans. It turns out these errors are caused 
by the Scanner.close call in asynchbase. Since asynchbase assumes that the 
ScanResponse for a CloseScannerRequest never carries any cellblocks, it throws 
an exception whenever that assumption is violated.

The asynchbase client (1.5.0) constructs a CloseScannerRequest in the 
following way:
    ScanRequest.newBuilder()
        .setScannerId(scanner_id)
        .setCloseScanner(true)
        .build();
Note that it does not set numOfRows, which makes sense: why would a close 
scanner request care about the number of rows to scan?

However, after tracing the CloseScannerRequest code path, it seems the issue 
is on the regionserver side. In RSRpcServices.scan, we always initialize the 
number of rows to scan to 1, and we do this even for a ScanRequest with 
close_scanner = true. As a result, the response to a CloseScannerRequest 
carries a cellBlock if the scan stops before the end row, which can happen in 
many normal scenarios.

There are two possible fixes: either we always set numOfRows on the asynchbase 
client side when constructing a CloseScannerRequest, or we fix the default 
value on the server side.

From the HBase client's point of view, it makes little sense for the server to 
send a cellBlock in response to a close scanner request unless the request 
explicitly asks for one.

We've made the change in our server code and the asynchbase client errors have 
gone away.
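
The server-side option can be sketched as below. This is only an 
illustration of the defaulting rule described above, not the attached patch: 
CloseScanRequestSketch is a hypothetical, simplified stand-in for the protobuf 
ScanRequest, and rowsToFetch is a made-up helper name.

```java
// Hypothetical, simplified stand-in for the protobuf ScanRequest; only the
// two fields discussed in this issue are modeled.
final class CloseScanRequestSketch {
  final boolean closeScanner;
  final Integer numberOfRows; // null models an unset optional field

  CloseScanRequestSketch(boolean closeScanner, Integer numberOfRows) {
    this.closeScanner = closeScanner;
    this.numberOfRows = numberOfRows;
  }

  // Number of rows the server should try to fetch for this request.
  static int rowsToFetch(CloseScanRequestSketch request) {
    if (request.numberOfRows != null) {
      // The caller asked for a specific number of rows; honor it.
      return request.numberOfRows;
    }
    // Unset: a close request fetches nothing, anything else keeps the
    // previous default of one row.
    return request.closeScanner ? 0 : 1;
  }
}
```

Under this rule a close request that leaves numOfRows unset yields zero rows, 
so there is nothing to put in a cellBlock, while a request that sets the field 
explicitly still gets what it asked for.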

In addition to this issue, I want to know whether we have any specification 
for our HBase RPC, for example a guarantee that if close_scanner = true in a 
ScanRequest and numOfRows is not set, the ScanResponse carries no cellBlock. 
Since we moved to protobuf and many fields are optional for compatibility 
reasons, such a specification would help people develop code that depends on 
the HBase RPC. 






[jira] [Updated] (HBASE-11487) ScanResponse carries non-zero cellblock for CloseScanRequest (ScanRequest with close_scanner = true)

2014-07-09 Thread Shengzhe Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengzhe Yao updated HBASE-11487:
-

Attachment: HBase_11487_v1.patch

 ScanResponse carries non-zero cellblock for CloseScanRequest (ScanRequest 
 with close_scanner = true)
 

 Key: HBASE-11487
 URL: https://issues.apache.org/jira/browse/HBASE-11487
 Project: HBase
  Issue Type: Bug
  Components: IPC/RPC, regionserver
Affects Versions: 0.96.2, 0.99.0, 2.0.0
Reporter: Shengzhe Yao
Assignee: Shengzhe Yao
Priority: Minor
 Fix For: 2.0.0

 Attachments: HBase_11487_v1.patch




[jira] [Updated] (HBASE-11487) ScanResponse carries non-zero cellblock for CloseScanRequest (ScanRequest with close_scanner = true)

2014-07-09 Thread Shengzhe Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengzhe Yao updated HBASE-11487:
-

Status: Patch Available  (was: Open)

 ScanResponse carries non-zero cellblock for CloseScanRequest (ScanRequest 
 with close_scanner = true)
 

 Key: HBASE-11487
 URL: https://issues.apache.org/jira/browse/HBASE-11487
 Project: HBase
  Issue Type: Bug
  Components: IPC/RPC, regionserver
Affects Versions: 0.96.2, 0.99.0, 2.0.0
Reporter: Shengzhe Yao
Assignee: Shengzhe Yao
Priority: Minor
 Fix For: 2.0.0

 Attachments: HBase_11487_v1.patch




[jira] [Updated] (HBASE-11487) ScanResponse carries non-zero cellblock for CloseScanRequest (ScanRequest with close_scanner = true)

2014-07-09 Thread Shengzhe Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengzhe Yao updated HBASE-11487:
-

Attachment: HBase_11487_v2.patch

 ScanResponse carries non-zero cellblock for CloseScanRequest (ScanRequest 
 with close_scanner = true)
 

 Key: HBASE-11487
 URL: https://issues.apache.org/jira/browse/HBASE-11487
 Project: HBase
  Issue Type: Bug
  Components: IPC/RPC, regionserver
Affects Versions: 0.96.2, 0.99.0, 2.0.0
Reporter: Shengzhe Yao
Assignee: Shengzhe Yao
Priority: Minor
 Fix For: 2.0.0

 Attachments: HBase_11487_v1.patch, HBase_11487_v2.patch




[jira] [Created] (HBASE-11433) LruBlockCache does not respect its configurable parameters

2014-06-27 Thread Shengzhe Yao (JIRA)
Shengzhe Yao created HBASE-11433:


 Summary: LruBlockCache does not respect its configurable parameters
 Key: HBASE-11433
 URL: https://issues.apache.org/jira/browse/HBASE-11433
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.98.3, 0.96.2, 0.99.0
Reporter: Shengzhe Yao
Assignee: Shengzhe Yao
 Fix For: 0.99.0


We upgraded our production cluster from 0.94.15 to 0.96.2 a few days ago and 
observed increased GC frequency and occasional full GCs (we never had a full 
GC before with G1 GC), which lead to the famous Juliet pause...

After digging into several HBase metrics, we found that the block cache uses 
much more memory in 0.96. This turns out to be due to the patch for 
HBASE-6312, which not only makes a few block cache parameters configurable but 
also changes their default values! Obviously, we needed to set these 
parameters back to their old values before considering reducing the block 
cache size or tuning our GC. However, we were surprised to see no change on 
the regionserver side; we still observed high block cache usage.

In the end, it seems that CacheConfig.java initializes LruBlockCache with the 
two-argument constructor LruBlockCache(long maxSize, long blockSize), which 
always uses the default values under the hood. We think this is a bug and that 
CacheConfig.java should always use the other constructor: LruBlockCache(long 
maxSize, long blockSize, boolean evictionThread, Configuration conf).

We made the change and tested it on one of our servers; it works and the GC 
problem has disappeared. Of course, we still have to review our HBase and GC 
configurations and find the best configuration under 0.96 for our application. 
But first, we feel the constructor misuse in CacheConfig.java should be fixed.  
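
The failure mode can be reproduced in miniature. The sketch below is a 
hypothetical model, not HBase code: BlockCacheCtorSketch, the key 
"example.cache.min.factor" and the 0.95f default are all made up for 
illustration, and a plain Map stands in for the Hadoop Configuration.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical miniature of a cache with a short and a config-aware
// constructor; none of these names come from HBase.
final class BlockCacheCtorSketch {
  static final String MIN_FACTOR_KEY = "example.cache.min.factor"; // made-up key
  static final float DEFAULT_MIN_FACTOR = 0.95f; // illustrative default

  final long maxSize;
  final float minFactor;

  // Short constructor: never sees the user's configuration, so it can only
  // fall back to the built-in default.
  BlockCacheCtorSketch(long maxSize) {
    this(maxSize, Collections.<String, Float>emptyMap());
  }

  // Config-aware constructor: a configured value overrides the default.
  BlockCacheCtorSketch(long maxSize, Map<String, Float> conf) {
    this.maxSize = maxSize;
    this.minFactor = conf.getOrDefault(MIN_FACTOR_KEY, DEFAULT_MIN_FACTOR);
  }
}
```

A caller that holds a configuration but builds the cache through the short 
constructor silently gets the default, which matches the symptom described 
above: changing the setting has no visible effect.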
  
  





[jira] [Updated] (HBASE-11433) LruBlockCache does not respect its configurable parameters

2014-06-27 Thread Shengzhe Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengzhe Yao updated HBASE-11433:
-

Priority: Minor  (was: Major)

 LruBlockCache does not respect its configurable parameters
 --

 Key: HBASE-11433
 URL: https://issues.apache.org/jira/browse/HBASE-11433
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.96.2, 0.99.0, 0.98.3
Reporter: Shengzhe Yao
Assignee: Shengzhe Yao
Priority: Minor
 Fix For: 0.99.0

 Attachments: HBase_11433_v1.patch




[jira] [Updated] (HBASE-11433) LruBlockCache does not respect its configurable parameters

2014-06-27 Thread Shengzhe Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengzhe Yao updated HBASE-11433:
-

Attachment: HBase_11433_v1.patch

 LruBlockCache does not respect its configurable parameters
 --

 Key: HBASE-11433
 URL: https://issues.apache.org/jira/browse/HBASE-11433
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.96.2, 0.99.0, 0.98.3
Reporter: Shengzhe Yao
Assignee: Shengzhe Yao
Priority: Minor
 Fix For: 0.99.0

 Attachments: HBase_11433_v1.patch




[jira] [Updated] (HBASE-11433) LruBlockCache does not respect its configurable parameters

2014-06-27 Thread Shengzhe Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengzhe Yao updated HBASE-11433:
-

Status: Patch Available  (was: Open)

 LruBlockCache does not respect its configurable parameters
 --

 Key: HBASE-11433
 URL: https://issues.apache.org/jira/browse/HBASE-11433
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.98.3, 0.96.2, 0.99.0
Reporter: Shengzhe Yao
Assignee: Shengzhe Yao
Priority: Minor
 Fix For: 0.99.0

 Attachments: HBase_11433_v1.patch




[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2014-05-21 Thread Shengzhe Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14005054#comment-14005054
 ] 

Shengzhe Yao commented on HBASE-7386:
-

Any updates? Are we going to merge the latest patch? This is a pretty cool 
feature; please let me know if there is anything I can help with :) 

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-bin-v2.patch, HBASE-7386-bin-v3.patch, 
 HBASE-7386-bin.patch, HBASE-7386-conf-v2.patch, HBASE-7386-conf-v3.patch, 
 HBASE-7386-conf.patch, HBASE-7386-src.patch, HBASE-7386-v0.patch, 
 supervisordconfigs-v0.patch


 There are a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 JIRA comments below:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running it via something like supervisor.d can solve these issues 
 if we provide the right support.





[jira] [Commented] (HBASE-11156) Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available

2014-05-16 Thread Shengzhe Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999309#comment-13999309
 ] 

Shengzhe Yao commented on HBASE-11156:
--

@Jiten Please send an email to u...@hbase.apache.org with a detailed 
description of your problem. Filing a JIRA for this type of issue is not 
appropriate; in addition, it might be a good idea to google your problem first.

  Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use 
 io.native.lib.available
 -

 Key: HBASE-11156
 URL: https://issues.apache.org/jira/browse/HBASE-11156
 Project: HBase
  Issue Type: Bug
  Components: Admin
Affects Versions: 0.96.1.1
Reporter: Jiten
Priority: Critical

 # hbase shell
 2014-05-13 14:51:41,582 INFO  [main] Configuration.deprecation: 
 hadoop.native.lib is deprecated. Instead, use io.native.lib.available
 HBase Shell; enter 'help<RETURN>' for list of supported commands.
 Type "exit<RETURN>" to leave the HBase Shell
 Version 0.96.1.1-cdh5.0.0, rUnknown, Thu Mar 27 23:01:59 PDT 2014.
 Not able to create table in Hbase. Please help





[jira] [Updated] (HBASE-11139) BoundedPriorityBlockingQueue#poll() should check the return value from awaitNanos()

2014-05-15 Thread Shengzhe Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengzhe Yao updated HBASE-11139:
-

Attachment: HBASE-11139-v2.patch

Add test case for BoundedPriorityBlockingQueue#poll and #poll(timeout, unit)

 BoundedPriorityBlockingQueue#poll() should check the return value from 
 awaitNanos()
 ---

 Key: HBASE-11139
 URL: https://issues.apache.org/jira/browse/HBASE-11139
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0
Reporter: Ted Yu
Priority: Minor
  Labels: noob
 Attachments: HBASE-11139-v1.patch, HBASE-11139-v2.patch


 nanos represents the timeout value.
 {code}
   while (queue.size() == 0 && nanos > 0) {
     notEmpty.awaitNanos(nanos);
   }
 {code}
 The return value from awaitNanos() should be checked; otherwise we may wait 
 for a period longer than the timeout value.
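
A minimal, self-contained sketch of a timed poll that feeds the return value 
of awaitNanos() back into the loop is shown here. It is not the attached 
patch: TimedPollSketch is a hypothetical class, and it uses a plain ArrayDeque 
rather than a bounded priority queue so that only the timeout handling is on 
display.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the queue in this issue; not the HBase class.
final class TimedPollSketch<E> {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition notEmpty = lock.newCondition();
  private final Queue<E> queue = new ArrayDeque<E>();

  void offer(E element) {
    lock.lock();
    try {
      queue.add(element);
      notEmpty.signal();
    } finally {
      lock.unlock();
    }
  }

  // Returns the head of the queue, or null if nothing arrives in time.
  E poll(long timeout, TimeUnit unit) throws InterruptedException {
    long nanos = unit.toNanos(timeout);
    lock.lockInterruptibly();
    try {
      while (queue.isEmpty() && nanos > 0) {
        // awaitNanos returns an estimate of the time still remaining;
        // reusing it bounds the total wait even across spurious or
        // unrelated wakeups.
        nanos = notEmpty.awaitNanos(nanos);
      }
      return queue.poll();
    } finally {
      lock.unlock();
    }
  }
}
```

Without the assignment, every wakeup that finds the queue still empty would 
restart the wait with the full original timeout.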





[jira] [Updated] (HBASE-11139) BoundedPriorityBlockingQueue#poll() should check the return value from awaitNanos()

2014-05-15 Thread Shengzhe Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengzhe Yao updated HBASE-11139:
-

Affects Version/s: 0.99.0
   Status: Patch Available  (was: Open)

Update nanos after every notEmpty.awaitNanos() call in 
BoundedPriorityBlockingQueue#poll() to respect the given timeout.
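
For illustration, the idea can be sketched roughly as follows. This is not the attached patch and not the actual HBase class: TimedPollSketch and its offer() method are hypothetical names, and a plain ArrayDeque stands in for the priority structure. The only point shown is that the value returned by Condition.awaitNanos() (an estimate of the time remaining) is assigned back to nanos, so repeated wake-ups cannot add up to more than the requested timeout.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a blocking queue with a timed poll().
class TimedPollSketch<E> {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition notEmpty = lock.newCondition();
  private final Queue<E> queue = new ArrayDeque<>();

  // Adds an element and wakes up one waiting consumer.
  public void offer(E e) {
    lock.lock();
    try {
      queue.add(e);
      notEmpty.signal();
    } finally {
      lock.unlock();
    }
  }

  // Waits up to the given timeout for an element; returns null on timeout.
  public E poll(long timeout, TimeUnit unit) throws InterruptedException {
    long nanos = unit.toNanos(timeout);
    lock.lockInterruptibly();
    try {
      while (queue.isEmpty() && nanos > 0) {
        // awaitNanos() returns an estimate of the remaining wait time
        // (zero or negative once the timeout has elapsed). Reassigning it
        // keeps the total wait bounded across spurious or early wake-ups.
        nanos = notEmpty.awaitNanos(nanos);
      }
      return queue.poll();
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    TimedPollSketch<String> q = new TimedPollSketch<>();
    System.out.println(q.poll(50, TimeUnit.MILLISECONDS)); // empty queue
    q.offer("x");
    System.out.println(q.poll(50, TimeUnit.MILLISECONDS)); // element present
  }
}
```

Without the reassignment, each wake-up that finds the queue still empty would restart the wait with the full original timeout.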

 BoundedPriorityBlockingQueue#poll() should check the return value from 
 awaitNanos()
 ---

 Key: HBASE-11139
 URL: https://issues.apache.org/jira/browse/HBASE-11139
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0
Reporter: Ted Yu
Priority: Minor
  Labels: noob
 Attachments: HBASE-11139-v1.patch


 nanos represents the timeout value.
 {code}
   while (queue.size() == 0 && nanos > 0) {
     notEmpty.awaitNanos(nanos);
   }
 {code}
 The return value from awaitNanos() should be checked - otherwise we may wait 
 for a period longer than the timeout value.





[jira] [Updated] (HBASE-11139) BoundedPriorityBlockingQueue#poll() should check the return value from awaitNanos()

2014-05-15 Thread Shengzhe Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengzhe Yao updated HBASE-11139:
-

Attachment: HBASE-11139-v1.patch

 BoundedPriorityBlockingQueue#poll() should check the return value from 
 awaitNanos()
 ---

 Key: HBASE-11139
 URL: https://issues.apache.org/jira/browse/HBASE-11139
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0
Reporter: Ted Yu
Priority: Minor
  Labels: noob
 Attachments: HBASE-11139-v1.patch


 nanos represents the timeout value.
 {code}
   while (queue.size() == 0 && nanos > 0) {
     notEmpty.awaitNanos(nanos);
   }
 {code}
 The return value from awaitNanos() should be checked - otherwise we may wait 
 for a period longer than the timeout value.





[jira] [Commented] (HBASE-11139) BoundedPriorityBlockingQueue#poll() should check the return value from awaitNanos()

2014-05-12 Thread Shengzhe Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994271#comment-13994271
 ] 

Shengzhe Yao commented on HBASE-11139:
--

This patch has nothing to do with the TestMultiParallel test case, which 
failed due to a timeout. TestMultiParallel passed on my laptop; please let me 
know if I need to resubmit the patch or rerun the test.

 BoundedPriorityBlockingQueue#poll() should check the return value from 
 awaitNanos()
 ---

 Key: HBASE-11139
 URL: https://issues.apache.org/jira/browse/HBASE-11139
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0
Reporter: Ted Yu
Assignee: Shengzhe Yao
Priority: Minor
  Labels: noob
 Attachments: HBASE-11139-v1.patch, HBASE-11139-v2.patch


 nanos represents the timeout value.
 {code}
   while (queue.size() == 0 && nanos > 0) {
     notEmpty.awaitNanos(nanos);
   }
 {code}
 The return value from awaitNanos() should be checked - otherwise we may wait 
 for a period longer than the timeout value.





[jira] [Updated] (HBASE-11139) BoundedPriorityBlockingQueue#poll() should check the return value from awaitNanos()

2014-05-11 Thread Shengzhe Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengzhe Yao updated HBASE-11139:
-

Attachment: HBASE-11139-v3.patch

Set the timeout to 1ms and replace poll() with poll(timeout) in the test 
case testPollInExecutor.

 BoundedPriorityBlockingQueue#poll() should check the return value from 
 awaitNanos()
 ---

 Key: HBASE-11139
 URL: https://issues.apache.org/jira/browse/HBASE-11139
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0
Reporter: Ted Yu
Assignee: Shengzhe Yao
Priority: Minor
  Labels: noob
 Attachments: HBASE-11139-v1.patch, HBASE-11139-v2.patch, 
 HBASE-11139-v3.patch


 nanos represents the timeout value.
 {code}
   while (queue.size() == 0 && nanos > 0) {
     notEmpty.awaitNanos(nanos);
   }
 {code}
 The return value from awaitNanos() should be checked - otherwise we may wait 
 for a period longer than the timeout value.





[jira] [Updated] (HBASE-11139) BoundedPriorityBlockingQueue#poll() should check the return value from awaitNanos()

2014-05-11 Thread Shengzhe Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengzhe Yao updated HBASE-11139:
-

Attachment: HBASE-11139-v4.patch

 just one more question, do you think that we need an 
 executor.awaitTermination() after shutdown()?

Good point. We should let the executor wait for some time after shutdown, in 
case an implementation error causes poll(timeout) to never return. Although 
we've set a timeout for this test case, adding executor.awaitTermination() 
makes the purpose of the test more explicit.
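
As a rough sketch of that test shape (not the code in the attached patch): the class and method names below are hypothetical, and java.util.concurrent.PriorityBlockingQueue is used as a stand-in for BoundedPriorityBlockingQueue. The task calls poll(timeout) on an empty queue, and awaitTermination() after shutdown() turns a poll that never returns into an explicit failure.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of a "poll inside an executor" check.
class PollInExecutorSketch {

  // Returns true if poll(timeout) on an empty queue came back with null
  // and the executor terminated within waitMs milliseconds.
  static boolean pollTerminates(long pollTimeoutMs, long waitMs)
      throws InterruptedException {
    // Stand-in queue; the real test would use the queue under test.
    BlockingQueue<String> queue = new PriorityBlockingQueue<>();
    AtomicReference<String> result = new AtomicReference<>("unset");

    ExecutorService executor = Executors.newSingleThreadExecutor();
    executor.execute(() -> {
      try {
        result.set(queue.poll(pollTimeoutMs, TimeUnit.MILLISECONDS));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });

    // shutdown() only stops new submissions; awaitTermination() is what
    // actually waits for the polling task to finish.
    executor.shutdown();
    boolean terminated = executor.awaitTermination(waitMs, TimeUnit.MILLISECONDS);
    if (!terminated) {
      executor.shutdownNow(); // interrupt a task stuck in poll()
    }
    return terminated && result.get() == null;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("poll returned in time: " + pollTerminates(1, 1000));
  }
}
```

In a JUnit test the boolean would typically become an assertTrue on the awaitTermination() result, in addition to any timeout declared on the test itself.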

 BoundedPriorityBlockingQueue#poll() should check the return value from 
 awaitNanos()
 ---

 Key: HBASE-11139
 URL: https://issues.apache.org/jira/browse/HBASE-11139
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0
Reporter: Ted Yu
Assignee: Shengzhe Yao
Priority: Minor
  Labels: noob
 Attachments: HBASE-11139-v1.patch, HBASE-11139-v2.patch, 
 HBASE-11139-v3.patch, HBASE-11139-v4.patch


 nanos represents the timeout value.
 {code}
   while (queue.size() == 0 && nanos > 0) {
     notEmpty.awaitNanos(nanos);
   }
 {code}
 The return value from awaitNanos() should be checked - otherwise we may wait 
 for a period longer than the timeout value.


