date:20120312

[
https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227362#comment-13227362
]

Phabricator commented on HBASE-5542:

sc has commented on the revision HBASE-5542 [jira] Unify
HRegion.mutateRowsWithLocks() and HRegion.processRow().

@Lars: Sorry. I also missed your previous reply in the JIRA.
The JIRA is kind of flooded by HadoopQA and Phabricator comments, which makes
it harder to find things.

For doMiniBatchPut, I think it would be better to do it in a separate
patch. That will be easier to review and will actually move faster.

About the time bound, I think it makes sense to leave it to the application
to decide.
What do you think about this, Ted?

REVISION DETAIL
https://reviews.facebook.net/D2217

Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()

Key: HBASE-5542
URL: https://issues.apache.org/jira/browse/HBASE-5542
Project: HBase
Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
Fix For: 0.96.0

Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch,
HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch,
HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch

mutateRowsWithLocks() does atomic mutations on multiple rows.
processRow() does atomic read-modify-writes on a single row.
It will be useful to generalize both and have a
processRowsWithLocks() that does atomic read-modify-writes on multiple rows.
This also helps reduce some redundancy in the code.
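
As a rough illustration of what an atomic read-modify-write across several rows involves, here is a minimal sketch. It is not the HRegion implementation: the class MultiRowStore, its in-memory maps, and the Consumer-based processor are all assumptions made for this example; only the method name processRowsWithLocks() comes from the description above. The sketch takes every row lock in sorted order, lets the processor edit a snapshot, and publishes the result only if the processor returns normally.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

// Hypothetical stand-in for a region; none of these names are HBase API.
class MultiRowStore {
    private final Map<String, Long> cells = new ConcurrentHashMap<>();
    private final Map<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();

    void put(String row, long value) { cells.put(row, value); }

    long get(String row) { return cells.getOrDefault(row, 0L); }

    void processRowsWithLocks(Collection<String> rows, Consumer<Map<String, Long>> processor) {
        List<ReentrantLock> held = new ArrayList<>();
        try {
            // Lock rows in sorted order so concurrent callers cannot deadlock.
            for (String row : new TreeSet<>(rows)) {
                ReentrantLock lock = rowLocks.computeIfAbsent(row, r -> new ReentrantLock());
                lock.lock();
                held.add(lock);
            }
            // Read phase: snapshot the locked rows.
            Map<String, Long> view = new TreeMap<>();
            for (String row : rows) {
                view.put(row, get(row));
            }
            // Modify phase: the processor edits the snapshot only.
            processor.accept(view);
            // Write phase: publish while all locks are still held.
            cells.putAll(view);
        } finally {
            // Release in reverse order; runs even if the processor throws.
            for (int i = held.size() - 1; i >= 0; i--) {
                held.get(i).unlock();
            }
        }
    }

    public static void main(String[] args) {
        MultiRowStore store = new MultiRowStore();
        store.put("alice", 100);
        store.put("bob", 20);
        // Move 30 from one row to another as a single multi-row operation.
        store.processRowsWithLocks(List.of("bob", "alice"), view -> {
            view.put("alice", view.get("alice") - 30);
            view.put("bob", view.get("bob") + 30);
        });
        System.out.println(store.get("alice") + " " + store.get("bob")); // prints "70 50"
    }
}
```

In this sketch the operation is atomic only with respect to other callers that go through the same row locks; a plain get() from another thread could still observe a partially published update.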

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-12 Thread nkeywal (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5399:
---

Attachment: nochange.patch

 Cut the link between the client and the zookeeper ensemble
 --

 Key: HBASE-5399
 URL: https://issues.apache.org/jira/browse/HBASE-5399
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 
 5399.v40.patch, 5399.v41.patch, 5399.v42.patch, 5399.v42.patch, 
 5399.v42.patch, 5399.v42.patch, 5399_inprogress.patch, 
 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 
 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 
 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 
 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 
 5399_inprogress.v9.patch, nochange.patch


 The link is often considered an issue, for various reasons. One of them
 is that there is a limit on the number of connections that ZK can manage.
 Stack also suggested removing the link to the master from HConnection.
 There are choices to be made considering the existing API (which we don't want
 to break).
 The first patches I will submit on hadoop-qa should not be committed: they
 are here to show the progress on the direction taken.
 ZooKeeper is used for:
 - public getter, to let the client do whatever he wants, and close ZooKeeper
 when closing the connection => we have to deprecate this but keep it.
 - read the master address to create a master => now done with a temporary
 zookeeper connection
 - read root location => now done with a temporary zookeeper connection, but
 questionable. Used in public function locateRegion. To be reworked.
 - read cluster id => now done once with a temporary zookeeper connection.
 - check if base node is available => now done once with a zookeeper
 connection given as a parameter
 - isTableDisabled/isTableAvailable => public functions, now done with a
 temporary zookeeper connection.
  - Called internally from HBaseAdmin and HTable
 - getCurrentNrHRS(): public function to get the number of region servers and
 create a pool of threads => now done with a temporary zookeeper connection
 Master is used for:
 - getMaster public getter, as for ZooKeeper => we have to deprecate this but
 keep it.
 - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin
 - getHTableDescriptor*: public functions offering access to the master. =>
 we could make them use a temporary master connection as well.
 Main points are:
 - hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a
 strongly coupled architecture ;-). This can be changed, but requires a lot of
 modifications in these classes (likely adding a class in the middle of the
 hierarchy, something like that). Anyway, a non-connected client will always be
 much slower, because it needs a TCP connection, and establishing a TCP
 connection is slow.
 - having a link between ZK and all the clients seems to make sense for some
 use cases. However, it won't scale if a TCP connection is required for every
 client
 - if we move the table descriptor part away from the client, we need to find
 a new place for it.
 - we will have the same issue with HBaseAdmin (for both ZK & Master); maybe we
 can put a timeout on the connection. That would make the whole system less
 deterministic, however.
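
The "temporary zookeeper connection" idea in the list above can be sketched as a connection that is opened for one read and closed immediately afterwards. This is only an illustration under stated assumptions: TempZkSession, withTemporarySession(), the placeholder read() method, and the znode path are all hypothetical and are neither the HBase nor the ZooKeeper client API.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-in for a ZooKeeper session; not the real client API.
class TempZkSession implements AutoCloseable {
    // Counts sessions that are currently open, to show none are left behind.
    static final AtomicInteger OPEN_SESSIONS = new AtomicInteger();

    TempZkSession() { OPEN_SESSIONS.incrementAndGet(); }

    // Placeholder read; a real session would fetch the znode's data.
    String read(String znode) { return "data@" + znode; }

    @Override
    public void close() { OPEN_SESSIONS.decrementAndGet(); }

    // Open a session, run one action, and always close the session afterwards.
    static <T> T withTemporarySession(Function<TempZkSession, T> action) {
        try (TempZkSession zk = new TempZkSession()) {
            return action.apply(zk);
        }
    }

    public static void main(String[] args) {
        // The znode path is illustrative only.
        String clusterId = withTemporarySession(zk -> zk.read("/example/cluster-id"));
        System.out.println(clusterId + " open=" + OPEN_SESSIONS.get());
        // prints "data@/example/cluster-id open=0"
    }
}
```

The trade-off is the one named in the main points: no session stays open per client, but every call pays the cost of establishing a new connection.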


[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-12 Thread nkeywal (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5399:
---

Status: Open  (was: Patch Available)


[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-12 Thread nkeywal (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5399:
---

Status: Patch Available  (was: Open)


[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble


[ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227436#comment-13227436
 ] 

Hadoop QA commented on HBASE-5399:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12517999/nochange.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -123 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 159 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.client.TestAdmin
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1162//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1162//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1162//console

This message is automatically generated.


[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-12 Thread nkeywal (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227455#comment-13227455
 ] 

nkeywal commented on HBASE-5399:


Ok, even an empty patch gets these errors from hadoop-qa, so the v42 can be 
committed imho.


[jira] [Updated] (HBASE-5563) HRegionInfo#compareTo add the comparison of regionId


 [ 
https://issues.apache.org/jira/browse/HBASE-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-5563:
--

Status: Patch Available  (was: Open)

 HRegionInfo#compareTo add the comparison of regionId
 

 Key: HBASE-5563
 URL: https://issues.apache.org/jira/browse/HBASE-5563
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-5563.patch


 When one region is assigned more than once, we can find two regions that have
 the same table name, the same startKey, and the same endKey but different
 regionIds, so these two regions are the same key in a TreeMap but different
 keys in a HashMap.
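
The TreeMap/HashMap mismatch described above can be reproduced with a small, self-contained sketch. RegionKey is a hypothetical, simplified class written for this example, not the real HRegionInfo; it assumes equals() and hashCode() use all four fields while compareTo() ignores regionId. The comparator WITH_REGION_ID shows one possible way to break the tie, which is an assumption about the shape of a fix rather than the content of the attached patch.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical, simplified region descriptor; not the real HRegionInfo.
final class RegionKey implements Comparable<RegionKey> {
    final String table;
    final String startKey;
    final String endKey;
    final long regionId;

    RegionKey(String table, String startKey, String endKey, long regionId) {
        this.table = table;
        this.startKey = startKey;
        this.endKey = endKey;
        this.regionId = regionId;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof RegionKey)) return false;
        RegionKey r = (RegionKey) o;
        return table.equals(r.table) && startKey.equals(r.startKey)
                && endKey.equals(r.endKey) && regionId == r.regionId;
    }

    @Override
    public int hashCode() {
        return Objects.hash(table, startKey, endKey, regionId);
    }

    // Ignores regionId, so it is inconsistent with equals().
    @Override
    public int compareTo(RegionKey o) {
        int c = table.compareTo(o.table);
        if (c == 0) c = startKey.compareTo(o.startKey);
        if (c == 0) c = endKey.compareTo(o.endKey);
        return c;
    }

    // One possible fix: break ties on regionId.
    static final Comparator<RegionKey> WITH_REGION_ID =
            Comparator.<RegionKey>naturalOrder().thenComparingLong(r -> r.regionId);

    public static void main(String[] args) {
        RegionKey first = new RegionKey("t1", "a", "m", 1L);
        RegionKey second = new RegionKey("t1", "a", "m", 2L);

        Map<RegionKey, String> tree = new TreeMap<>();
        Map<RegionKey, String> hash = new HashMap<>();
        Map<RegionKey, String> fixed = new TreeMap<>(WITH_REGION_ID);
        tree.put(first, "server-1");
        tree.put(second, "server-2");   // overwrites: compareTo() returns 0
        hash.put(first, "server-1");
        hash.put(second, "server-2");   // separate entry: equals() is false
        fixed.put(first, "server-1");
        fixed.put(second, "server-2");  // separate entry: regionId breaks the tie

        System.out.println(tree.size() + " " + hash.size() + " " + fixed.size()); // prints "1 2 2"
    }
}
```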


[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()


[ 
https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227569#comment-13227569
 ] 

Ted Yu commented on HBASE-5542:
---

I think we should keep the time bound; its default value can be large.


[jira] [Updated] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()

2012-03-12 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Ted Yu updated HBASE-5542:
--

Comment: was deleted

(was: -1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12517843/HBASE-5542.D2217.6.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 4 new or modified tests.

-1 javadoc. The javadoc tool appears to have generated -120 warning
messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 158 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:

org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1156//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1156//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1156//console

This message is automatically generated.)


[jira] [Updated] (HBASE-5520) Support reseek() at RegionScanner


 [ 
https://issues.apache.org/jira/browse/HBASE-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5520:
--

Attachment: HBASE-5520_3.patch

 Support reseek() at RegionScanner
 -

 Key: HBASE-5520
 URL: https://issues.apache.org/jira/browse/HBASE-5520
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Anoop Sam John
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.96.0

 Attachments: HBASE-5520_1.patch, HBASE-5520_2.patch, 
 HBASE-5520_3.patch


 reseek() is not currently supported at the RegionScanner level. We can
 add support for it.
 This issue was created following the discussion under HBASE-2038.


[jira] [Commented] (HBASE-5520) Support reseek() at RegionScanner

2012-03-12 Thread ramkrishna.s.vasudevan (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227581#comment-13227581
 ] 

ramkrishna.s.vasudevan commented on HBASE-5520:
---

@Lars and @Stack
Updated the API to take a row instead of a kv.
@Lars
Yes, SEEK_NEXT_USING_HINT is a definite gain, but this reseek will help avoid
the one problem that filters have to seek one more kv in cases where we already
know the row that we need to seek to.

Just a thought from Anoop and me. Please let us know your suggestions and comments.
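
To make the idea of a row-based reseek concrete, here is a minimal sketch of a forward-only scanner over sorted row keys. RowScanner and its methods are hypothetical names for this example and do not reflect the RegionScanner interface or the attached patch; the sketch assumes rows are kept in a sorted map and that reseek() never moves the scanner backwards.

```java
import java.util.Iterator;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical scanner over sorted row keys; not the HBase RegionScanner API.
class RowScanner {
    private final NavigableMap<String, String> rows;
    private Iterator<String> it;
    private String last; // last row returned by next(), or null before the first call

    RowScanner(NavigableMap<String, String> rows) {
        this.rows = rows;
        this.it = rows.navigableKeySet().iterator();
    }

    // Returns the next row key, or null when the scan is exhausted.
    String next() {
        if (!it.hasNext()) return null;
        last = it.next();
        return last;
    }

    // Jump forward to the first row >= the given row. A target at or behind
    // the current position leaves the scanner just after the last returned row.
    boolean reseek(String row) {
        boolean behind = last != null && row.compareTo(last) <= 0;
        it = (behind ? rows.tailMap(last, false) : rows.tailMap(row, true))
                .navigableKeySet().iterator();
        return it.hasNext();
    }

    public static void main(String[] args) {
        NavigableMap<String, String> rows = new TreeMap<>();
        for (String r : new String[] {"r1", "r2", "r3", "r4", "r5"}) {
            rows.put(r, "value-" + r);
        }
        RowScanner scanner = new RowScanner(rows);
        String first = scanner.next();   // r1
        scanner.reseek("r4");            // skip r2 and r3 without visiting them
        String jumped = scanner.next();  // r4
        System.out.println(first + " " + jumped); // prints "r1 r4"
    }
}
```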


[jira] [Commented] (HBASE-5520) Support reseek() at RegionScanner


[ 
https://issues.apache.org/jira/browse/HBASE-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227582#comment-13227582
 ] 

Hadoop QA commented on HBASE-5520:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12518028/HBASE-5520_3.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1164//console

This message is automatically generated.


[jira] [Commented] (HBASE-5563) HRegionInfo#compareTo add the comparison of regionId

2012-03-12 Thread ramkrishna.s.vasudevan (Issue Comment Edited) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227587#comment-13227587
]

Hadoop QA commented on HBASE-5563:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12517985/HBASE-5563.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

-1 javadoc. The javadoc tool appears to have generated -123 warning
messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 159 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.master.TestRollingRestart
org.apache.hadoop.hbase.io.hfile.TestLruBlockCache
org.apache.hadoop.hbase.coprocessor.TestClassLoading
org.apache.hadoop.hbase.client.TestAdmin
org.apache.hadoop.hbase.master.TestDistributedLogSplitting
org.apache.hadoop.hbase.client.TestShell
org.apache.hadoop.hbase.master.TestMasterRestartAfterDisablingTable
org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface
org.apache.hadoop.hbase.mapreduce.TestImportTsv
org.apache.hadoop.hbase.mapred.TestTableMapReduce
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1163//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1163//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1163//console

This message is automatically generated.

HRegionInfo#compareTo add the comparison of regionId

Key: HBASE-5563
URL: https://issues.apache.org/jira/browse/HBASE-5563
Project: HBase
Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
Attachments: HBASE-5563.patch

In the one region multi assigned case, we could find that two regions have
the same table name, same startKey, same endKey, and different regionId, so
these two regions are same in TreeMap but different in HashMap.
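
The TreeMap/HashMap mismatch described above can be reproduced with a small stand-alone sketch. RegionKey below is a hypothetical stand-in, not the real HRegionInfo: its compareTo() ignores regionId while equals()/hashCode() include it, which is the situation the description assumes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical stand-in for a region descriptor; not the real HRegionInfo.
final class RegionKey implements Comparable<RegionKey> {
  final String table;
  final String startKey;
  final long regionId;

  RegionKey(String table, String startKey, long regionId) {
    this.table = table;
    this.startKey = startKey;
    this.regionId = regionId;
  }

  // Ordering that ignores regionId, as in the case described above.
  @Override
  public int compareTo(RegionKey o) {
    int c = table.compareTo(o.table);
    return c != 0 ? c : startKey.compareTo(o.startKey);
  }

  // equals/hashCode include regionId, so hash-based maps tell the two apart.
  @Override
  public boolean equals(Object o) {
    if (!(o instanceof RegionKey)) {
      return false;
    }
    RegionKey r = (RegionKey) o;
    return table.equals(r.table) && startKey.equals(r.startKey)
        && regionId == r.regionId;
  }

  @Override
  public int hashCode() {
    return Objects.hash(table, startKey, regionId);
  }
}

final class CompareToDemo {
  // Returns {TreeMap size, HashMap size} after inserting two regions
  // that differ only in regionId.
  static int[] sizes() {
    RegionKey a = new RegionKey("t", "row0", 1L);
    RegionKey b = new RegionKey("t", "row0", 2L);
    Map<RegionKey, String> tree = new TreeMap<>();
    Map<RegionKey, String> hash = new HashMap<>();
    tree.put(a, "server1");
    tree.put(b, "server2");
    hash.put(a, "server1");
    hash.put(b, "server2");
    return new int[] {tree.size(), hash.size()};
  }

  public static void main(String[] args) {
    int[] s = sizes();
    System.out.println("TreeMap=" + s[0] + " HashMap=" + s[1]);
  }
}
```

In this sketch the TreeMap ends up with one entry and the HashMap with two. Adding regionId as a final tie-breaker in compareTo() would make the two views agree.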

[jira] [Created] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-12 Thread Laxman (Created) (JIRA)

Bulkload is discarding duplicate records


 Key: HBASE-5564
 URL: https://issues.apache.org/jira/browse/HBASE-5564
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
 Environment: HBase 0.92
Reporter: Laxman
Assignee: Laxman


Duplicate records are getting discarded when they exist in the same input 
file, and more specifically when they exist in the same split.
Duplicate records are considered if the records are from different splits.

Version under test: HBase 0.92


[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-12 Thread Laxman (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227597#comment-13227597
 ] 

Laxman commented on HBASE-5564:
---

I think this is a bug and not intentional behavior.

Usage of TreeSet in the below code snippet is causing the issue.

PutSortReducer.reduce()
==
  TreeSet<KeyValue> map = new TreeSet<KeyValue>(KeyValue.COMPARATOR);
  long curSize = 0;
  // stop at the end or the RAM threshold
  while (iter.hasNext() && curSize < threshold) {
    Put p = iter.next();
    for (List<KeyValue> kvs : p.getFamilyMap().values()) {
      for (KeyValue kv : kvs) {
        map.add(kv);
        curSize += kv.getLength();
      }
    }
  }

Changing this back to a List and then sorting explicitly will solve the issue.
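
A minimal stand-alone sketch of the collection behaviour being discussed, using a hypothetical Cell class and comparator rather than the real KeyValue and KeyValue.COMPARATOR. It assumes the ordering looks only at the key (row and timestamp here) and not at the value, so two cells that differ only in value compare as equal. It shows only what the collections keep, not what finally lands in the table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

final class DuplicateCellDemo {
  // Hypothetical, simplified cell; the value is not part of the ordering.
  static final class Cell {
    final String row;
    final long ts;
    final String value;

    Cell(String row, long ts, String value) {
      this.row = row;
      this.ts = ts;
      this.value = value;
    }
  }

  // Orders by row, then newest timestamp first. Cells that differ only
  // in value compare as equal.
  static final Comparator<Cell> BY_KEY = (a, b) -> {
    int c = a.row.compareTo(b.row);
    return c != 0 ? c : Long.compare(b.ts, a.ts);
  };

  static List<Cell> sample() {
    return Arrays.asList(
        new Cell("r1", 100L, "a"),
        new Cell("r1", 100L, "b"), // same key as the first cell
        new Cell("r2", 100L, "c"));
  }

  // A TreeSet silently drops an element that compares equal to one
  // it already holds.
  static int countViaTreeSet(List<Cell> cells) {
    TreeSet<Cell> set = new TreeSet<>(BY_KEY);
    set.addAll(cells);
    return set.size();
  }

  // A List keeps every element; sorting only reorders them.
  static int countViaSortedList(List<Cell> cells) {
    List<Cell> sorted = new ArrayList<>(cells);
    sorted.sort(BY_KEY);
    return sorted.size();
  }

  public static void main(String[] args) {
    System.out.println("TreeSet keeps " + countViaTreeSet(sample()));
    System.out.println("List keeps " + countViaSortedList(sample()));
  }
}
```

With the three sample cells, the TreeSet keeps two and the sorted List keeps three.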

 Bulkload is discarding duplicate records
 

 Key: HBASE-5564
 URL: https://issues.apache.org/jira/browse/HBASE-5564
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
 Environment: HBase 0.92
Reporter: Laxman
Assignee: Laxman
  Labels: bulkloader

 Duplicate records are getting discarded when they exist in the same input 
 file, and more specifically when they exist in the same split.
 Duplicate records are considered if the records are from different splits.
 Version under test: HBase 0.92


[jira] [Issue Comment Edited] (HBASE-5564) Bulkload is discarding duplicate records


[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227597#comment-13227597
 ] 

ramkrishna.s.vasudevan edited comment on HBASE-5564 at 3/12/12 3:21 PM:


I think this is a bug and not intentional behavior.

Usage of TreeSet in the below code snippet is causing the issue.

PutSortReducer.reduce()
==
{code}
  TreeSet<KeyValue> map = new TreeSet<KeyValue>(KeyValue.COMPARATOR);
  long curSize = 0;
  // stop at the end or the RAM threshold
  while (iter.hasNext() && curSize < threshold) {
    Put p = iter.next();
    for (List<KeyValue> kvs : p.getFamilyMap().values()) {
      for (KeyValue kv : kvs) {
        map.add(kv);
        curSize += kv.getLength();
      }
    }
  }
{code}
Changing this back to a List and then sorting explicitly will solve the issue.

  was (Author: lakshman):
I think this is a bug and not intentional behavior.

Usage of TreeSet in the below code snippet is causing the issue.

PutSortReducer.reduce()
==
  TreeSet<KeyValue> map = new TreeSet<KeyValue>(KeyValue.COMPARATOR);
  long curSize = 0;
  // stop at the end or the RAM threshold
  while (iter.hasNext() && curSize < threshold) {
    Put p = iter.next();
    for (List<KeyValue> kvs : p.getFamilyMap().values()) {
      for (KeyValue kv : kvs) {
        map.add(kv);
        curSize += kv.getLength();
      }
    }
  }

Changing this back to a List and then sorting explicitly will solve the issue.
  
 Bulkload is discarding duplicate records
 

 Key: HBASE-5564
 URL: https://issues.apache.org/jira/browse/HBASE-5564
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
 Environment: HBase 0.92
Reporter: Laxman
Assignee: Laxman
  Labels: bulkloader

 Duplicate records are getting discarded when they exist in the same input 
 file, and more specifically when they exist in the same split.
 Duplicate records are considered if the records are from different splits.
 Version under test: HBase 0.92


[jira] [Updated] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK

2012-03-12 Thread Ashutosh Jindal (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Jindal updated HBASE-5206:
---

Attachment: 5206_trunk_latest_1.patch

 Port HBASE-5155 to 0.92 and TRUNK
 -

 Key: HBASE-5206
 URL: https://issues.apache.org/jira/browse/HBASE-5206
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.2, 0.96.0
Reporter: Zhihong Yu
 Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 
 5206_trunk_1.patch, 5206_trunk_latest_1.patch


 This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should 
 not happen parallely leading to recreation of regions that were deleted) to 
 0.92 and TRUNK


[jira] [Updated] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK

2012-03-12 Thread Ashutosh Jindal (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Jindal updated HBASE-5206:
---

Attachment: 5206_92_latest_1.patch

 Port HBASE-5155 to 0.92 and TRUNK
 -

 Key: HBASE-5206
 URL: https://issues.apache.org/jira/browse/HBASE-5206
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.2, 0.96.0
Reporter: Zhihong Yu
 Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 
 5206_trunk_1.patch, 5206_trunk_latest_1.patch


 This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should 
 not happen parallely leading to recreation of regions that were deleted) to 
 0.92 and TRUNK


[jira] [Commented] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK

2012-03-12 Thread Ashutosh Jindal (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227636#comment-13227636
 ] 

Ashutosh Jindal commented on HBASE-5206:


Updated the formatting issues. 
bq. Can we remember the tables we have enabled so that the number of calls to 
setEnabledTable() is reduced?
We need to go to ZooKeeper only the first time a table is enabled. For all 
later calls that check the table's enabled state, we use the states cached 
in ZKTable.java.
Please correct me if I am wrong.
Please provide your suggestions/comments.

 Port HBASE-5155 to 0.92 and TRUNK
 -

 Key: HBASE-5206
 URL: https://issues.apache.org/jira/browse/HBASE-5206
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.2, 0.96.0
Reporter: Zhihong Yu
 Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 
 5206_trunk_1.patch, 5206_trunk_latest_1.patch


 This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should 
 not happen parallely leading to recreation of regions that were deleted) to 
 0.92 and TRUNK


[jira] [Updated] (HBASE-5563) HRegionInfo#compareTo add the comparison of regionId


 [ 
https://issues.apache.org/jira/browse/HBASE-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-5563:
--

Status: Open  (was: Patch Available)

 HRegionInfo#compareTo add the comparison of regionId
 

 Key: HBASE-5563
 URL: https://issues.apache.org/jira/browse/HBASE-5563
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-5563.patch


 In the one region multi assigned case,  we could find that two regions have 
 the same table name, same startKey, same endKey, and different regionId, so 
 these two regions are same in TreeMap but different in HashMap.


[jira] [Commented] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK


[ 
https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227661#comment-13227661
 ] 

Ted Yu commented on HBASE-5206:
---

Patch v2 looks good.
Minor comments:
{code}
 // Call to undisableTable does this. TODO: Make a more formal purge table.
-am.getZKTable().setEnabledTable(Bytes.toString(tableName));
+am.getZKTable().setDeletedTable(Bytes.toString(tableName));
{code}
I don't see undisableTable. Can we remove the comment above ?
{code}
+  } else if (!this.zkTable
+  .isEnabledTable(region.getTableNameAsString())) {
+setEnabledTable(region);
{code}
setEnabledTable(HRegionInfo hri) already calls zkTable.isEnabledTable(). It 
seems we can call setEnabledTable(region) directly above.

 Port HBASE-5155 to 0.92 and TRUNK
 -

 Key: HBASE-5206
 URL: https://issues.apache.org/jira/browse/HBASE-5206
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.2, 0.96.0
Reporter: Zhihong Yu
 Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 
 5206_trunk_1.patch, 5206_trunk_latest_1.patch


 This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should 
 not happen parallely leading to recreation of regions that were deleted) to 
 0.92 and TRUNK


[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()


[ 
https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227665#comment-13227665
 ] 

Phabricator commented on HBASE-5542:


lhofhansl has commented on the revision HBASE-5542 [jira] Unify 
HRegion.mutateRowsWithLocks() and HRegion.processRow().

  @Scott: Yes, let's use another jira for doMiniBatchPut.
  For the timebound logic, at the very least there has to be an option to not 
do that for mutateRowsWithLocks, as spawning another thread for that would be 
too expensive.
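
One way to read the "option to not do that" point is sketched below. This is not the code from the patch under review: the class name, the run() helper and the "timeoutMs <= 0 means unbounded" convention are all assumptions made for illustration. With no time bound the work runs inline on the calling thread, and only a positive bound pays for the hand-off to another thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class OptionalTimeBound {
  // Hypothetical helper: runs the task inline when timeoutMs <= 0,
  // otherwise submits it to the pool and waits at most timeoutMs.
  static <T> T run(Callable<T> task, long timeoutMs, ExecutorService pool)
      throws Exception {
    if (timeoutMs <= 0) {
      return task.call(); // no extra thread, no time bound
    }
    Future<T> f = pool.submit(task);
    try {
      return f.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      f.cancel(true); // interrupts the worker; the task must cooperate
      throw e;
    } catch (ExecutionException e) {
      Throwable cause = e.getCause();
      if (cause instanceof Exception) {
        throw (Exception) cause; // surface the task's own exception
      }
      throw e;
    }
  }

  // Demo helper: reports whether the task executed on the calling thread.
  static boolean ranOnCallerThread(long timeoutMs) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Thread worker = run(Thread::currentThread, timeoutMs, pool);
      return worker == Thread.currentThread();
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("unbounded runs inline: " + ranOnCallerThread(0));
    System.out.println("bounded runs inline: " + ranOnCallerThread(1000));
  }
}
```

A caller such as a plain multi-row mutation could pass 0 and avoid the thread hand-off entirely, while a long-running processor could pass a positive bound.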

REVISION DETAIL
  https://reviews.facebook.net/D2217


 Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
 

 Key: HBASE-5542
 URL: https://issues.apache.org/jira/browse/HBASE-5542
 Project: HBase
  Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
 Fix For: 0.96.0

 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, 
 HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, 
 HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch


 mutateRowsWithLocks() does atomic mutations on multiple rows.
 processRow() does atomic read-modify-writes on a single row.
 It will be useful to generalize both and have a
 processRowsWithLocks() that does atomic read-modify-writes on multiple rows.
 This also helps reduce some redundancy in the code.


[jira] [Commented] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK


[ 
https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227669#comment-13227669
 ] 

Ted Yu commented on HBASE-5206:
---

{code}
+  String errorMsg = "Unable to ensure that the table " + tableName
+  + "will be" + " enabled because of a ZooKeeper issue";
{code}
A space should be added between " and will.

 Port HBASE-5155 to 0.92 and TRUNK
 -

 Key: HBASE-5206
 URL: https://issues.apache.org/jira/browse/HBASE-5206
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.2, 0.96.0
Reporter: Zhihong Yu
 Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 
 5206_trunk_1.patch, 5206_trunk_latest_1.patch


 This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should 
 not happen parallely leading to recreation of regions that were deleted) to 
 0.92 and TRUNK


[jira] [Commented] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK

[
https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227674#comment-13227674
]

Hadoop QA commented on HBASE-5206:
--

-1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12518039/5206_trunk_latest_1.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 15 new or modified tests.

-1 javadoc. The javadoc tool appears to have generated -123 warning
messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 159 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.TestDrainingServer
org.apache.hadoop.hbase.mapreduce.TestImportTsv
org.apache.hadoop.hbase.mapred.TestTableMapReduce
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1165//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1165//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1165//console

This message is automatically generated.

Port HBASE-5155 to 0.92 and TRUNK
-

Key: HBASE-5206
URL: https://issues.apache.org/jira/browse/HBASE-5206
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.2, 0.96.0
Reporter: Zhihong Yu
Attachments: 5206_92_1.patch, 5206_92_latest_1.patch,
5206_trunk_1.patch, 5206_trunk_latest_1.patch

This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should
not happen parallely leading to recreation of regions that were deleted) to
0.92 and TRUNK

[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-12 Thread Laxman (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227678#comment-13227678
]

Laxman commented on HBASE-5564:
---

I tested again with the proposed fix ("Changing this back to List and then
sort explicitly will solve the issue").

The same problem still persists, which makes this issue a bit more complicated.
I think using the same timestamp for all records in a split is causing the issue.

Currently in code,
a) If configured, we are using static timestamp for all mappers.
b) If not configured, we are using current system time generated for each split.

TsvImporterMapper.doSetup

{code}
ts = conf.getLong(ImportTsv.TIMESTAMP_CONF_KEY, System.currentTimeMillis());
{code}

Should we think of an approach to generate a unique sequence number and use it
as a timestamp?

Any other thoughts?

Bulkload is discarding duplicate records

Key: HBASE-5564
URL: https://issues.apache.org/jira/browse/HBASE-5564
Project: HBase
Issue Type: Bug
Components: mapreduce
Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
Environment: HBase 0.92
Reporter: Laxman
Assignee: Laxman
Labels: bulkloader

Duplicate records are getting discarded when they exist in the same input
file, and more specifically when they exist in the same split.
Duplicate records are considered if the records are from different splits.
Version under test: HBase 0.92

[jira] [Commented] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK


[ 
https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227688#comment-13227688
 ] 

Ted Yu commented on HBASE-5206:
---

The following error is reproducible on MacBook (patch for 0.92):
{code}
Tests in error: 
  org.apache.hadoop.hbase.TestDrainingServer: 
org.apache.hadoop.hbase.TableNotEnabledException: t
{code}

 Port HBASE-5155 to 0.92 and TRUNK
 -

 Key: HBASE-5206
 URL: https://issues.apache.org/jira/browse/HBASE-5206
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.2, 0.96.0
Reporter: Zhihong Yu
 Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 
 5206_trunk_1.patch, 5206_trunk_latest_1.patch


 This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should 
 not happen parallely leading to recreation of regions that were deleted) to 
 0.92 and TRUNK


[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-12 Thread Jesse Yates (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227696#comment-13227696
 ] 

Jesse Yates commented on HBASE-5564:


Hmm, I think you're right that this is a problem. It would be totally 
reasonable to change 
{code}
   KeyValue kv = new KeyValue(
lineBytes, parsed.getRowKeyOffset(), parsed.getRowKeyLength(),
parser.getFamily(i), 0, parser.getFamily(i).length,
parser.getQualifier(i), 0, parser.getQualifier(i).length,
ts,
KeyValue.Type.Put,
lineBytes, parsed.getColumnOffset(i), parsed.getColumnLength(i));
{code}

to use something like: {code}ts++{code}

The question is, if you have a TSV file with the same row key, which value 
should be considered the most recent version? Should any of them? Maybe that 
is actually a problem, and we want a warning/error when that occurs.
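
As a rough illustration of the ts++ idea, here is a stand-alone sketch. The class and method names are hypothetical and this is not ImportTsv code. Each record read from a split gets the next value of a counter seeded with the base timestamp, so no two records in that split share a timestamp.

```java
final class PerRecordTimestamp {
  private long next;

  PerRecordTimestamp(long base) {
    this.next = base;
  }

  // Hands out base, base + 1, base + 2, ... for successive records.
  long nextTs() {
    return next++;
  }

  // Assigns timestamps to `records` records read in file order.
  static long[] assign(long base, int records) {
    PerRecordTimestamp gen = new PerRecordTimestamp(base);
    long[] out = new long[records];
    for (int i = 0; i < records; i++) {
      out[i] = gen.nextTs();
    }
    return out;
  }

  // Index of the record that would count as the newest version,
  // i.e. the one with the largest timestamp.
  static int newestIndex(long[] timestamps) {
    int best = 0;
    for (int i = 1; i < timestamps.length; i++) {
      if (timestamps[i] > timestamps[best]) {
        best = i;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    long[] ts = assign(1000L, 3);
    System.out.println("newest record index: " + newestIndex(ts));
  }
}
```

With an incrementing counter the last line in the file becomes the newest version; decrementing instead would make the first line win. Two caveats under these assumptions: the timestamp no longer identifies the load as a whole, and splits whose base values lie close together could still hand out the same timestamp.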

 Bulkload is discarding duplicate records
 

 Key: HBASE-5564
 URL: https://issues.apache.org/jira/browse/HBASE-5564
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
 Environment: HBase 0.92
Reporter: Laxman
Assignee: Laxman
  Labels: bulkloader

 Duplicate records are getting discarded when they exist in the same input 
 file, and more specifically when they exist in the same split.
 Duplicate records are considered if the records are from different splits.
 Version under test: HBase 0.92


[jira] [Created] (HBASE-5565) Refactoring doMiniBatchPut()

2012-03-12 Thread Scott Chen (Created) (JIRA)

Refactoring doMiniBatchPut()


 Key: HBASE-5565
 URL: https://issues.apache.org/jira/browse/HBASE-5565
 Project: HBase
  Issue Type: Improvement
Reporter: Scott Chen
Assignee: Lars Hofhansl


HBASE-5542 introduces a RowProcessor interface that can handle multi-row 
transactions.
It will make the code cleaner if doMiniBatchPut() can be covered by 
RowProcessor.


[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()

[
https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227724#comment-13227724
]

Phabricator commented on HBASE-5542:

sc has commented on the revision HBASE-5542 [jira] Unify
HRegion.mutateRowsWithLocks() and HRegion.processRow().

@lhofhansl: I have created a jira for you :)
https://issues.apache.org/jira/browse/HBASE-5565
Thank you for volunteering to work on it.

I will make the time bound optional. Will update the patch soon.

REVISION DETAIL
https://reviews.facebook.net/D2217

Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()

Key: HBASE-5542
URL: https://issues.apache.org/jira/browse/HBASE-5542
Project: HBase
Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
Fix For: 0.96.0

Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch,
HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch,
HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch

[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()


[ 
https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227728#comment-13227728
 ] 

Phabricator commented on HBASE-5542:


sc has commented on the revision HBASE-5542 [jira] Unify 
HRegion.mutateRowsWithLocks() and HRegion.processRow().

  @Ted: I will also make the default value of the time bound large.

REVISION DETAIL
  https://reviews.facebook.net/D2217


 Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
 

 Key: HBASE-5542
 URL: https://issues.apache.org/jira/browse/HBASE-5542
 Project: HBase
  Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
 Fix For: 0.96.0

 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, 
 HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, 
 HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch


 mutateRowsWithLocks() does atomic mutations on multiple rows.
 processRow() does atomic read-modify-writes on a single row.
 It will be useful to generalize both and have a
 processRowsWithLocks() that does atomic read-modify-writes on multiple rows.
 This also helps reduce some redundancy in the code.


[jira] [Assigned] (HBASE-5565) Refactoring doMiniBatchPut()

2012-03-12 Thread Scott Chen (Assigned) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen reassigned HBASE-5565:
-

Assignee: Scott Chen  (was: Lars Hofhansl)

 Refactoring doMiniBatchPut()
 

 Key: HBASE-5565
 URL: https://issues.apache.org/jira/browse/HBASE-5565
 Project: HBase
  Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen

 HBASE-5542 introduces a RowProcessor interface that can handle multi-row 
 transactions.
 It will make the code cleaner if doMiniBatchPut() can be covered by 
 RowProcessor.


[jira] [Updated] (HBASE-5565) Refactoring doMiniBatchPut()

2012-03-12 Thread Scott Chen (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated HBASE-5565:
--

Assignee: Lars Hofhansl  (was: Scott Chen)

 Refactoring doMiniBatchPut()
 

 Key: HBASE-5565
 URL: https://issues.apache.org/jira/browse/HBASE-5565
 Project: HBase
  Issue Type: Improvement
Reporter: Scott Chen
Assignee: Lars Hofhansl

 HBASE-5542 introduces a RowProcessor interface that can handle multi-row 
 transactions.
 It will make the code cleaner if doMiniBatchPut() can be covered by 
 RowProcessor.


[jira] [Commented] (HBASE-5074) support checksums in HBase block cache

2012-03-12 Thread Mikhail Bautin (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227737#comment-13227737
 ] 

Mikhail Bautin commented on HBASE-5074:
---

@Lars: what I committed was based on D1521.14.patch, but it will not be exactly 
the same patch, because I used arc patch to apply the patch from 
Differential, fixed some minor indentation problems, and committed using the 
git-svn bridge. I also re-ran all the unit tests before the commit. Sorry for 
the delay in replying.


 support checksums in HBase block cache
 --

 Key: HBASE-5074
 URL: https://issues.apache.org/jira/browse/HBASE-5074
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0, 0.96.0

 Attachments: 5074-0.94.txt, D1521.1.patch, D1521.1.patch, 
 D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, 
 D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, 
 D1521.12.patch, D1521.13.patch, D1521.13.patch, D1521.14.patch, 
 D1521.14.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, 
 D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, 
 D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, 
 D1521.9.patch, D1521.9.patch


 The current implementation of HDFS stores the data in one block file and the 
 metadata(checksum) in another block file. This means that every read into the 
 HBase block cache actually consumes two disk iops, one to the datafile and 
 one to the checksum file. This is a major problem for scaling HBase, because 
 HBase is usually bottlenecked on the number of random disk iops that the 
 storage-hardware offers.


[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records


[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227740#comment-13227740
 ] 

stack commented on HBASE-5564:
--

The TreeSet is what's going to be used once the edits make it into the server, 
so losing them in the reducer is probably optimal?  Jesse's ts++, or ts--, 
could be an option?

 Bulkload is discarding duplicate records
 

 Key: HBASE-5564
 URL: https://issues.apache.org/jira/browse/HBASE-5564
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
 Environment: HBase 0.92
Reporter: Laxman
Assignee: Laxman
  Labels: bulkloader

 Duplicate records are getting discarded when they exist in the same input 
 file, and more specifically when they exist in the same split.
 Duplicate records are considered if the records are from different splits.
 Version under test: HBase 0.92


[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-12 Thread Todd Lipcon (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227742#comment-13227742
 ] 

Todd Lipcon commented on HBASE-5564:


I think it's a feature, not a bug, that the timestamps are all identical. The 
whole point is that, in a bulk-load-only workflow, you can identify each bulk 
load exactly, and correlate it to the MR job that inserted it. If you want to 
use custom timestamps, you should specify a timestamp column in your data, or 
write your own MR job (ImportTsv is just an example which is useful for some 
cases, but for anything advanced I would expect users to write their own code).

 Bulkload is discarding duplicate records
 

 Key: HBASE-5564
 URL: https://issues.apache.org/jira/browse/HBASE-5564
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
 Environment: HBase 0.92
Reporter: Laxman
Assignee: Laxman
  Labels: bulkloader

 Duplicate records are getting discarded when duplicate records exist in the same 
 input file, and more specifically if they exist in the same split.
 Duplicate records are considered if the records are from different 
 splits.
 Version under test: HBase 0.92


[jira] [Commented] (HBASE-5347) GC free memory management in Level-1 Block Cache

[
https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227744#comment-13227744
]

stack commented on HBASE-5347:
--

bq. We are not actively pursuing this approach anymore due to
non-reproducibility of GC issues.

That sounds interesting. Was it that there was no discernible difference seen
in GC managing the allocations ourselves? Would love to hear more if there are
lessons to be had Mikhail (and Prakash). Good on you lads.

GC free memory management in Level-1 Block Cache

Key: HBASE-5347
URL: https://issues.apache.org/jira/browse/HBASE-5347
Project: HBase
Issue Type: Improvement
Reporter: Prakash Khemani
Assignee: Prakash Khemani
Attachments: D1635.5.patch

On eviction of a block from the block-cache, instead of waiting for the
garbage collector to reuse its memory, reuse the block right away.
This will require us to keep reference counts on the HFile blocks. Once we
have the reference counts in place we can do our own simple
blocks-out-of-slab allocation for the block-cache.
This will help us with
* reducing gc pressure, especially in the old generation
* making it possible to have non-java-heap memory backing the HFile blocks
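
The reference-counting idea in the description can be sketched roughly as follows. This is only an illustration under assumptions, not the attached patch: `SlabPool` and `RefCountedBlock` are hypothetical names, the slab is a single direct (non-Java-heap) buffer cut into fixed-size blocks, and a block returns to the free list as soon as its last reference is released.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical fixed-size block pool carved out of one off-heap slab.
final class SlabPool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();

    SlabPool(int blockSize, int blockCount) {
        ByteBuffer slab = ByteBuffer.allocateDirect(blockSize * blockCount);
        for (int i = 0; i < blockCount; i++) {
            slab.clear();
            slab.position(i * blockSize);
            slab.limit((i + 1) * blockSize);
            free.push(slab.slice()); // each slice is a view over one block of the slab
        }
    }

    /** Returns a block with one reference held, or null if the slab is exhausted. */
    synchronized RefCountedBlock allocate() {
        ByteBuffer buffer = free.poll();
        return buffer == null ? null : new RefCountedBlock(this, buffer);
    }

    /** Puts a buffer back on the free list so it can be reused right away. */
    synchronized void recycle(ByteBuffer buffer) {
        buffer.clear();
        free.push(buffer);
    }

    synchronized int freeBlocks() {
        return free.size();
    }
}

// Hypothetical cached block; readers retain it while in use, eviction releases it.
final class RefCountedBlock {
    private final SlabPool pool;
    private final ByteBuffer data;
    private final AtomicInteger references = new AtomicInteger(1);

    RefCountedBlock(SlabPool pool, ByteBuffer data) {
        this.pool = pool;
        this.data = data;
    }

    ByteBuffer data() {
        return data;
    }

    void retain() {
        references.incrementAndGet();
    }

    void release() {
        // The memory is handed back only when the last holder lets go.
        if (references.decrementAndGet() == 0) {
            pool.recycle(data);
        }
    }
}
```

In this sketch the cache would hold the initial reference and call `release()` on eviction, while a scanner reading the block would bracket its use with `retain()` and `release()`.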

[jira] [Commented] (HBASE-5563) HRegionInfo#compareTo add the comparison of regionId


[ 
https://issues.apache.org/jira/browse/HBASE-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227748#comment-13227748
 ] 

stack commented on HBASE-5563:
--

Both regions were online?

Does this patch make the newest made region -- the one with the larger regionid 
-- sort AFTER the region with the lesser regionid?  I think it does but am not 
sure.

+1 on this patch...

 HRegionInfo#compareTo add the comparison of regionId
 

 Key: HBASE-5563
 URL: https://issues.apache.org/jira/browse/HBASE-5563
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-5563.patch


 In the one region multi assigned case,  we could find that two regions have 
 the same table name, same startKey, same endKey, and different regionId, so 
 these two regions are same in TreeMap but different in HashMap.
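
The mismatch described above can be reproduced with a simplified stand-in. The sketch below is not HRegionInfo: `RegionKey` is a hypothetical class with String keys, and `WITHOUT_REGION_ID` models an ordering that ignores regionId. It only shows that adding regionId as the final tie-breaker makes the sorted and hashed collections agree, and that under this assumption the larger regionId sorts after the smaller one.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.TreeSet;

// Hypothetical, simplified stand-in for a region descriptor.
final class RegionKey implements Comparable<RegionKey> {
    final String table;
    final String startKey;
    final String endKey;
    final long regionId;

    RegionKey(String table, String startKey, String endKey, long regionId) {
        this.table = table;
        this.startKey = startKey;
        this.endKey = endKey;
        this.regionId = regionId;
    }

    // Ordering that ignores regionId: two assignments of one region compare as equal.
    static final Comparator<RegionKey> WITHOUT_REGION_ID = (a, b) -> {
        int c = a.table.compareTo(b.table);
        if (c != 0) return c;
        c = a.startKey.compareTo(b.startKey);
        if (c != 0) return c;
        return a.endKey.compareTo(b.endKey);
    };

    // Ordering with regionId as the last tie-breaker, consistent with equals().
    @Override
    public int compareTo(RegionKey other) {
        int c = WITHOUT_REGION_ID.compare(this, other);
        return c != 0 ? c : Long.compare(regionId, other.regionId);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof RegionKey && compareTo((RegionKey) o) == 0;
    }

    @Override
    public int hashCode() {
        return Objects.hash(table, startKey, endKey, regionId);
    }

    public static void main(String[] args) {
        List<RegionKey> regions = Arrays.asList(
            new RegionKey("t1", "a", "m", 1L),
            new RegionKey("t1", "a", "m", 2L));

        TreeSet<RegionKey> ignoringId = new TreeSet<>(WITHOUT_REGION_ID);
        ignoringId.addAll(regions);
        TreeSet<RegionKey> withId = new TreeSet<>(regions);
        HashSet<RegionKey> hashed = new HashSet<>(regions);

        System.out.println(ignoringId.size()); // 1: the sorted set merges the two regions
        System.out.println(withId.size());     // 2
        System.out.println(hashed.size());     // 2
    }
}
```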


[jira] [Commented] (HBASE-5563) HRegionInfo#compareTo add the comparison of regionId


[ 
https://issues.apache.org/jira/browse/HBASE-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227749#comment-13227749
 ] 

stack commented on HBASE-5563:
--

Although, that's a bunch of failures.  Want to retry your patch against hadoopqa 
again Chunhui?  Or try the build locally.  Do you see the same number of 
failures?

 HRegionInfo#compareTo add the comparison of regionId
 

 Key: HBASE-5563
 URL: https://issues.apache.org/jira/browse/HBASE-5563
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-5563.patch


 In the one region multi assigned case,  we could find that two regions have 
 the same table name, same startKey, same endKey, and different regionId, so 
 these two regions are same in TreeMap but different in HashMap.


[jira] [Commented] (HBASE-5520) Support reseek() at RegionScanner


[ 
https://issues.apache.org/jira/browse/HBASE-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227765#comment-13227765
 ] 

stack commented on HBASE-5520:
--

We are reseeking to the start of the passed 'row'?  Is that what we want?  I 
thought we were trying to go to start of next row.

If reseeking to the start of the passed-in 'row', then this patch looks fine.

I defer to Lars's opinion though.  I don't know this area of the code well.

 Support reseek() at RegionScanner
 -

 Key: HBASE-5520
 URL: https://issues.apache.org/jira/browse/HBASE-5520
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Anoop Sam John
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.96.0

 Attachments: HBASE-5520_1.patch, HBASE-5520_2.patch, 
 HBASE-5520_3.patch


 reseek() is not supported currently at the RegionScanner level. We can 
 support the same.
 This is created following the discussion under HBASE-2038


[jira] [Updated] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()

2012-03-12 Thread Phabricator (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5542:
---

Attachment: HBASE-5542.D2217.8.patch

sc updated the revision HBASE-5542 [jira] Unify HRegion.mutateRowsWithLocks() 
and HRegion.processRow().
Reviewers: tedyu, lhofhansl, JIRA

  Make the default timeout infinite (no extra thread in this case)

REVISION DETAIL
  https://reviews.facebook.net/D2217

AFFECTED FILES
  
src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRowProcessorEndpoint.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessorProtocol.java
  src/main/java/org/apache/hadoop/hbase/regionserver/BaseRowProcessor.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
  
src/main/java/org/apache/hadoop/hbase/regionserver/MultiRowMutationProcessor.java
  src/main/java/org/apache/hadoop/hbase/regionserver/RowProcessor.java
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowEndpoint.java
  
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRowProcessorEndpoint.java


 Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
 

 Key: HBASE-5542
 URL: https://issues.apache.org/jira/browse/HBASE-5542
 Project: HBase
  Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
 Fix For: 0.96.0

 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, 
 HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, 
 HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch, HBASE-5542.D2217.8.patch


 mutateRowsWithLocks() does atomic mutations on multiple rows.
 processRow() does atomic read-modify-writes on a single row.
 It will be useful to generalize both and have a
 processRowsWithLocks() that does atomic read-modify-writes on multiple rows.
 This also helps reduce some redundancy in the codes.


[jira] [Commented] (HBASE-5560) Avoid RegionServer GC caused by timed-out calls


[ 
https://issues.apache.org/jira/browse/HBASE-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227803#comment-13227803
 ] 

stack commented on HBASE-5560:
--

@Dhruba

In trunk:

{code}
src/main/java/org/apache/hadoop/hbase/HConstants.java:  public static int DEFAULT_HBASE_RPC_TIMEOUT = 6;

...

In HConnectionManager.

  this.rpcTimeout = conf.getInt(
  HConstants.HBASE_RPC_TIMEOUT_KEY,
  HConstants.DEFAULT_HBASE_RPC_TIMEOUT);


{code}

 Avoid RegionServer GC caused by timed-out calls
 ---

 Key: HBASE-5560
 URL: https://issues.apache.org/jira/browse/HBASE-5560
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0, 0.96.0

 Attachments: D2241.1.patch, D2241.2.patch, D2241.3.patch


 The HBaseRpcServer queues up rpc responses if the socket connection to the 
 client is not yet ready to receive data. Calls are queued here until a 15 
 minute timeout occurs. I am able to generate a full GC when I artificially 
 make a client read rpc-responses very slowly. This jira is to make this 15 
 minute time configurable.
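
As a rough sketch of what a configurable purge timeout could look like (not the attached patch), the class below queues pending responses with their enqueue time and drops those older than a timeout passed in by the caller. `ResponseQueueSketch`, its methods, and the millisecond clock parameter are all hypothetical; in the real server the value would presumably come from the configuration.

```java
import java.util.ArrayDeque;

// Hypothetical queue of responses waiting for a slow client socket.
final class ResponseQueueSketch {
    private static final class PendingResponse {
        final long enqueuedAtMs;
        final int sizeBytes;

        PendingResponse(long enqueuedAtMs, int sizeBytes) {
            this.enqueuedAtMs = enqueuedAtMs;
            this.sizeBytes = sizeBytes;
        }
    }

    private final ArrayDeque<PendingResponse> pending = new ArrayDeque<>();
    private final long purgeTimeoutMs;

    // The timeout is injected instead of being a hard-coded 15 minutes.
    ResponseQueueSketch(long purgeTimeoutMs) {
        this.purgeTimeoutMs = purgeTimeoutMs;
    }

    void enqueue(int sizeBytes, long nowMs) {
        pending.addLast(new PendingResponse(nowMs, sizeBytes));
    }

    /** Drops responses that have waited at least purgeTimeoutMs; returns how many were dropped. */
    int purgeExpired(long nowMs) {
        int purged = 0;
        // Responses are queued in arrival order, so the oldest is always at the head.
        while (!pending.isEmpty()
                && nowMs - pending.peekFirst().enqueuedAtMs >= purgeTimeoutMs) {
            pending.pollFirst();
            purged++;
        }
        return purged;
    }

    int size() {
        return pending.size();
    }
}
```

A shorter timeout bounds how long response buffers for an unresponsive client stay reachable, which is the memory pressure the issue describes.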


[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()


[ 
https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227816#comment-13227816
 ] 

Phabricator commented on HBASE-5542:


tedyu has commented on the revision HBASE-5542 [jira] Unify 
HRegion.mutateRowsWithLocks() and HRegion.processRow().

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/RowProcessor.java:93 
Should read 'switch off'
  src/main/java/org/apache/hadoop/hbase/regionserver/BaseRowProcessor.java:51 
If positive timeout is specified, new Thread would be spawned for both cases.
  Is that desirable ?

REVISION DETAIL
  https://reviews.facebook.net/D2217


 Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
 

 Key: HBASE-5542
 URL: https://issues.apache.org/jira/browse/HBASE-5542
 Project: HBase
  Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
 Fix For: 0.96.0

 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, 
 HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, 
 HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch, HBASE-5542.D2217.8.patch


 mutateRowsWithLocks() does atomic mutations on multiple rows.
 processRow() does atomic read-modify-writes on a single row.
 It will be useful to generalize both and have a
 processRowsWithLocks() that does atomic read-modify-writes on multiple rows.
 This also helps reduce some redundancy in the codes.


[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()

[
https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227817#comment-13227817
]

Hadoop QA commented on HBASE-5542:
--

-1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12518068/HBASE-5542.D2217.8.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 4 new or modified tests.

-1 javadoc. The javadoc tool appears to have generated -120 warning
messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 158 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:

org.apache.hadoop.hbase.coprocessor.TestRowProcessorEndpoint

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1166//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1166//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1166//console

This message is automatically generated.

Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()

Key: HBASE-5542
URL: https://issues.apache.org/jira/browse/HBASE-5542
Project: HBase
Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
Fix For: 0.96.0

Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch,
HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch,
HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch, HBASE-5542.D2217.8.patch

[jira] [Updated] (HBASE-5562) test-patch.sh reports a javadoc warning when there are no new javadoc warnings

2012-03-12 Thread stack (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5562:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to trunk and 0.94.  Thanks for the patch David.

 test-patch.sh reports a javadoc warning when there are no new javadoc warnings
 --

 Key: HBASE-5562
 URL: https://issues.apache.org/jira/browse/HBASE-5562
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.94.0, 0.96.0
Reporter: David S. Wang
Assignee: David S. Wang
Priority: Trivial
 Fix For: 0.94.0, 0.96.0

 Attachments: test-patch.patch


 test-patch.sh will report new javadoc warnings when there are not any.  e.g.
 -1 javadoc.  The javadoc tool appears to have generated -123 warning 
 messages.  


[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()


[ 
https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227857#comment-13227857
 ] 

Phabricator commented on HBASE-5542:


sc has commented on the revision HBASE-5542 [jira] Unify 
HRegion.mutateRowsWithLocks() and HRegion.processRow().

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/BaseRowProcessor.java:53 
Ted: there is a return statement here. I will add a comment saying that it 
short-circuits the negative timeout case.
  src/main/java/org/apache/hadoop/hbase/regionserver/RowProcessor.java:93 
Good catch. Thanks!
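
The short circuit under discussion can be pictured with the following sketch. It is not the code from D2217: `TimedRunnerSketch` and its helpers are hypothetical, and it assumes that a negative timeout means "no time bound", in which case the work runs on the calling thread and no extra thread is created. How a zero timeout should behave is not stated in the thread, so the sketch simply treats it like any other non-negative value.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedRunnerSketch {
    /** Runs the task inline for a negative timeout, otherwise on a helper thread with a bound. */
    static <T> T run(Callable<T> task, long timeoutMs) throws Exception {
        if (timeoutMs < 0) {
            // Short circuit: no time bound requested, so no extra thread is spawned.
            return task.call();
        }
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<T> future = executor.submit(task);
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        } finally {
            executor.shutdownNow(); // interrupts the task if it is still running
        }
    }

    /** Demo helper: reports whether the task ran on the caller's thread. */
    static boolean runsOnCallerThread(long timeoutMs) {
        Thread caller = Thread.currentThread();
        try {
            return run(() -> Thread.currentThread(), timeoutMs) == caller;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Demo helper: reports whether a task sleeping taskMs exceeds the given timeout. */
    static boolean timesOut(long taskMs, long timeoutMs) {
        try {
            run(() -> { Thread.sleep(taskMs); return null; }, timeoutMs);
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The trade-off raised in the review is visible here: any bounded call pays for a thread hand-off, which is why an unbounded default avoids that cost for callers that do not need a time limit.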

REVISION DETAIL
  https://reviews.facebook.net/D2217


 Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
 

 Key: HBASE-5542
 URL: https://issues.apache.org/jira/browse/HBASE-5542
 Project: HBase
  Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
 Fix For: 0.96.0

 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, 
 HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, 
 HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch, HBASE-5542.D2217.8.patch


 mutateRowsWithLocks() does atomic mutations on multiple rows.
 processRow() does atomic read-modify-writes on a single row.
 It will be useful to generalize both and have a
 processRowsWithLocks() that does atomic read-modify-writes on multiple rows.
 This also helps reduce some redundancy in the codes.


[jira] [Updated] (HBASE-5566) [89-fb] Region server can get stuck getMaster on master failover

2012-03-12 Thread Mikhail Bautin (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-5566:
--

Reporter: Prakash Khemani  (was: Mikhail Bautin)

 [89-fb] Region server can get stuck getMaster on master failover
 

 Key: HBASE-5566
 URL: https://issues.apache.org/jira/browse/HBASE-5566
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb
Reporter: Prakash Khemani
Assignee: Mikhail Bautin

 Reported by Prakash. We have a retry loop in HRegionServer.getMaster where we 
 do not read the location of the master from ZK, so a region server can get 
 stuck there on master failover. We need to add a unit test to reliably catch 
 this, and fix the bug.
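
One way to picture the fix is a retry loop that looks up the master location again on every attempt instead of reusing the address it started with. The sketch below is not HRegionServer code: `MasterLookupSketch`, the `Supplier` standing in for the ZK read, the `Predicate` standing in for the connection attempt, and the host names are all hypothetical, and backoff between attempts is omitted.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Predicate;
import java.util.function.Supplier;

final class MasterLookupSketch {
    /**
     * Tries to reach the master, re-reading its address before each attempt.
     * Returns the address that answered, or null if every attempt failed.
     */
    static String connectToMaster(Supplier<String> readMasterAddress,
                                  Predicate<String> tryConnect,
                                  int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // Fresh lookup on each iteration, so a failover is noticed.
            String address = readMasterAddress.get();
            if (address != null && tryConnect.test(address)) {
                return address;
            }
        }
        return null;
    }

    /** Demo: the published address changes between the first and second attempt. */
    static String demoFailover() {
        Iterator<String> published =
            Arrays.asList("old-master:60000", "new-master:60000").iterator();
        return connectToMaster(published::next, "new-master:60000"::equals, 2);
    }

    public static void main(String[] args) {
        System.out.println(demoFailover()); // new-master:60000
    }
}
```

If the lookup were hoisted out of the loop, the demo would keep retrying the old address and never succeed, which is the stuck state the issue describes.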


[jira] [Created] (HBASE-5566) [89-fb] Region server can get stuck getMaster on master failover

2012-03-12 Thread Mikhail Bautin (Created) (JIRA)

[89-fb] Region server can get stuck getMaster on master failover


 Key: HBASE-5566
 URL: https://issues.apache.org/jira/browse/HBASE-5566
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin


Reported by Prakash. We have a retry loop in HRegionServer.getMaster where we 
do not read the location of the master from ZK, so a region server can get 
stuck there on master failover. We need to add a unit test to reliably catch 
this, and fix the bug.



[jira] [Updated] (HBASE-5566) [89-fb] Region server can get stuck getMaster on master failover

2012-03-12 Thread Mikhail Bautin (Updated) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Mikhail Bautin updated HBASE-5566:
--

Description:
This is specific to the 89-fb master. We have a retry loop in
HRegionServer.getMaster where we do not read the location of the master from
ZK, so a region server can get stuck there on master failover. We need to add a
unit test to reliably catch this, and fix the bug.

was:
Reported by Prakash. We have a retry loop in HRegionServer.getMaster where we
do not read the location of the master from ZK, so a region server can get
stuck there on master failover. We need to add a unit test to reliably catch
this, and fix the bug.

[89-fb] Region server can get stuck getMaster on master failover

Key: HBASE-5566
URL: https://issues.apache.org/jira/browse/HBASE-5566
Project: HBase
Issue Type: Bug
Affects Versions: 0.89-fb
Reporter: Prakash Khemani
Assignee: Mikhail Bautin

This is specific to the 89-fb master. We have a retry loop in
HRegionServer.getMaster where we do not read the location of the master from
ZK, so a region server can get stuck there on master failover. We need to add
a unit test to reliably catch this, and fix the bug.

[jira] [Updated] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK


 [ 
https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-5206:
--

Attachment: 5206_trunk-v2.patch

With patch v2, TestDrainingServer passes.

 Port HBASE-5155 to 0.92 and TRUNK
 -

 Key: HBASE-5206
 URL: https://issues.apache.org/jira/browse/HBASE-5206
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.2, 0.96.0
Reporter: Zhihong Yu
 Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 
 5206_trunk-v2.patch, 5206_trunk_1.patch, 5206_trunk_latest_1.patch


 This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should 
 not happen parallely leading to recreation of regions that were deleted) to 
 0.92 and TRUNK


[jira] [Commented] (HBASE-4608) HLog Compression


[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227869#comment-13227869
 ] 

stack commented on HBASE-4608:
--

Is HLog versioned? If not, perhaps instead of a HConstants.WAL_COMPRESSION_VER, 
add a WAL_VERSION metadata field.  Then have another for compression type (NONE 
or this)?

bq. For TestLRUDictionary, please outline the combinations that should be added.

Does it not look bare to you?   I'd think that we'd try a paragraph of text 
going in and out... perhaps test multiple dictionaries in the one file?

 HLog Compression
 

 Key: HBASE-4608
 URL: https://issues.apache.org/jira/browse/HBASE-4608
 Project: HBase
  Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi
 Fix For: 0.94.0

 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 
 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 
 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt


 The current bottleneck to HBase write speed is replicating the WAL appends 
 across different datanodes. We can speed up this process by compressing the 
 HLog. Current plan involves using a dictionary to compress table name, region 
 id, cf name, and possibly other bits of repeated data. Also, HLog format may 
 be changed in other ways to produce a smaller HLog.
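
For readers unfamiliar with the dictionary approach, here is a minimal sketch of an LRU dictionary that maps repeated byte strings to short indices. It is an assumption-laden illustration, not the LRUDictionary class from the patch: `LruDictionarySketch`, `findOrAdd`, `getEntry`, and `NOT_IN_DICTIONARY` are hypothetical names, and eviction reuses the slot of the least recently used entry.

```java
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

final class LruDictionarySketch {
    static final short NOT_IN_DICTIONARY = -1;

    private final int capacity;
    // Access-ordered map: iteration starts at the least recently used entry.
    private final LinkedHashMap<String, Short> indexByEntry =
        new LinkedHashMap<>(16, 0.75f, true);
    private final String[] entryByIndex;

    LruDictionarySketch(int capacity) {
        this.capacity = capacity;
        this.entryByIndex = new String[capacity];
    }

    /**
     * Returns the index of a known entry, so the writer can emit the index alone.
     * For an unknown entry, adds it and returns NOT_IN_DICTIONARY, so the writer
     * emits the full bytes once.
     */
    short findOrAdd(byte[] data) {
        // ISO-8859-1 maps every byte to one char, so the key is lossless.
        String key = new String(data, StandardCharsets.ISO_8859_1);
        Short index = indexByEntry.get(key); // get() marks the entry as recently used
        if (index != null) {
            return index;
        }
        short slot;
        if (indexByEntry.size() < capacity) {
            slot = (short) indexByEntry.size();
        } else {
            // Evict the least recently used entry and reuse its slot.
            Iterator<Map.Entry<String, Short>> eldest = indexByEntry.entrySet().iterator();
            slot = eldest.next().getValue();
            eldest.remove();
        }
        indexByEntry.put(key, slot);
        entryByIndex[slot] = key;
        return NOT_IN_DICTIONARY;
    }

    /** Reader side: resolves an index back to the original bytes. */
    byte[] getEntry(short index) {
        return entryByIndex[index].getBytes(StandardCharsets.ISO_8859_1);
    }
}
```

A reader that replays the same sequence of lookups and additions ends up with the same slot assignments, which is what lets the log store a short index in place of a repeated table name or column family name.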


[jira] [Updated] (HBASE-5566) [89-fb] Region server can get stuck in getMaster on master failover

2012-03-12 Thread Mikhail Bautin (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-5566:
--

Summary: [89-fb] Region server can get stuck in getMaster on master 
failover  (was: [89-fb] Region server can get stuck getMaster on master 
failover)

 [89-fb] Region server can get stuck in getMaster on master failover
 ---

 Key: HBASE-5566
 URL: https://issues.apache.org/jira/browse/HBASE-5566
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb
Reporter: Prakash Khemani
Assignee: Mikhail Bautin

 This is specific to the 89-fb master. We have a retry loop in 
 HRegionServer.getMaster where we do not read the location of the master from 
 ZK, so a region server can get stuck there on master failover. We need to add 
 a unit test to reliably catch this, and fix the bug.


[jira] [Commented] (HBASE-5074) support checksums in HBase block cache


[ 
https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227871#comment-13227871
 ] 

Lars Hofhansl commented on HBASE-5074:
--

Thanks Mikhail just making sure :)

 support checksums in HBase block cache
 --

 Key: HBASE-5074
 URL: https://issues.apache.org/jira/browse/HBASE-5074
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0, 0.96.0

 Attachments: 5074-0.94.txt, D1521.1.patch, D1521.1.patch, 
 D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, 
 D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, 
 D1521.12.patch, D1521.13.patch, D1521.13.patch, D1521.14.patch, 
 D1521.14.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, 
 D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, 
 D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, 
 D1521.9.patch, D1521.9.patch


 The current implementation of HDFS stores the data in one block file and the 
 metadata(checksum) in another block file. This means that every read into the 
 HBase block cache actually consumes two disk iops, one to the datafile and 
 one to the checksum file. This is a major problem for scaling HBase, because 
 HBase is usually bottlenecked on the number of random disk iops that the 
 storage-hardware offers.


[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()


[ 
https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227875#comment-13227875
 ] 

Phabricator commented on HBASE-5542:


tedyu has commented on the revision HBASE-5542 [jira] Unify 
HRegion.mutateRowsWithLocks() and HRegion.processRow().

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/BaseRowProcessor.java:53 
If hbase.hregion.row.processor.timeout carries positive value, new Thread 
would be spawned for mutateRowsWithLocks().
  Does this satisfy Lars' comment @ 12/Mar/12 16:33 ?

REVISION DETAIL
  https://reviews.facebook.net/D2217


 Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
 

 Key: HBASE-5542
 URL: https://issues.apache.org/jira/browse/HBASE-5542
 Project: HBase
  Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
 Fix For: 0.96.0

 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, 
 HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, 
 HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch, HBASE-5542.D2217.8.patch


 mutateRowsWithLocks() does atomic mutations on multiple rows.
 processRow() does atomic read-modify-writes on a single row.
 It will be useful to generalize both and have a
 processRowsWithLocks() that does atomic read-modify-writes on multiple rows.
 This also helps reduce some redundancy in the codes.


[jira] [Updated] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()

2012-03-12 Thread Phabricator (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5542:
---

Attachment: HBASE-5542.D2217.9.patch

sc updated the revision HBASE-5542 [jira] Unify HRegion.mutateRowsWithLocks() 
and HRegion.processRow().
Reviewers: tedyu, lhofhansl, JIRA

  Addressed Ted's review comments, Thanks!

REVISION DETAIL
  https://reviews.facebook.net/D2217

AFFECTED FILES
  
src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRowProcessorEndpoint.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessorProtocol.java
  src/main/java/org/apache/hadoop/hbase/regionserver/BaseRowProcessor.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
  
src/main/java/org/apache/hadoop/hbase/regionserver/MultiRowMutationProcessor.java
  src/main/java/org/apache/hadoop/hbase/regionserver/RowProcessor.java
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowEndpoint.java
  
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRowProcessorEndpoint.java


 Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
 

 Key: HBASE-5542
 URL: https://issues.apache.org/jira/browse/HBASE-5542
 Project: HBase
  Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
 Fix For: 0.96.0

 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, 
 HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, 
 HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch, HBASE-5542.D2217.8.patch, 
 HBASE-5542.D2217.9.patch


 mutateRowsWithLocks() does atomic mutations on multiple rows.
 processRow() does atomic read-modify-writes on a single row.
 It will be useful to generalize both and have a
 processRowsWithLocks() that does atomic read-modify-writes on multiple rows.
 This also helps reduce some redundancy in the codes.


[jira] [Commented] (HBASE-4608) HLog Compression

[
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227892#comment-13227892
]

Ted Yu commented on HBASE-4608:
---

Introducing WAL_VERSION would imply that we may change aspects of HLog other than
compression in the future.
Is there a plan for the above?
Having another compression type is nice but requires making HLogKey persistence
pluggable.

I think it would be better to introduce one meta entry instead of two.

HLog Compression

Key: HBASE-4608
URL: https://issues.apache.org/jira/browse/HBASE-4608
Project: HBase
Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi
Fix For: 0.94.0

Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt,
4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt,
4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt

The current bottleneck to HBase write speed is replicating the WAL appends
across different datanodes. We can speed up this process by compressing the
HLog. Current plan involves using a dictionary to compress table name, region
id, cf name, and possibly other bits of repeated data. Also, HLog format may
be changed in other ways to produce a smaller HLog.

[jira] [Updated] (HBASE-5198) PerformanceEvaluation --presplit option should not recreate TestTable if it exists

2012-03-12 Thread Sujee Maniyam (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sujee Maniyam updated HBASE-5198:
-

Attachment: (was: PerformanceEvaluation_HBASE_5198.patch)

 PerformanceEvaluation --presplit option should not recreate TestTable if it 
 exists
 --

 Key: HBASE-5198
 URL: https://issues.apache.org/jira/browse/HBASE-5198
 Project: HBase
  Issue Type: Improvement
  Components: util
Reporter: Sujee Maniyam
Assignee: Sujee Maniyam
Priority: Minor
  Labels: benchmark
 Attachments: PerformanceEvaluation_HBASE_5198_2.patch


 --presplit option was added in 
 https://issues.apache.org/jira/browse/HBASE-4440 
 It drops and re-creates TestTable if the table exists.
 This behavior is inconsistent with previous behavior. If TestTable exists, it 
 shouldn't be re-created.
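
A minimal sketch of the behavior requested above, in Java: create TestTable
only when it is missing, and leave an existing table alone. TableAdmin,
InMemoryAdmin and ensureTable are hypothetical names made up for this
illustration; they are not the HBaseAdmin API, and the attached patch may be
structured differently.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration; none of these names come from HBase itself.
class PresplitSketch {
  /** Assumed minimal admin surface: existence check plus creation. */
  interface TableAdmin {
    boolean tableExists(String name);
    void createTable(String name, int presplitRegions);
  }

  /**
   * Creates the table only if it is missing.
   * Returns true if the table was created, false if an existing table was kept.
   */
  static boolean ensureTable(TableAdmin admin, String name, int presplitRegions) {
    if (admin.tableExists(name)) {
      return false; // never drop and re-create an existing table
    }
    admin.createTable(name, presplitRegions);
    return true;
  }

  /** In-memory fake used only to exercise the sketch. */
  static class InMemoryAdmin implements TableAdmin {
    final Set<String> tables = new HashSet<>();
    int createCalls = 0;

    @Override public boolean tableExists(String name) {
      return tables.contains(name);
    }

    @Override public void createTable(String name, int presplitRegions) {
      createCalls++;
      tables.add(name);
    }
  }
}
```

With this shape, a second run with a different presplit count is a no-op
rather than a silent drop of the existing data.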


[jira] [Commented] (HBASE-4608) HLog Compression


[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227898#comment-13227898
 ] 

stack commented on HBASE-4608:
--

In TestLRUDictionary, we test a single entry in essence.  We should try it w/ 
all kinds of rubbish... really long entries, empty entries, null entries, 
similar entries... a dictionary with 32k worth of stuff, as we'll see in the 
wild.  So I'd think?

A test for the new class KeyValueCompression would be good to have too.


enableCompression is an odd name for this method.  Should it be 
setCompressionContext since that is what it does (you pass null if no 
compression)... seems odd passing null to 'enableCompression'

Should the Compression class in the wal package have more javadoc comments 
explaining the kind of compression it does?  Otherwise, it looks like a 
generic compressor class when in fact it's a one-trick pony.

Should this method, WALCompressionEnabled, be isWALCompressionEnabled?

I like your idea of versioning the WAL

Patch is coming along nicely.  Almost there.

 HLog Compression
 

 Key: HBASE-4608
 URL: https://issues.apache.org/jira/browse/HBASE-4608
 Project: HBase
  Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi
 Fix For: 0.94.0

 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 
 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 
 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt


 The current bottleneck to HBase write speed is replicating the WAL appends 
 across different datanodes. We can speed up this process by compressing the 
 HLog. Current plan involves using a dictionary to compress table name, region 
 id, cf name, and possibly other bits of repeated data. Also, HLog format may 
 be changed in other ways to produce a smaller HLog.


[jira] [Updated] (HBASE-5198) PerformanceEvaluation --presplit option should not recreate TestTable if it exists

2012-03-12 Thread Sujee Maniyam (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sujee Maniyam updated HBASE-5198:
-

Attachment: PerformanceEvaluation_HBASE_5198_2.patch

 PerformanceEvaluation --presplit option should not recreate TestTable if it 
 exists
 --

 Key: HBASE-5198
 URL: https://issues.apache.org/jira/browse/HBASE-5198
 Project: HBase
  Issue Type: Improvement
  Components: util
Reporter: Sujee Maniyam
Assignee: Sujee Maniyam
Priority: Minor
  Labels: benchmark
 Attachments: PerformanceEvaluation_HBASE_5198_2.patch


 --presplit option was added in 
 https://issues.apache.org/jira/browse/HBASE-4440 
 It drops and re-creates TestTable if the table exists.
 This behavior is inconsistent with previous behavior. If TestTable exists, it 
 shouldn't be re-created.


[jira] [Commented] (HBASE-5520) Support reseek() at RegionScanner


[ 
https://issues.apache.org/jira/browse/HBASE-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227900#comment-13227900
 ] 

Lars Hofhansl commented on HBASE-5520:
--

Generally it would be very useful for coprocessors to have this ability!
After looking at the discussion here (the restrictions imposed on what KVs one 
can pass - previous ones, row only, column family needs to be ignored, etc.) 
we just need to be careful how we add this.

For this specific issue it would be nice to get some comparative performance 
numbers between this and filters. Maybe we're missing a level of abstraction...?


 Support reseek() at RegionScanner
 -

 Key: HBASE-5520
 URL: https://issues.apache.org/jira/browse/HBASE-5520
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Anoop Sam John
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.96.0

 Attachments: HBASE-5520_1.patch, HBASE-5520_2.patch, 
 HBASE-5520_3.patch


 reseek() is not supported currently at the RegionScanner level. We can 
 support the same.
 This is created following the discussion under HBASE-2038


[jira] [Commented] (HBASE-4608) HLog Compression


[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227901#comment-13227901
 ] 

Ted Yu commented on HBASE-4608:
---

bq. try a paragraph of text going in and out
LRUDictionary deals with byte array:
{code}
  public short findEntry(byte[] data, int offset, int length) {
{code}
In this regard, piping text into the dictionary is functionally the same as 
piping the byte[] form of an integer.
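
To make the point concrete, here is a small Java sketch of an LRU dictionary
keyed on raw bytes. It is a hypothetical stand-in, not the LRUDictionary from
the patch: the class name, the NOT_FOUND value and the add-on-miss behavior
are assumptions made for illustration. Only the findEntry signature is taken
from the snippet above.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the class from the patch.
class ByteArrayLruSketch {
  static final short NOT_FOUND = -1;

  private final int capacity;
  // accessOrder=true: iteration runs from least to most recently used.
  private final LinkedHashMap<ByteBuffer, Short> entries =
      new LinkedHashMap<>(16, 0.75f, true);

  ByteArrayLruSketch(int capacity) {
    if (capacity < 1 || capacity > Short.MAX_VALUE) {
      throw new IllegalArgumentException("capacity must fit in a short");
    }
    this.capacity = capacity;
  }

  /**
   * Returns the index of the entry if present, else NOT_FOUND.
   * On a miss the bytes are added, evicting the least recently used
   * entry (and reusing its index) when the dictionary is full.
   */
  short findEntry(byte[] data, int offset, int length) {
    // The key is a copy of the bytes; their meaning (text, int, ...) is irrelevant.
    ByteBuffer key = ByteBuffer.wrap(Arrays.copyOfRange(data, offset, offset + length));
    Short index = entries.get(key); // a hit also refreshes recency
    if (index != null) {
      return index;
    }
    short assigned;
    if (entries.size() >= capacity) {
      Iterator<Map.Entry<ByteBuffer, Short>> it = entries.entrySet().iterator();
      assigned = it.next().getValue(); // least recently used entry
      it.remove();
    } else {
      assigned = (short) entries.size();
    }
    entries.put(key, assigned);
    return NOT_FOUND;
  }
}
```

Text bytes and integer bytes take the same code path here, which is Ted's
point; the cases stack asks for (empty entries, long entries, eviction under
load) are still worth exercising separately because they hit different
branches.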

 HLog Compression
 

 Key: HBASE-4608
 URL: https://issues.apache.org/jira/browse/HBASE-4608
 Project: HBase
  Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi
 Fix For: 0.94.0

 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 
 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 
 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt


 The current bottleneck to HBase write speed is replicating the WAL appends 
 across different datanodes. We can speed up this process by compressing the 
 HLog. Current plan involves using a dictionary to compress table name, region 
 id, cf name, and possibly other bits of repeated data. Also, HLog format may 
 be changed in other ways to produce a smaller HLog.


[jira] [Updated] (HBASE-5198) PerformanceEvaluation --presplit option should not recreate TestTable if it exists

2012-03-12 Thread Sujee Maniyam (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sujee Maniyam updated HBASE-5198:
-

Status: Patch Available  (was: Open)

regenerated using --no-prefix option for git

 PerformanceEvaluation --presplit option should not recreate TestTable if it 
 exists
 --

 Key: HBASE-5198
 URL: https://issues.apache.org/jira/browse/HBASE-5198
 Project: HBase
  Issue Type: Improvement
  Components: util
Reporter: Sujee Maniyam
Assignee: Sujee Maniyam
Priority: Minor
  Labels: benchmark
 Attachments: PerformanceEvaluation_HBASE_5198_2.patch


 --presplit option was added in 
 https://issues.apache.org/jira/browse/HBASE-4440 
 It drops and re-creates TestTable if the table exists.
 This behavior is inconsistent with previous behavior. If TestTable exists, it 
 shouldn't be re-created.


[jira] [Commented] (HBASE-4608) HLog Compression

[
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227908#comment-13227908
]

stack commented on HBASE-4608:
--

It's the test of a single entry only, which is not really exercising much.

bq. Introducing WAL_VERSION would imply that we may change HLog aspect other
than compression in the future. Is there plan for the above ?

I've not heard of any. Is that your argument for not adding a version?
Because if there has been no discussion of change up to this, we wouldn't
possibly need to change the format in the future?

HLog Compression

Key: HBASE-4608
URL: https://issues.apache.org/jira/browse/HBASE-4608
Project: HBase
Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi
Fix For: 0.94.0

[jira] [Commented] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK

[
https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227915#comment-13227915
]

Hadoop QA commented on HBASE-5206:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12518079/5206_trunk-v2.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 15 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 159 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.TestZooKeeper

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1167//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1167//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1167//console

This message is automatically generated.

Port HBASE-5155 to 0.92 and TRUNK
-

This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should
not happen parallely leading to recreation of regions that were deleted) to
0.92 and TRUNK

[jira] [Commented] (HBASE-5561) Create HFileSystemFactory


[ 
https://issues.apache.org/jira/browse/HBASE-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227920#comment-13227920
 ] 

stack commented on HBASE-5561:
--

+1

 Create HFileSystemFactory
 -

 Key: HBASE-5561
 URL: https://issues.apache.org/jira/browse/HBASE-5561
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur

 The HFileSystem object can be used to paper over differences in HDFS 
 versions. Create it using a factory object.


[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-12 Thread stack (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5399:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

I tried it locally and saw some flakiness in the medium test runs... They 
don't seem to pass reliably anymore, but that's with or without this patch.  I'm 
committing to trunk before this fat patch rots.  We can come back to work on 
the flaky tests after it goes in.  Thanks Nicolas for the nice patch.  Applied 
to trunk.

 Cut the link between the client and the zookeeper ensemble
 --

 Key: HBASE-5399
 URL: https://issues.apache.org/jira/browse/HBASE-5399
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 
 5399.v40.patch, 5399.v41.patch, 5399.v42.patch, 5399.v42.patch, 
 5399.v42.patch, 5399.v42.patch, 5399_inprogress.patch, 
 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 
 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 
 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 
 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 
 5399_inprogress.v9.patch, nochange.patch


 The link is often considered an issue, for various reasons. One of them 
 is that there is a limit on the number of connections that ZK can manage. 
 Stack also suggested removing the link to the master from HConnection.
 There are choices to be made considering the existing API (that we don't want 
 to break).
 The first patches I will submit on hadoop-qa should not be committed: they 
 are here to show the progress on the direction taken.
 ZooKeeper is used for:
 - public getter, to let the client do whatever he wants, and close ZooKeeper 
 when closing the connection => we have to deprecate this but keep it.
 - read get master address to create a master => now done with a temporary 
 zookeeper connection
 - read root location => now done with a temporary zookeeper connection, but 
 questionable. Used in public function locateRegion. To be reworked.
 - read cluster id => now done once with a temporary zookeeper connection.
 - check if base done is available => now done once with a zookeeper 
 connection given as a parameter
 - isTableDisabled/isTableAvailable => public functions, now done with a 
 temporary zookeeper connection.
  - Called internally from HBaseAdmin and HTable
 - getCurrentNrHRS(): public function to get the number of region servers and 
 create a pool of threads => now done with a temporary zookeeper connection
 Master is used for:
 - getMaster public getter, as for ZooKeeper => we have to deprecate this but 
 keep it.
 - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin
 - getHTableDescriptor*: public functions offering access to the master. => 
 we could make them use a temporary master connection as well.
 Main points are:
 - hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a 
 strongly coupled architecture ;-). This can be changed, but requires a lot of 
 modifications in these classes (likely adding a class in the middle of the 
 hierarchy, something like that). Anyway, a non-connected client will always be 
 much slower, because it's a tcp connection, and establishing a tcp 
 connection is slow.
 - having a link between ZK and all the clients seems to make sense for some 
 use cases. However, it won't scale if a TCP connection is required for every 
 client
 - if we move the table descriptor part away from the client, we need to find 
 a new place for it.
 - we will have the same issue in HBaseAdmin (for both ZK & Master); maybe we 
 can put a timeout on the connection. That would make the whole system less 
 deterministic however.


[jira] [Commented] (HBASE-4542) add filter info to slow query logging


[ 
https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227921#comment-13227921
 ] 

stack commented on HBASE-4542:
--

+1 for 0.94.  This is good stuff.

 add filter info to slow query logging
 -

 Key: HBASE-4542
 URL: https://issues.apache.org/jira/browse/HBASE-4542
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.89.20100924
Reporter: Kannan Muthukkaruppan
Assignee: Madhuwanti Vaidya
 Fix For: 0.94.0, 0.96.0

 Attachments: 
 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, 
 Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, 
 D1263.2.patch, D1539.1.patch


 Slow query log doesn't report filters in effect.
 For example:
 {code}
 (operationTooSlow): \
 {processingtimems:3468,client:10.138.43.206:40035,timeRange: 
 [0,9223372036854775807],\
 starttimems:1317772005821,responsesize:42411, \
 class:HRegionServer,table:myTable,families:{CF1:[ALL]},\
 row:6c3b8efa132f0219b7621ed1e5c8c70b,queuetimems:0,\
 method:get,totalColumns:1,maxVersions:1,storeLimit:-1}
 {code}
 the above would suggest that all columns of myTable:CF1 are being requested 
 for the given row. But in reality there could be filters in effect (such as 
 ColumnPrefixFilter, ColumnRangeFilter, TimestampsFilter() etc.). We should 
 enhance the slow query log to capture & report this information.


[jira] [Commented] (HBASE-4542) add filter info to slow query logging


[ 
https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227923#comment-13227923
 ] 

stack commented on HBASE-4542:
--

And Lars, I think it should be safe since it's only toStringery that is going on 
in this patch... no changes in Interfaces.

 add filter info to slow query logging
 -

 Key: HBASE-4542
 URL: https://issues.apache.org/jira/browse/HBASE-4542
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.89.20100924
Reporter: Kannan Muthukkaruppan
Assignee: Madhuwanti Vaidya
 Fix For: 0.94.0, 0.96.0

 Attachments: 
 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, 
 Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, 
 D1263.2.patch, D1539.1.patch


 Slow query log doesn't report filters in effect.
 For example:
 {code}
 (operationTooSlow): \
 {processingtimems:3468,client:10.138.43.206:40035,timeRange: 
 [0,9223372036854775807],\
 starttimems:1317772005821,responsesize:42411, \
 class:HRegionServer,table:myTable,families:{CF1:[ALL]},\
 row:6c3b8efa132f0219b7621ed1e5c8c70b,queuetimems:0,\
 method:get,totalColumns:1,maxVersions:1,storeLimit:-1}
 {code}
 the above would suggest that all columns of myTable:CF1 are being requested 
 for the given row. But in reality there could be filters in effect (such as 
 ColumnPrefixFilter, ColumnRangeFilter, TimestampsFilter() etc.). We should 
 enhance the slow query log to capture & report this information.


[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records


[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227926#comment-13227926
 ] 

Lars Hofhansl commented on HBASE-5564:
--

So this is only about ImportTsv? Should change the title in that case.

I agree with Todd, at least for ImportTsv.
Import/Export should not (and hopefully do not) exhibit this behavior (since we 
want to be able to import/export KVs with multiple versions).


 Bulkload is discarding duplicate records
 

 Key: HBASE-5564
 URL: https://issues.apache.org/jira/browse/HBASE-5564
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
 Environment: HBase 0.92
Reporter: Laxman
Assignee: Laxman
  Labels: bulkloader

 Duplicate records are discarded when they exist in the same 
 input file and, more specifically, in the same split.
 Duplicate records are considered if the records are from different 
 splits.
 Version under test: HBase 0.92


[jira] [Commented] (HBASE-4608) HLog Compression


[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227929#comment-13227929
 ] 

stack commented on HBASE-4608:
--

It's a regular pattern only.  Perhaps this does some decent testing?  
TestWALReplayCompressed?

 HLog Compression
 

 Key: HBASE-4608
 URL: https://issues.apache.org/jira/browse/HBASE-4608
 Project: HBase
  Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi
 Fix For: 0.94.0

 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 
 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 
 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt


 The current bottleneck to HBase write speed is replicating the WAL appends 
 across different datanodes. We can speed up this process by compressing the 
 HLog. Current plan involves using a dictionary to compress table name, region 
 id, cf name, and possibly other bits of repeated data. Also, HLog format may 
 be changed in other ways to produce a smaller HLog.


[jira] [Commented] (HBASE-5562) test-patch.sh reports a javadoc warning when there are no new javadoc warnings


[ 
https://issues.apache.org/jira/browse/HBASE-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227936#comment-13227936
 ] 

Hudson commented on HBASE-5562:
---

Integrated in HBase-0.94 #26 (See 
[https://builds.apache.org/job/HBase-0.94/26/])
HBASE-5562 test-patch.sh reports a javadoc warning when there are no new 
javadoc warnings (Revision 1299825)

 Result = FAILURE
stack : 
Files : 
* /hbase/branches/0.94/dev-support/test-patch.sh


 test-patch.sh reports a javadoc warning when there are no new javadoc warnings
 --

 Key: HBASE-5562
 URL: https://issues.apache.org/jira/browse/HBASE-5562
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.94.0, 0.96.0
Reporter: David S. Wang
Assignee: David S. Wang
Priority: Trivial
 Fix For: 0.94.0, 0.96.0

 Attachments: test-patch.patch


 test-patch.sh will report new javadoc warnings when there are not any.  e.g.
 -1 javadoc.  The javadoc tool appears to have generated -123 warning 
 messages.  


[jira] [Commented] (HBASE-4608) HLog Compression

[
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227934#comment-13227934
]

stack commented on HBASE-4608:
--

The tests do not have variety. I think we should add it here rather than wait
for the variety to hit out in the field.

bq. If only compression would evolve, I think checking against compression type
metadata would be adequate.

The above begins with a conditional, If

HLog Compression

Key: HBASE-4608
URL: https://issues.apache.org/jira/browse/HBASE-4608
Project: HBase
Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi
Fix For: 0.94.0

[jira] [Updated] (HBASE-5567) test-patch.sh has logic error in findbugs check

2012-03-12 Thread David S. Wang (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David S. Wang updated HBASE-5567:
-

Attachment: HBASE_5567.patch

 test-patch.sh has logic error in findbugs check
 ---

 Key: HBASE-5567
 URL: https://issues.apache.org/jira/browse/HBASE-5567
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.94.0, 0.96.0
Reporter: David S. Wang
Assignee: David S. Wang
Priority: Trivial
  Labels: noob
 Attachments: HBASE_5567.patch


 Similar bug to HBASE-5562.  The wrong operator is used in the check against 
 $OK_FINDBUGS_WARNINGS.


[jira] [Updated] (HBASE-5567) test-patch.sh has logic error in findbugs check

2012-03-12 Thread David S. Wang (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David S. Wang updated HBASE-5567:
-

Status: Patch Available  (was: Open)

 test-patch.sh has logic error in findbugs check
 ---

 Key: HBASE-5567
 URL: https://issues.apache.org/jira/browse/HBASE-5567
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.94.0, 0.96.0
Reporter: David S. Wang
Assignee: David S. Wang
Priority: Trivial
  Labels: noob
 Attachments: HBASE_5567.patch


 Similar bug to HBASE-5562.  The wrong operator is used in the check against 
 $OK_FINDBUGS_WARNINGS.


[jira] [Commented] (HBASE-5567) test-patch.sh has logic error in findbugs check

2012-03-12 Thread David S. Wang (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227941#comment-13227941
 ] 

David S. Wang commented on HBASE-5567:
--

Sorry I didn't catch this one earlier ... would have included it in the 
previous patch.

 test-patch.sh has logic error in findbugs check
 ---

 Key: HBASE-5567
 URL: https://issues.apache.org/jira/browse/HBASE-5567
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.94.0, 0.96.0
Reporter: David S. Wang
Assignee: David S. Wang
Priority: Trivial
  Labels: noob
 Attachments: HBASE_5567.patch


 Similar bug to HBASE-5562.  The wrong operator is used in the check against 
 $OK_FINDBUGS_WARNINGS.


[jira] [Commented] (HBASE-5548) Add ability to get a table in the shell

[
https://issues.apache.org/jira/browse/HBASE-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227944#comment-13227944
]

stack commented on HBASE-5548:
--

Looks good.

Does stuff work as it used to after this patch goes in? Can you do the old
dumb:

get 'tablename', 'yadda', 'yadda'

still?

On get_table, should it just be table? 'status' dumps out status, 'version',
dumps out 'version'. Maybe it should stay get_table, because it returns
something you can catch into a variable?

Oh, this is nice:

{code}
+ hbase> t1 = create 't1', 'f1'
{code}

How do I add a filter? Is it the same old: t.scan, {STARTROW => 'xyz'} or
something?

Patch is looking good I think. Introduces a nice bit of functionality...

Add ability to get a table in the shell
---

Key: HBASE-5548
URL: https://issues.apache.org/jira/browse/HBASE-5548
Project: HBase
Issue Type: Improvement
Components: shell
Reporter: Jesse Yates
Assignee: Jesse Yates
Fix For: 0.96.0

Attachments: ruby_HBASE-5528-v0.patch, ruby_HBASE-5548-v1.patch

Currently, all the commands that operate on a table in the shell first have
to take the table name as input.
There are two main considerations:
* It is annoying to have to write the table name every time, when you should
just be able to get a reference to a table
* the current implementation is very wasteful - it creates a new HTable for
each call (but reuses the connection since it uses the same configuration)
We should be able to get a handle to a single HTable and then operate on that.

[jira] [Commented] (HBASE-4608) HLog Compression

[
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227946#comment-13227946
]

Ted Yu commented on HBASE-4608:
---

I think WAL_VERSION metadata is orthogonal to compression type metadata and I
would expect both to be present in new HLog files written with this feature.
Say we define WAL_VERSION as v2 which has WAL compression capability. We still
need to check compression type metadata before applying dictionary compression.
In this regard adding WAL_VERSION seems to be redundant.
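
A small Java sketch of the argument above, under stated assumptions: the
metadata key names WAL_VERSION and COMPRESSION_TYPE, the values "2" and
"DICTIONARY", and the class WalHeaderSketch are all hypothetical and chosen
only for illustration; the final metadata design is still under discussion in
this thread.

```java
import java.util.Map;

// Hypothetical header check; key names and values are assumptions.
class WalHeaderSketch {
  /** Decides whether a reader should apply dictionary decompression. */
  static boolean usesDictionaryCompression(Map<String, String> meta) {
    int version = Integer.parseInt(meta.getOrDefault("WAL_VERSION", "1"));
    String type = meta.getOrDefault("COMPRESSION_TYPE", "NONE");
    // A v2 file may or may not be compressed, so the version alone cannot
    // decide; the compression type entry has to be consulted either way.
    return version >= 2 && "DICTIONARY".equals(type);
  }
}
```

In this sketch a v2 file without a compression type entry is read as
uncompressed, which is the sense in which the version entry adds nothing to
the compression decision. It would still matter if the format changed in some
other way, which is stack's counterpoint.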

HLog Compression

Key: HBASE-4608
URL: https://issues.apache.org/jira/browse/HBASE-4608
Project: HBase
Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi
Fix For: 0.94.0

[jira] [Updated] (HBASE-5567) test-patch.sh has logic error in findbugs check

2012-03-12 Thread stack (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5567:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
   0.94.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed 0.94 and trunk.  Thanks for the patch David.

 test-patch.sh has logic error in findbugs check
 ---

 Key: HBASE-5567
 URL: https://issues.apache.org/jira/browse/HBASE-5567
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.94.0, 0.96.0
Reporter: David S. Wang
Assignee: David S. Wang
Priority: Trivial
  Labels: noob
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE_5567.patch


 Similar bug to HBASE-5562.  The wrong operator is used in the check against 
 $OK_FINDBUGS_WARNINGS.


[jira] [Commented] (HBASE-4608) HLog Compression


[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227954#comment-13227954
 ] 

Ted Yu commented on HBASE-4608:
---

bq. Should the Compression class in wal package ...
I only see KeyValueCompression.java under the wal package. Please clarify which 
class should carry more comments.

 HLog Compression
 

 Key: HBASE-4608
 URL: https://issues.apache.org/jira/browse/HBASE-4608
 Project: HBase
  Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi
 Fix For: 0.94.0

 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 
 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 
 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt


 The current bottleneck to HBase write speed is replicating the WAL appends 
 across different datanodes. We can speed up this process by compressing the 
 HLog. Current plan involves using a dictionary to compress table name, region 
 id, cf name, and possibly other bits of repeated data. Also, HLog format may 
 be changed in other ways to produce a smaller HLog.
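
As a rough illustration of the dictionary idea in the description above: the first time a value (table name, region id, cf name) is seen it is written in full and assigned an index, and later occurrences are written as the index only. This is a minimal sketch under stated assumptions, not the format used by the attached patches; SimpleDictionary and the "N:"/"R:" token encoding are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only; not the dictionary or wire format used by the patch.
final class SimpleDictionary {
  private final Map<String, Integer> indexOf = new HashMap<>();
  private final List<String> entries = new ArrayList<>();

  // Returns the existing index, or -1 after adding the value as a new entry.
  int findOrAdd(String value) {
    Integer idx = indexOf.get(value);
    if (idx != null) {
      return idx;
    }
    indexOf.put(value, entries.size());
    entries.add(value);
    return -1;
  }

  String get(int index) {
    return entries.get(index);
  }

  // Writer side: "N:<literal>" for a new value, "R:<index>" for a repeat.
  static String encode(SimpleDictionary dict, String value) {
    int idx = dict.findOrAdd(value);
    return idx < 0 ? "N:" + value : "R:" + idx;
  }

  // Reader side: rebuilds the same dictionary while decoding.
  static String decode(SimpleDictionary dict, String token) {
    String body = token.substring(2);
    if (token.startsWith("N:")) {
      dict.findOrAdd(body);
      return body;
    }
    return dict.get(Integer.parseInt(body));
  }
}
```

The reader rebuilds its dictionary in the same order as the writer, so no dictionary has to be stored separately; the saving comes from repeated names shrinking to a short index.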


[jira] [Commented] (HBASE-4608) HLog Compression


[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227961#comment-13227961
 ] 

Ted Yu commented on HBASE-4608:
---

Uploaded v23 onto review board.
After WAL version metadata design is finalized, will add that.

 HLog Compression
 

 Key: HBASE-4608
 URL: https://issues.apache.org/jira/browse/HBASE-4608
 Project: HBase
  Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi
 Fix For: 0.94.0

 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 
 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 
 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt


 The current bottleneck to HBase write speed is replicating the WAL appends 
 across different datanodes. We can speed up this process by compressing the 
 HLog. Current plan involves using a dictionary to compress table name, region 
 id, cf name, and possibly other bits of repeated data. Also, HLog format may 
 be changed in other ways to produce a smaller HLog.


[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()

[
https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227964#comment-13227964
]

Phabricator commented on HBASE-5542:

lhofhansl has commented on the revision HBASE-5542 [jira] Unify
HRegion.mutateRowsWithLocks() and HRegion.processRow().

In the latest patch I can't find rowProcessorTimeout being set anywhere.

What I had in mind was this:
We are now using this for internal operations (such as mutateRow) and
external operations (those created by users). The internal operations should
not be creating new threads. External operations could either always be time
bound, or it could be left up to the implementer.
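
One way to read the suggestion above, as a minimal sketch only: internal operations run inline on the calling thread, while user-supplied operations go through an executor and are abandoned after a timeout. BoundedRunner, runInternal and runExternal are hypothetical names and are not taken from the patch under review.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not HBase code.
final class BoundedRunner {
  private final ExecutorService pool = Executors.newCachedThreadPool();

  // Internal operations: run on the caller's thread, no new thread, no timeout.
  <T> T runInternal(Callable<T> op) throws Exception {
    return op.call();
  }

  // External (user-supplied) operations: run in the pool, give up after timeoutMs.
  <T> T runExternal(Callable<T> op, long timeoutMs) throws Exception {
    Future<T> future = pool.submit(op);
    try {
      return future.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      future.cancel(true); // interrupt the worker thread
      throw e;
    } catch (ExecutionException e) {
      // Surface the operation's own exception where possible.
      Throwable cause = e.getCause();
      if (cause instanceof Exception) {
        throw (Exception) cause;
      }
      throw e;
    }
  }

  void shutdown() {
    pool.shutdownNow();
  }
}
```

Leaving the timeout as a parameter keeps the "up to the implementer" option open: a caller that does not want a bound can use the inline path instead.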

REVISION DETAIL
https://reviews.facebook.net/D2217

Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()

Key: HBASE-5542
URL: https://issues.apache.org/jira/browse/HBASE-5542
Project: HBase
Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
Fix For: 0.96.0

[jira] [Commented] (HBASE-4608) HLog Compression


[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227969#comment-13227969
 ] 

stack commented on HBASE-4608:
--

bq. I think WAL_VERSION metadata is orthogonal to compression type metadata and 
I would expect both to be present in new HLog files written with this feature.

How does it get in if you don't add it?

If you don't want to add it, just don't.  I'm not going to +1 this patch though 
if it adds metadata about a new compression feature w/o introducing a general 
versioning on the WAL.

bq. Should the Compression class in wal package ...

The compression class in wal is Compressor.java.

I have trouble following your responses to my comments because they come in w/o 
context and also because they are done piecemeal, which means I have to spend way 
more time than I should have to reviewing your stuff.  I'd suggest you save up 
your comments and submit them in a lump rather than hit submit per comment; 
you'll use up less internet.

 HLog Compression
 

 Key: HBASE-4608
 URL: https://issues.apache.org/jira/browse/HBASE-4608
 Project: HBase
  Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi
 Fix For: 0.94.0

 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 
 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 
 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt


 The current bottleneck to HBase write speed is replicating the WAL appends 
 across different datanodes. We can speed up this process by compressing the 
 HLog. Current plan involves using a dictionary to compress table name, region 
 id, cf name, and possibly other bits of repeated data. Also, HLog format may 
 be changed in other ways to produce a smaller HLog.


[jira] [Commented] (HBASE-5548) Add ability to get a table in the shell

2012-03-12 Thread Jesse Yates (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227975#comment-13227975
]

Jesse Yates commented on HBASE-5548:

bq. get 'tablename', 'yadda', 'yadda'

Yeah, nothing changes there. In fact, scan (which is the only change that patch
introduces) still will work exactly the same way. Same deal for doing something
like t.scan

bq. On get_table, should it just be table? 'status' dumps out status,
'version', dumps out 'version'. Maybe it should stay get_table, because it
returns something you can catch into a variable?

I was thinking it should stay get_table since table would clutter the overall
naming a little (Table is also the name of the ::Hbase::Table class) and yeah,
it implies holding a reference, which _all_ the other commands don't give you.
I would imagine doing
{code}
t = table 't1'
{code}
as actually creating the table if it doesn't exist, or just getting a reference
to it (similar to how 'create' works in the latest patch). A follow-on patch
would be nice there.

I want to think about the 'right' way to do the extension of ::Hbase::Table,
rather than this piecemeal, method-by-method, decentralized increase. Also,
adding the same type of system for put, get, delete, etc would be nice. But,
yeah, close!

Add ability to get a table in the shell
---

Key: HBASE-5548
URL: https://issues.apache.org/jira/browse/HBASE-5548
Project: HBase
Issue Type: Improvement
Components: shell
Reporter: Jesse Yates
Assignee: Jesse Yates
Fix For: 0.96.0

Attachments: ruby_HBASE-5528-v0.patch, ruby_HBASE-5548-v1.patch

[jira] [Commented] (HBASE-5292) getsize per-CF metric incorrectly counts compaction related reads as well


[ 
https://issues.apache.org/jira/browse/HBASE-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227978#comment-13227978
 ] 

Hudson commented on HBASE-5292:
---

Integrated in HBase-TRUNK #2676 (See 
[https://builds.apache.org/job/HBase-TRUNK/2676/])
[jira] [HBASE-5292] Prevent counting getSize on compactions

Author: Zhiqiu Kong

Summary:
Added two separate metrics for both get() and next(). This is done by
refactoring the internal next() API. To be more specific, only Get.get()
and ResultScanner.next() pass the metric name (getsize and
nextsize respectively) to
  HRegion::RegionScanner::next(List<KeyValue>, String)

This will eventually hit StoreScanner::next(List<KeyValue>,
int, String) where the metrics are counted.

And their call paths are:

1) Get

HTable::get(final Get get)
=> HRegionServer::get(byte [] regionName, Get get)
=> HRegion::get(final Get get, final Integer lockid)
=> HRegion::get(final Get get)  [pass METRIC_GETSIZE to the callee]

=> HRegion::RegionScanner::next(List<KeyValue> outResults, String metric)
=> HRegion::RegionScanner::next(List<KeyValue> outResults, int limit,
   String metric)
=> HRegion::RegionScanner::nextInternal(int limit, String metric)
=> KeyValueHeap::next(List<KeyValue> result, int limit, String metric)
=> StoreScanner::next(List<KeyValue> outResult, int limit, String metric)

2) Next

HTable::ClientScanner::next()
=> ScannerCallable::call()
=> HRegionServer::next(long scannerId)
=> HRegionServer::next(final long scannerId, int nbRows)  [pass
   METRIC_NEXTSIZE to the callee]

=> HRegion::RegionScanner::next(List<KeyValue> outResults, String metric)
=> HRegion::RegionScanner::next(List<KeyValue> outResults, int limit,
   String metric)
=> HRegion::RegionScanner::nextInternal(int limit, String metric)
=> KeyValueHeap::next(List<KeyValue> result, int limit, String metric)
=> StoreScanner::next(List<KeyValue> outResult, int limit, String metric)

Test Plan:
1. Passed unit tests.
2. Created a testcase TestRegionServerMetrics::testGetNextSize to
guarantee:
 * Get/Next contributes to getsize/nextsize metrics
 * Both getsize/nextsize are per Column Family
 * Flush/compaction won't affect these two metrics

Reviewed By: mbautin

Reviewers: Kannan, mbautin, Liyin, JIRA

CC: Kannan, mbautin, Liyin, zhiqiu

Differential Revision: https://reviews.facebook.net/D1617 (Revision 1299147)

 Result = FAILURE
mbautin : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/InternalScanner.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java


 getsize per-CF metric incorrectly counts compaction related reads as well 
 --

 Key: HBASE-5292
 URL: https://issues.apache.org/jira/browse/HBASE-5292
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89.20100924
Reporter: Kannan Muthukkaruppan
 Fix For: 0.94.0

 Attachments: 
 0001-jira-HBASE-5292-Prevent-counting-getSize-on-compacti.patch, 
 D1527.1.patch, D1527.2.patch, D1527.3.patch, D1527.4.patch, D1617.1.patch, 
 jira-HBASE-5292-Prevent-counting-getSize-on-compacti-2012-03-09_13_26_52.patch


 The per-CF getsize metric's intent was to track bytes returned (to HBase 
 clients) per-CF. [Note: We already have metrics to track # of HFileBlock's 
 read for compaction vs. non-compaction cases -- e.g., compactionblockreadcnt 
 vs. fsblockreadcnt.]
 Currently, the getsize metric gets updated for both client initiated 
 Get/Scan operations as well for compaction related reads. The metric is 
 updated in StoreScanner.java:next() when the Scan query matcher returns an 
 INCLUDE* code via a:
  HRegion.incrNumericMetric(this.metricNameGetsize, copyKv.getLength());
 We should not do the above in case of compactions.
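
A minimal sketch of the approach the commit summary describes, assuming the metric name is threaded through next() and that internal readers such as compactions pass no name: only reads that carry a metric name are counted. CountingScanner and its fields are hypothetical stand-ins, not the HBase classes; the metric names getsize and nextsize are the ones mentioned in the summary.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a store scanner; not HBase code.
final class CountingScanner {
  static final String METRIC_GETSIZE = "getsize";
  static final String METRIC_NEXTSIZE = "nextsize";

  private final Map<String, Long> metrics = new HashMap<>();
  private final List<byte[]> cells;
  private int pos = 0;

  CountingScanner(List<byte[]> cells) {
    this.cells = cells;
  }

  // metric == null marks an internal read (e.g. compaction): nothing is counted.
  // Returns true while more cells remain.
  boolean next(List<byte[]> out, String metric) {
    if (pos >= cells.size()) {
      return false;
    }
    byte[] cell = cells.get(pos++);
    out.add(cell);
    if (metric != null) {
      Long old = metrics.get(metric);
      metrics.put(metric, (old == null ? 0L : old) + cell.length);
    }
    return pos < cells.size();
  }

  long metric(String name) {
    Long value = metrics.get(name);
    return value == null ? 0L : value;
  }
}
```

Because the caller chooses the metric name, the scanner itself does not need to know whether it is serving a Get, a Scan or a compaction.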


[jira] [Commented] (HBASE-5535) Make the functions in task monitor synchronized


[ 
https://issues.apache.org/jira/browse/HBASE-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227976#comment-13227976
 ] 

Hudson commented on HBASE-5535:
---

Integrated in HBase-TRUNK #2676 (See 
[https://builds.apache.org/job/HBase-TRUNK/2676/])
HBASE-5535 Make the functions in task monitor synchronized (Revision 
1299088)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/monitoring/TaskMonitor.java


 Make the functions in task monitor synchronized
 ---

 Key: HBASE-5535
 URL: https://issues.apache.org/jira/browse/HBASE-5535
 Project: HBase
  Issue Type: Bug
Reporter: Liyin Tang
Assignee: Liyin Tang
 Fix For: 0.94.0

 Attachments: 
 HBASE-5535-Make-the-functions-in-task-monitor-synchr-2012-03-08_16_33_42.patch


 There are some potential race conditions in the task monitor, so update the 
 functions in task monitor to be synchronized.
 The example of the problem caused by the race condition:
 ERROR org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flush 
 failed for region 
 java.lang.IndexOutOfBoundsException: Index: 1745, Size: 1744
 at java.util.ArrayList.add(ArrayList.java:367)
 at java.util.SubList.add(AbstractList.java:633)
 at java.util.SubList.add(AbstractList.java:633)
 at java.util.SubList.add(AbstractList.java:633)
 at java.util.SubList.add(AbstractList.java:633)
 at java.util.SubList.add(AbstractList.java:633)
 at java.util.AbstractList.add(AbstractList.java:91)
 at 
 org.apache.hadoop.hbase.monitoring.TaskMonitor.createStatus(TaskMonitor.java:74)
 at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1139)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:260)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:234)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:146)
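
The trace above is consistent with several threads mutating one ArrayList without coordination. A minimal sketch of the remedy the issue title names, assuming a bounded list of task descriptions: every method that touches the list is synchronized, and readers get a copy. SimpleTaskMonitor and MAX_TASKS are hypothetical and are not the HBase TaskMonitor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a task monitor; not the HBase class.
final class SimpleTaskMonitor {
  private static final int MAX_TASKS = 1000;
  private final List<String> tasks = new ArrayList<>();

  // synchronized: the size check, eviction and add happen as one atomic step.
  synchronized void createStatus(String description) {
    if (tasks.size() >= MAX_TASKS) {
      tasks.remove(0); // drop the oldest entry
    }
    tasks.add(description);
  }

  // synchronized: callers get a snapshot copy, never the live list.
  synchronized List<String> getTasks() {
    return new ArrayList<>(tasks);
  }
}
```

Returning a copy matters as much as the lock: handing out the live list (or a subList view of it) would let callers read or modify it outside the monitor.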


[jira] [Commented] (HBASE-4542) add filter info to slow query logging


[ 
https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227980#comment-13227980
 ] 

Hudson commented on HBASE-4542:
---

Integrated in HBase-TRUNK #2676 (See 
[https://builds.apache.org/job/HBase-TRUNK/2676/])
[jira] [HBASE-4542] Add filter info to slow query logging

Author: Zhiqiu Kong

Summary:
The original 89-fb diff is: https://reviews.facebook.net/D1263

Slow operation log does not provide enough information when a filter is
present. The following was done to add the filter info:

1) Added toString() method for filters inheriting FilterBase; this
affects 22 filters and their subclasses. The info added includes the
filter's name and its members. For example, for TimestampsFilter, we'll
output its class name as well as the defined timestamps.

2) Added a field 'filter' in Get::toMap() and Scan::toMap() to enable
the logging of filter info.

Task ID: #750975

Blame Rev:

Test Plan:
1. Ran and passed unit tests to make sure it does not break anything

2. Ran kannan's script to trigger the slow operation logging and checked
each filter to make sure the filter info was logged. In more
detail, the output log is as follows (only the 'filter' field is shown
here for ease of reading):

filter:TimestampsFilter (3/3): [2, 3, 5]
filter:TimestampsFilter (5/6): [2, 3, 5, 7, 11]
filter:ColumnPrefixFilter col2
filter:ColumnRangeFilter [col2a, col2b]
filter:ColumnCountGetFilter 8
filter:ColumnPaginationFilter (4, 4)
filter:InclusiveStopFilter row
filter:PrefixFilter row
filter:PageFilter 1
filter:SkipFilter TimestampsFilter (1/1): [1000]
filter:WhileMatchFilter TimestampsFilter (3/3): [2, 3, 5]
filter:KeyOnlyFilter
filter:FirstKeyOnlyFilter
filter:MultipleColumnPrefixFilter (3/3): [a, b, c]
filter:DependentColumnFilter (family, qualifier, true, LESS, value)
filter:FamilyFilter (LESS, value)
filter:QualifierFilter (LESS, value)
filter:RowFilter (LESS, value)
filter:ValueFilter (LESS, value)
filter:KeyOnlyFilter
filter:FirstKeyOnlyFilter
filter:SingleColumnValueFilter (family, qualifier, EQUAL, value)
filter:SingleColumnValueExcludeFilter (family, qualifier, EQUAL, value)
filter:FilterList AND (2/2): [KeyOnlyFilter, FirstKeyOnlyFilter]
Please check ~zhiqiu/Codes/scripts/testFilter.rb for the testing script.

3. Added unit test cases to TestOperation to verify the filters'
toString() method works well.
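
To illustrate the kind of toString() the summary describes, here is a minimal sketch that prints a class name, a shown/total count and a truncated list of timestamps, in the style of the "(5/6): [2, 3, 5, 7, 11]" lines above. DemoTimestampsFilter and MAX_LOG_TIMESTAMPS are hypothetical; the limit of five is only inferred from that sample output, and this is not the HBase TimestampsFilter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical filter used only to illustrate the log format; not the HBase class.
final class DemoTimestampsFilter {
  private static final int MAX_LOG_TIMESTAMPS = 5; // assumed limit
  private final TreeSet<Long> timestamps = new TreeSet<>();

  DemoTimestampsFilter(List<Long> ts) {
    timestamps.addAll(ts);
  }

  @Override
  public String toString() {
    return toString(MAX_LOG_TIMESTAMPS);
  }

  // Prints "<name> (<shown>/<total>): [t1, t2, ...]", truncated to maxItems.
  String toString(int maxItems) {
    List<Long> shown = new ArrayList<>();
    for (Long t : timestamps) {
      if (shown.size() >= maxItems) {
        break;
      }
      shown.add(t);
    }
    return String.format("%s (%d/%d): %s", getClass().getSimpleName(),
        shown.size(), timestamps.size(), shown);
  }
}
```

Capping the number of printed members keeps a slow-query log line bounded even when a filter holds thousands of values.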

Reviewed By: mbautin

Reviewers: Kannan, madhuvaidya, mbautin, JIRA

CC: Kannan, madhuvaidya, mbautin, zhiqiu, stack

Differential Revision: https://reviews.facebook.net/D1539 (Revision 1299019)

 Result = FAILURE
mbautin : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Get.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ColumnCountGetFilter.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ColumnPaginationFilter.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ColumnPrefixFilter.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ColumnRangeFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/CompareFilter.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/DependentColumnFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/FilterBase.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/InclusiveStopFilter.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/MultipleColumnPrefixFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/PageFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/PrefixFilter.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/SkipFilter.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/TimestampsFilter.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/WhileMatchFilter.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestOperation.java


 add filter info to slow query logging
 -

 Key: HBASE-4542
 URL: https://issues.apache.org/jira/browse/HBASE-4542
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.89.20100924
Reporter: Kannan Muthukkaruppan
Assignee: Madhuwanti Vaidya
 Fix For: 0.94.0, 0.96.0

 Attachments: 
 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, 
 Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, 
 D1263.2.patch, D1539.1.patch


 Slow query log doesn't report filters in effect.
 For example:
 {code}
 (operationTooSlow): \

[jira] [Commented] (HBASE-5552) Clean up our jmx view; its a bit of a mess


[ 
https://issues.apache.org/jira/browse/HBASE-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227977#comment-13227977
 ] 

Hudson commented on HBASE-5552:
---

Integrated in HBase-TRUNK #2676 (See 
[https://builds.apache.org/job/HBase-TRUNK/2676/])
HBASE-5552 Clean up our jmx view; its a bit of a mess (Revision 1298923)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPCStatistics.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/metrics/HBaseInfo.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


 Clean up our jmx view; its a bit of a mess
 --

 Key: HBASE-5552
 URL: https://issues.apache.org/jira/browse/HBASE-5552
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
Priority: Blocker
 Fix For: 0.92.1, 0.94.0

 Attachments: 0.92.0jmx.png, 5552.txt, currentjmxview.png, 
 patchedjmxview.png


 Fix before we release 0.92.1


[jira] [Commented] (HBASE-5555) add a pointer to a dns verification utility in hbase book/dns


[ 
https://issues.apache.org/jira/browse/HBASE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227979#comment-13227979
 ] 

Hudson commented on HBASE-5555:
---

Integrated in HBase-TRUNK #2676 (See 
[https://builds.apache.org/job/HBase-TRUNK/2676/])
HBASE-5555 add a pointer to a dns verification utility in hbase book/dns 
(Revision 1299073)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/src/docbkx/configuration.xml


 add a pointer to a dns verification utility in hbase book/dns
 -

 Key: HBASE-5555
 URL: https://issues.apache.org/jira/browse/HBASE-
 Project: HBase
  Issue Type: Improvement
  Components: documentation
Reporter: Sujee Maniyam
Assignee: Sujee Maniyam
Priority: Minor
 Fix For: 0.96.0

 Attachments: .txt


 DNS should work correctly in an HBase cluster.  I have a simple DNS checker 
 utility that verifies DNS on all machines of the cluster: 
 https://github.com/sujee/hadoop-dns-checker
 Add a pointer to the tool in the hbase book: 
 http://hbase.apache.org/book.html#dns


[jira] [Commented] (HBASE-5562) test-patch.sh reports a javadoc warning when there are no new javadoc warnings


[ 
https://issues.apache.org/jira/browse/HBASE-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227981#comment-13227981
 ] 

Hudson commented on HBASE-5562:
---

Integrated in HBase-TRUNK #2676 (See 
[https://builds.apache.org/job/HBase-TRUNK/2676/])
HBASE-5562 test-patch.sh reports a javadoc warning when there are no new 
javadoc warnings (Revision 1299824)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/dev-support/test-patch.sh


 test-patch.sh reports a javadoc warning when there are no new javadoc warnings
 --

 Key: HBASE-5562
 URL: https://issues.apache.org/jira/browse/HBASE-5562
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.94.0, 0.96.0
Reporter: David S. Wang
Assignee: David S. Wang
Priority: Trivial
 Fix For: 0.94.0, 0.96.0

 Attachments: test-patch.patch


 test-patch.sh will report new javadoc warnings when there are not any.  e.g.
 -1 javadoc.  The javadoc tool appears to have generated -123 warning 
 messages.  
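
A negative count like the one above suggests the reported number is a difference against an accepted baseline. The following is a sketch of the intended arithmetic only, assuming that reading; the real logic lives in the test-patch.sh shell script, and WarningCheck and its parameters are hypothetical.

```java
// Hypothetical check mirroring the intent; the actual fix is in a shell script.
final class WarningCheck {
  // Number of warnings above the accepted baseline, never negative.
  static int newWarnings(int observed, int okBaseline) {
    return Math.max(0, observed - okBaseline);
  }

  // Vote -1 only when the patch adds warnings beyond the baseline.
  static String vote(int observed, int okBaseline) {
    int extra = newWarnings(observed, okBaseline);
    return extra > 0
        ? "-1 javadoc. The javadoc tool appears to have generated " + extra
            + " warning messages."
        : "+1 javadoc. The javadoc tool did not generate any warning messages.";
  }
}
```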


[jira] [Commented] (HBASE-5538) A metric to measure the size of the response queue in the hbase rpc server


[ 
https://issues.apache.org/jira/browse/HBASE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227982#comment-13227982
 ] 

Hudson commented on HBASE-5538:
---

Integrated in HBase-TRUNK #2676 (See 
[https://builds.apache.org/job/HBase-TRUNK/2676/])
HBASE-5538 A metric to measure the size of the response queue in the hbase 
rpc server (Revision 1299084)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRpcMetrics.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java


 A metric to measure the size of the response queue in the hbase rpc server
 --

 Key: HBASE-5538
 URL: https://issues.apache.org/jira/browse/HBASE-5538
 Project: HBase
  Issue Type: Improvement
  Components: ipc
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.96.0

 Attachments: D2199.1.patch


 The HBaseServer queues responses to a client (if the client is slow). Expose a 
 metric that records the size of the response queue.
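
A minimal sketch of such a gauge, assuming "size" means bytes of queued response data (it could equally be a count of responses): the server adds to the value when a response is parked and subtracts once it has been written out. ResponseQueueGauge and its method names are hypothetical, not the code in the attached patch.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; not the HBaseServer implementation.
final class ResponseQueueGauge {
  private final AtomicLong queuedBytes = new AtomicLong();

  // Called when a response is parked because the client is not reading fast enough.
  void onEnqueue(int responseBytes) {
    queuedBytes.addAndGet(responseBytes);
  }

  // Called once the response has been fully written to the client.
  void onDequeue(int responseBytes) {
    queuedBytes.addAndGet(-responseBytes);
  }

  // Value a metrics reporter would publish.
  long currentSize() {
    return queuedBytes.get();
  }
}
```

An AtomicLong lets handler and responder threads update the gauge without taking a lock on the queue itself.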


[jira] [Commented] (HBASE-4608) HLog Compression


[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227984#comment-13227984
 ] 

Ted Yu commented on HBASE-4608:
---

For code specific review, please use https://reviews.apache.org/r/4185/ where 
there would be context.

I can add WAL_VERSION as v2 in the metadata.
My question is: would HLog v2 be allowed not to compress Log entries ?

If desirable, we can discuss in more detail, face to face, on the 27th.
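
If the version and the compression type are kept as separate metadata entries, as suggested earlier in this thread, a v2 file without compression is representable. The sketch below shows only that separation; WalHeader, the WAL_COMPRESSION key and the DICTIONARY value are assumptions made for illustration, and only the WAL_VERSION name comes from the discussion.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical header model; key names other than WAL_VERSION are assumptions.
final class WalHeader {
  static final String VERSION_KEY = "WAL_VERSION";
  static final String COMPRESSION_KEY = "WAL_COMPRESSION"; // assumed name
  static final int CURRENT_VERSION = 2;

  final Map<String, String> metadata = new HashMap<>();

  static WalHeader create(boolean compress) {
    WalHeader header = new WalHeader();
    header.metadata.put(VERSION_KEY, String.valueOf(CURRENT_VERSION));
    if (compress) {
      header.metadata.put(COMPRESSION_KEY, "DICTIONARY");
    }
    return header;
  }

  // Files written before versioning carry no key and are treated as version 1.
  int version() {
    String value = metadata.get(VERSION_KEY);
    return value == null ? 1 : Integer.parseInt(value);
  }

  // Compression is decided by its own key, independent of the version.
  boolean isCompressed() {
    return metadata.containsKey(COMPRESSION_KEY);
  }

  // A reader refuses files newer than it understands.
  boolean isReadable() {
    return version() <= CURRENT_VERSION;
  }
}
```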

 HLog Compression
 

 Key: HBASE-4608
 URL: https://issues.apache.org/jira/browse/HBASE-4608
 Project: HBase
  Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi
 Fix For: 0.94.0

 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 
 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 
 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt


 The current bottleneck to HBase write speed is replicating the WAL appends 
 across different datanodes. We can speed up this process by compressing the 
 HLog. Current plan involves using a dictionary to compress table name, region 
 id, cf name, and possibly other bits of repeated data. Also, HLog format may 
 be changed in other ways to produce a smaller HLog.


[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble


[ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227983#comment-13227983
 ] 

Hudson commented on HBASE-5399:
---

Integrated in HBase-TRUNK #2676 (See 
[https://builds.apache.org/job/HBase-TRUNK/2676/])
HBASE-5399 Cut the link between the client and the zookeeper ensemble 
(Revision 1299872)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/MasterNotRunningException.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HConnection.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MasterKeepAliveConnection.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ZooKeeperKeepAliveConnection.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/Merge.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestDeadServer.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestRollingRestart.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationPeer.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/rest/client/TestRemoteTable.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java


 Cut the link between the client and the zookeeper ensemble
 --

 Key: HBASE-5399
 URL: https://issues.apache.org/jira/browse/HBASE-5399
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 
 5399.v40.patch, 5399.v41.patch, 5399.v42.patch, 5399.v42.patch, 
 5399.v42.patch, 5399.v42.patch, 5399_inprogress.patch, 
 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 
 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 
 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 
 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 
 5399_inprogress.v9.patch, nochange.patch


 The link is often considered an issue, for various reasons, one of them 
 being that there is a limit on the number of connections that ZK can manage. 
 Stack also suggested removing the link to master from HConnection.
 There are choices to be made considering the existing API (that we don't want 
 to break).
 The first patches I will submit on hadoop-qa should not be committed: they 
 are here to show the progress on the direction taken.
 ZooKeeper is used for:
 - public getter, to let the client do whatever he wants, and close ZooKeeper 
 when closing the connection => we have to deprecate this but keep it.
 - read get master address to create a master => now done with a temporary 
 zookeeper connection
 - read root location => now done with a temporary zookeeper connection, but 
 questionable. Used in public function locateRegion. To be reworked.
 - read cluster id => now done once with a temporary zookeeper connection.
 - check if base done is available => now done once with a zookeeper 
 connection given as a parameter
 - isTableDisabled/isTableAvailable => public functions, now done with a 
 temporary zookeeper connection.
  - Called internally from HBaseAdmin and HTable
 - getCurrentNrHRS(): public function to get the number of region servers and 
 create a pool of thread => now done with a

[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()

[
https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227985#comment-13227985
]

Hadoop QA commented on HBASE-5542:
--

-1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12518084/HBASE-5542.D2217.9.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 4 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 159 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.regionserver.TestAtomicOperation

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1168//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1168//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1168//console

This message is automatically generated.

Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()

Key: HBASE-5542
URL: https://issues.apache.org/jira/browse/HBASE-5542
Project: HBase
Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
Fix For: 0.96.0

[jira] [Updated] (HBASE-5565) Refactoring doMiniBatchPut()

2012-03-12 Thread Lars Hofhansl (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5565:
-

Fix Version/s: 0.96.0

 Refactoring doMiniBatchPut()
 

 Key: HBASE-5565
 URL: https://issues.apache.org/jira/browse/HBASE-5565
 Project: HBase
  Issue Type: Improvement
Reporter: Scott Chen
Assignee: Lars Hofhansl
 Fix For: 0.96.0


 HBASE-5542 introduces a RowProcessor interface that can handle multi-row
 transactions.
 The code will be cleaner if doMiniBatchPut() can also be covered by
 RowProcessor.
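
As a rough illustration of an atomic read-modify-write across several rows, the sketch below locks every involved row in sorted key order, hands the current values to a callback, and writes the results back before unlocking. All names here (RowProcessorSketch, Processor, processRows) are hypothetical and are not the HBase RowProcessor API; an in-memory map stands in for a region.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; this is not the HBase RowProcessor API.
final class RowProcessorSketch {
  /** Callback that reads and modifies the values of the locked rows. */
  interface Processor {
    void process(Map<String, Long> rows);
  }

  // In-memory stand-ins for a region's data and its per-row locks.
  private final Map<String, Long> store = new ConcurrentHashMap<>();
  private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

  void put(String row, long value) { store.put(row, value); }

  long get(String row) { return store.getOrDefault(row, 0L); }

  /** Locks all rows, runs the processor on their values, then writes them back. */
  void processRows(Collection<String> rowKeys, Processor processor) {
    // Sorting gives every caller the same lock order, which avoids deadlock
    // between two calls that touch overlapping rows.
    TreeSet<String> sorted = new TreeSet<>(rowKeys);
    List<ReentrantLock> held = new ArrayList<>();
    try {
      for (String key : sorted) {
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        held.add(lock);
      }
      // Read phase: copy the current values of the locked rows (missing rows read as 0).
      Map<String, Long> rows = new HashMap<>();
      for (String key : sorted) {
        rows.put(key, store.getOrDefault(key, 0L));
      }
      // Modify phase: the callback sees and changes only the copy.
      processor.process(rows);
      // Write phase: only the rows that were locked are written back.
      for (String key : sorted) {
        Long value = rows.get(key);
        if (value != null) {
          store.put(key, value);
        }
      }
    } finally {
      for (ReentrantLock lock : held) {
        lock.unlock();
      }
    }
  }
}
```

A caller could, for example, move a value from row "a" to row "b" inside one processRows call; no other processRows call touching either row can interleave with it.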


[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()

[
https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227988#comment-13227988
]

Phabricator commented on HBASE-5542:

sc has commented on the revision HBASE-5542 [jira] Unify
HRegion.mutateRowsWithLocks() and HRegion.processRow().

@Lars: I agree. The internal calls should never create threads for this. I
will make the change.

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:485 The
timeout is set here.

REVISION DETAIL
https://reviews.facebook.net/D2217

Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()

Key: HBASE-5542
URL: https://issues.apache.org/jira/browse/HBASE-5542
Project: HBase
Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
Fix For: 0.96.0

[jira] [Reopened] (HBASE-4542) add filter info to slow query logging

2012-03-12 Thread Lars Hofhansl (Reopened) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-4542:
--


Reopening so I won't forget about the 0.94 part.

 add filter info to slow query logging
 -

 Key: HBASE-4542
 URL: https://issues.apache.org/jira/browse/HBASE-4542
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.89.20100924
Reporter: Kannan Muthukkaruppan
Assignee: Madhuwanti Vaidya
 Fix For: 0.94.0, 0.96.0

 Attachments: 
 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, 
 Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, 
 D1263.2.patch, D1539.1.patch


 Slow query log doesn't report filters in effect.
 For example:
 {code}
 (operationTooSlow): \
 {processingtimems:3468,client:10.138.43.206:40035,timeRange: 
 [0,9223372036854775807],\
 starttimems:1317772005821,responsesize:42411, \
 class:HRegionServer,table:myTable,families:{CF1:[ALL]},\
 row:6c3b8efa132f0219b7621ed1e5c8c70b,queuetimems:0,\
 method:get,totalColumns:1,maxVersions:1,storeLimit:-1}
 {code}
 The above would suggest that all columns of myTable:CF1 are being requested
 for the given row. But in reality there could be filters in effect (such as
 ColumnPrefixFilter, ColumnRangeFilter, TimestampsFilter, etc.). We should
 enhance the slow query log to capture and report this information.
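
One way to picture the request is a log entry that carries an extra field naming the filters in effect. The sketch below is only an assumption about what such an entry could look like; the class, the method, and the "filter" key are hypothetical and are not taken from the HBase code or from the attached patches.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; class, method, and key names are assumptions, not HBase code.
final class SlowQueryLogSketch {
  /** Builds the fields of a slow-query log entry for a get, including its filters. */
  static Map<String, Object> describeGet(String table, String row, List<String> filters) {
    // LinkedHashMap keeps the fields in insertion order when the entry is printed.
    Map<String, Object> entry = new LinkedHashMap<>();
    entry.put("table", table);
    entry.put("row", row);
    entry.put("method", "get");
    // Naming the filters keeps "families: ALL" from being read as a full-row fetch.
    entry.put("filter", filters.isEmpty() ? "none" : String.join(",", filters));
    return entry;
  }
}
```

With this shape, a reader of the log can tell a genuinely unfiltered get apart from one narrowed by, say, a ColumnPrefixFilter.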


[jira] [Commented] (HBASE-4542) add filter info to slow query logging


[ 
https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227991#comment-13227991
 ] 

Lars Hofhansl commented on HBASE-4542:
--

Please... Do not mark an issue as fixed if it is targeted to multiple versions 
but not all versions are committed. Pretty please? :)

 add filter info to slow query logging
 -

 Key: HBASE-4542
 URL: https://issues.apache.org/jira/browse/HBASE-4542
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.89.20100924
Reporter: Kannan Muthukkaruppan
Assignee: Madhuwanti Vaidya
 Fix For: 0.94.0, 0.96.0

 Attachments: 
 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, 
 Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, 
 D1263.2.patch, D1539.1.patch




[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble


[ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227997#comment-13227997
 ] 

Ted Yu commented on HBASE-5399:
---

TestAtomicOperation failed in latest TRUNK build:
https://builds.apache.org/job/HBase-TRUNK/2676/testReport/org.apache.hadoop.hbase.regionserver/TestAtomicOperation/testMultiRowMutationMultiThreads/

A similar failure shows up in the latest Hadoop QA run for HBASE-5542.

 Cut the link between the client and the zookeeper ensemble
 --

 Key: HBASE-5399
 URL: https://issues.apache.org/jira/browse/HBASE-5399
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 
 5399.v40.patch, 5399.v41.patch, 5399.v42.patch, 5399.v42.patch, 
 5399.v42.patch, 5399.v42.patch, 5399_inprogress.patch, 
 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 
 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 
 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 
 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 
 5399_inprogress.v9.patch, nochange.patch


 The link is often considered an issue for various reasons, one being that
 there is a limit on the number of connections that ZK can manage.
 Stack also suggested removing the link to the master from HConnection.
 There are choices to be made considering the existing API (that we don't want
 to break).
 The first patches I will submit on hadoop-qa should not be committed: they
 are here to show the progress on the direction taken.
 ZooKeeper is used for:
 - public getter, to let the client do whatever it wants, and close ZooKeeper
 when closing the connection => we have to deprecate this but keep it.
 - read the master address to create a master => now done with a temporary
 zookeeper connection
 - read root location => now done with a temporary zookeeper connection, but
 questionable. Used in public function locateRegion. To be reworked.
 - read cluster id => now done once with a temporary zookeeper connection.
 - check if base node is available => now done once with a zookeeper
 connection given as a parameter
 - isTableDisabled/isTableAvailable => public functions, now done with a
 temporary zookeeper connection.
  - Called internally from HBaseAdmin and HTable
 - getCurrentNrHRS(): public function to get the number of region servers and
 create a pool of threads => now done with a temporary zookeeper connection
 Master is used for:
 - getMaster public getter, as for ZooKeeper => we have to deprecate this but
 keep it.
 - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin
 - getHTableDescriptor*: public functions offering access to the master. =>
 we could make them use a temporary master connection as well.
 Main points are:
 - the hbase class for ZooKeeper, ZooKeeperWatcher, is really designed for a
 strongly coupled architecture ;-). This can be changed, but requires a lot of
 modifications in these classes (likely adding a class in the middle of the
 hierarchy, something like that). Anyway, a non-connected client will always
 be much slower, because it uses a tcp connection, and establishing a tcp
 connection is slow.
 - having a link between ZK and all the clients seems to make sense for some
 use cases. However, it won't scale if a TCP connection is required for every
 client
 - if we move the table descriptor part away from the client, we need to find
 a new place for it.
 - we will have the same issue with HBaseAdmin (for both ZK & Master); maybe
 we can put a timeout on the connection. That would make the whole system less
 deterministic, however.
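
The "temporary zookeeper connection" pattern used throughout the list above can be sketched as: open a session, perform one read, and always close the session afterwards. The sketch below is a minimal illustration under stated assumptions. FakeZkSession and withTemporarySession are hypothetical names, and the session is an in-memory stand-in rather than the real ZooKeeper client.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch; none of these types exist in HBase or ZooKeeper.
final class TemporaryConnectionSketch {
  /** In-memory stand-in for a session with the coordination service. */
  static final class FakeZkSession implements AutoCloseable {
    // Counts sessions that are currently open, so leaks are easy to spot.
    static final AtomicInteger OPEN = new AtomicInteger();

    FakeZkSession() { OPEN.incrementAndGet(); }

    String read(String path) { return "value-of:" + path; }

    @Override
    public void close() { OPEN.decrementAndGet(); }
  }

  /** Opens a session for a single action and closes it even if the action throws. */
  static <T> T withTemporarySession(Function<FakeZkSession, T> action) {
    try (FakeZkSession session = new FakeZkSession()) {
      return action.apply(session);
    }
  }
}
```

The trade-off noted in the description still applies: each call pays the cost of establishing a connection, in exchange for not holding one open per client.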


[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble


[ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13228009#comment-13228009
 ] 

Ted Yu commented on HBASE-5399:
---

From test output:
{code}
Exception in thread Thread-211 junit.framework.AssertionFailedError
	at junit.framework.Assert.fail(Assert.java:48)
	at junit.framework.Assert.fail(Assert.java:56)
	at org.apache.hadoop.hbase.regionserver.TestAtomicOperation$2.run(TestAtomicOperation.java:392)
{code}
Here is related code in test:
{code}
if (r.size() != 1) {
  LOG.debug(r);
  failures.incrementAndGet();
  fail();
}
{code}
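
For context on the snippet above: an assertion error thrown inside a worker thread ends that thread (hence the "Exception in thread" output) but does not propagate to the thread that started it, which is presumably why the test also bumps a failures counter. The following is a minimal sketch of that counting pattern; WorkerFailureSketch and runWorkers are hypothetical names and this is not the TestAtomicOperation code.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntPredicate;

// Hypothetical sketch of collecting worker-thread failures; not the HBase test code.
final class WorkerFailureSketch {
  /** Runs one check per worker thread and returns how many checks failed. */
  static int runWorkers(int workers, IntPredicate check) {
    AtomicInteger failures = new AtomicInteger();
    Thread[] threads = new Thread[workers];
    for (int i = 0; i < workers; i++) {
      final int id = i;
      threads[i] = new Thread(() -> {
        // Record the failure in shared state; throwing here would only end this thread.
        if (!check.test(id)) {
          failures.incrementAndGet();
        }
      });
      threads[i].start();
    }
    try {
      // Wait for every worker so the count is final before it is read.
      for (Thread t : threads) {
        t.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for workers", e);
    }
    return failures.get();
  }
}
```

The calling thread can then assert that the returned count is zero, so a failure in any worker fails the test itself.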

 Cut the link between the client and the zookeeper ensemble
 --

 Key: HBASE-5399
 URL: https://issues.apache.org/jira/browse/HBASE-5399
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 
 5399.v40.patch, 5399.v41.patch, 5399.v42.patch, 5399.v42.patch, 
 5399.v42.patch, 5399.v42.patch, 5399_inprogress.patch, 
 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 
 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 
 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 
 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 
 5399_inprogress.v9.patch, nochange.patch




[jira] [Commented] (HBASE-5567) test-patch.sh has logic error in findbugs check