[jira] [Commented] (HBASE-5529) MR test failures because MALLOC_ARENA_MAX is not set
[ https://issues.apache.org/jira/browse/HBASE-5529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223076#comment-13223076 ] Hadoop QA commented on HBASE-5529:
--
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12517201/HBASE-5529-trunk.patch against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
-1 javadoc. The javadoc tool appears to have generated -129 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 154 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.mapreduce.TestImportTsv
org.apache.hadoop.hbase.mapred.TestTableMapReduce
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
org.apache.hadoop.hbase.master.TestSplitLogManager
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1112//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1112//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1112//console
This message is automatically generated.
MR test failures because MALLOC_ARENA_MAX is not set
--
Key: HBASE-5529
URL: https://issues.apache.org/jira/browse/HBASE-5529
Project: HBase
Issue Type: Bug
Components: mapreduce, test
Affects Versions: 0.92.0
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
Attachments: HBASE-5529-to92.patch, HBASE-5529-trunk.patch

When running unit tests on CentOS 6 I get a bunch of unit test failures in mapreduce-related tests due to:
2012-03-03 00:14:18,776 WARN [Container Monitor] monitor.ContainersMonitorImpl$MonitoringThread(436): Container [pid=21446,containerID=container_1330762435849_0002_01_01] is running beyond virtual memory limits. Current usage: 223.1mb of 2.0gb physical memory used; 6.9gb of 4.2gb virtual memory used. Killing container.
Note: this also came up in the mapreduce project. See: https://issues.apache.org/jira/browse/MAPREDUCE-3933
Patch coming shortly

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
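The workaround, as in MAPREDUCE-3933, is to cap the number of glibc malloc arenas so the container's virtual-memory footprint stays under the NodeManager's vmem limit. A minimal sketch of the environment change (the file placement and the value 4 are assumptions; adjust for your setup):

```shell
# glibc >= 2.10 (as on CentOS 6) creates up to 8 * ncores malloc arenas per
# process, which inflates *virtual* memory far beyond what the vmem check
# expects. Capping the arena count keeps the vmem footprint sane.
# Hypothetical placement: conf/hbase-env.sh or the MR task launch environment.
export MALLOC_ARENA_MAX=4
```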
[jira] [Created] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
--
Key: HBASE-5531
URL: https://issues.apache.org/jira/browse/HBASE-5531
Project: HBase
Issue Type: Bug
Components: build
Affects Versions: 0.92.2
Reporter: Laxman

The current profile still points to 0.23.1-SNAPSHOT. The build fails because 0.23.1 has already been released and its snapshot is no longer available. We can update this to 0.23.2-SNAPSHOT.
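The fix amounts to bumping the Hadoop version that the 0.23 build profile resolves. A hedged sketch of the pom.xml change (the profile id and property name are assumptions based on HBase builds of that era, not copied from the attached patch):

```xml
<!-- Sketch only: point the hadoop-0.23 profile at the still-published
     snapshot. Profile/property names here are assumed, not verified. -->
<profile>
  <id>hadoop-0.23</id>
  <properties>
    <hadoop.version>0.23.2-SNAPSHOT</hadoop.version>
  </properties>
</profile>
```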
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223089#comment-13223089 ] nkeywal commented on HBASE-5399:
TestRegionRebalancing: seems to be a flaky test. Will retry on Hadoop-QA, but I don't reproduce it here. With the 7s sleep (i.e. the same sleep as before), I don't reproduce it. I will try to understand why this sleep changes the result, but in any case it's not a regression. So I think this patch is a good candidate for a commit. Further enhancements (clusterId, ZK watcher replacement by simple calls) could be put in another JIRA.

Cut the link between the client and the zookeeper ensemble
--
Key: HBASE-5399
URL: https://issues.apache.org/jira/browse/HBASE-5399
Project: HBase
Issue Type: Improvement
Components: client
Affects Versions: 0.94.0
Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch

The link is often considered an issue, for various reasons. One of them is that there is a limit on the number of connections that ZK can manage. Stack was suggesting as well to remove the link to the master from HConnection. There are choices to be made considering the existing API (which we don't want to break). The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken.

ZooKeeper is used for:
- public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection => we have to deprecate this but keep it.
- read the master address to create a master => now done with a temporary zookeeper connection
- read the root location => now done with a temporary zookeeper connection, but questionable. Used in the public function locateRegion. To be reworked.
- read the cluster id => now done once with a temporary zookeeper connection.
- check if the base node is available => now done once with a zookeeper connection given as a parameter
- isTableDisabled/isTableAvailable => public functions, now done with a temporary zookeeper connection. Called internally from HBaseAdmin and HTable
- getCurrentNrHRS(): public function to get the number of region servers and create a pool of threads => now done with a temporary zookeeper connection

Master is used for:
- getMaster: public getter, as for ZooKeeper => we have to deprecate this but keep it.
- isMasterRunning(): public function, used internally by HMerge and HBaseAdmin
- getHTableDescriptor*: public functions offering access to the master => we could make them use a temporary master connection as well.

Main points are:
- hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, a non-connected client will always be really slower, because it's a tcp connection, and establishing a tcp connection is slow.
- having a link between ZK and all the clients seems to make sense for some use cases. However, it won't scale if a TCP connection is required for every client
- if we move the table descriptor part away from the client, we need to find a new place for it.
- we will have the same issue with HBaseAdmin (for both ZK and Master); maybe we can put a timeout on the connection. That would make the whole system less deterministic however.
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223092#comment-13223092 ] Hadoop QA commented on HBASE-5074:
--
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12517204/D1521.13.patch against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 55 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -125 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 158 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.coprocessor.TestMasterObserver
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
org.apache.hadoop.hbase.mapred.TestTableMapReduce
org.apache.hadoop.hbase.mapreduce.TestImportTsv
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1113//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1113//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1113//console
This message is automatically generated.
support checksums in HBase block cache
--
Key: HBASE-5074
URL: https://issues.apache.org/jira/browse/HBASE-5074
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Fix For: 0.94.0
Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, D1521.13.patch, D1521.13.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch

The current implementation of HDFS stores the data in one block file and the metadata (checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the data file and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage hardware offers.
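The idea above can be illustrated in a few lines: if the checksum is stored inline with the block, a single read fetches both data and checksum, halving the iops versus HDFS's separate checksum file. This is only a sketch of the principle, not the actual patch (which works at the HFile block level); the class and method names are invented:

```java
// Illustrative sketch of inline checksums: append a CRC32 to the payload so
// one disk read returns both data and checksum. Names here are hypothetical.
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class InlineChecksumBlock {

    /** Append a CRC32 of the payload so data + checksum travel in one read. */
    public static byte[] withChecksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        ByteBuffer out = ByteBuffer.allocate(data.length + 8);
        out.put(data);
        out.putLong(crc.getValue());
        return out.array();
    }

    /** Verify a block fetched in a single iop; no second seek needed. */
    public static boolean verify(byte[] block) {
        ByteBuffer in = ByteBuffer.wrap(block);
        byte[] data = new byte[block.length - 8];
        in.get(data);
        long stored = in.getLong();
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue() == stored;
    }

    public static void main(String[] args) {
        byte[] block = withChecksum("hbase-block".getBytes());
        System.out.println(verify(block)); // prints true
        block[0] ^= 1;                     // corrupt one byte
        System.out.println(verify(block)); // prints false
    }
}
```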
[jira] [Created] (HBASE-5532) get NPE during MajorCompactionChecker
get NPE during MajorCompactionChecker
--
Key: HBASE-5532
URL: https://issues.apache.org/jira/browse/HBASE-5532
Project: HBase
Issue Type: Bug
Components: regionserver
Reporter: terry zhang

We found the error log (NullPointerException) below on our online cluster:
2012-03-05 00:17:09,592 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker: Caught exception
java.lang.NullPointerException
at org.apache.hadoop.hbase.regionserver.Store.isMajorCompaction(Store.java:878)
at org.apache.hadoop.hbase.regionserver.Store.isMajorCompaction(Store.java:857)
at org.apache.hadoop.hbase.regionserver.HRegion.isMajorCompaction(HRegion.java:3017)
at org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker.chore(HRegionServer.java:1172)
at org.apache.hadoop.hbase.Chore.run(Chore.java:66)

After checking the code, we found that although isMajorCompaction() already checks whether the store files have a null reader at the beginning of the function, there is still some possibility that the reader is closed before it returns (e.g. by a minor compaction). So we need to check the store file reader before we use it to avoid this NPE.
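The defensive pattern the reporter proposes can be sketched as follows: take one snapshot of the reader reference and re-check it at each use, since a concurrent compaction can close the file between the initial check and the access. The classes below are invented stand-ins, not HBase's real Store/StoreFile types:

```java
// Hypothetical sketch of the NPE fix: guard every use of a store file's
// reader, not just the first check, because a concurrent compaction may
// close the file (nulling its reader) in between.
import java.util.ArrayList;
import java.util.List;

public class MajorCompactionCheck {

    /** Stand-in for StoreFile.Reader. */
    static class Reader {
        final long entries;
        Reader(long entries) { this.entries = entries; }
    }

    /** Stand-in for StoreFile; a close() elsewhere sets reader to null. */
    static class StoreFile {
        volatile Reader reader;
        StoreFile(Reader r) { this.reader = r; }
        Reader getReader() { return reader; }
    }

    /** Sum entries, skipping files whose reader was closed under us. */
    public static long countEntries(List<StoreFile> files) {
        long total = 0;
        for (StoreFile f : files) {
            Reader r = f.getReader(); // one snapshot of the volatile field
            if (r == null) {
                continue;             // closed concurrently: skip, no NPE
            }
            total += r.entries;
        }
        return total;
    }

    public static void main(String[] args) {
        List<StoreFile> files = new ArrayList<>();
        files.add(new StoreFile(new Reader(10)));
        files.add(new StoreFile(null)); // simulates a mid-check close
        System.out.println(countEntries(files)); // prints 10
    }
}
```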
[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL
[ https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223114#comment-13223114 ] Ramkrishna.S.Vasudevan commented on HBASE-5010:
---
Why are HBASE-5510 updates coming in this JIRA?
Regards
Ram

Filter HFiles based on TTL
--
Key: HBASE-5010
URL: https://issues.apache.org/jira/browse/HBASE-5010
Project: HBase
Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Fix For: 0.94.0
Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, D909.2.patch, D909.3.patch, D909.4.patch, D909.5.patch, D909.6.patch

In ScanWildcardColumnTracker we have
{code:java}
this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
...
private boolean isExpired(long timestamp) {
  return timestamp < oldestStamp;
}
{code}
but this time range filtering does not participate in HFile selection. In one real case this caused next() calls to time out because all KVs in a table got expired, but next() had to iterate over the whole table to find that out. We should be able to filter out those HFiles right away. I think a reasonable approach is to add a default timerange filter to every scan for a CF with a finite TTL and utilize the existing filtering in StoreFile.Reader.passesTimerangeFilter.
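The proposed approach — skip whole HFiles whose newest timestamp already falls outside the TTL window — can be sketched as below. The HFileInfo type and maxTimestamp field are invented for illustration; the real change would go through StoreFile.Reader.passesTimerangeFilter:

```java
// Illustrative sketch: prune store files whose newest cell is already
// expired, so a scan never opens them. Types/fields are hypothetical.
import java.util.ArrayList;
import java.util.List;

public class TtlFileFilter {

    static class HFileInfo {
        final String name;
        final long maxTimestamp; // newest cell timestamp recorded in the file
        HFileInfo(String name, long maxTimestamp) {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
        }
    }

    /** Keep only files that can still contain unexpired cells. */
    public static List<HFileInfo> selectFiles(List<HFileInfo> files, long now, long ttl) {
        // Same cutoff as ScanWildcardColumnTracker: anything older is expired.
        long oldestStamp = now - ttl;
        List<HFileInfo> live = new ArrayList<>();
        for (HFileInfo f : files) {
            if (f.maxTimestamp >= oldestStamp) {
                live.add(f); // file may still hold live data
            }
        }
        return live;
    }

    public static void main(String[] args) {
        List<HFileInfo> files = new ArrayList<>();
        files.add(new HFileInfo("old.hfile", 1_000L));
        files.add(new HFileInfo("new.hfile", 9_000L));
        // now=10000, ttl=5000 => cutoff 5000: only new.hfile survives
        System.out.println(selectFiles(files, 10_000L, 5_000L).size()); // prints 1
    }
}
```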
[jira] [Updated] (HBASE-5010) Filter HFiles based on TTL
[ https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5010:
--
Comment: was deleted (was: Integrated in HBase-TRUNK-security #129 (See [https://builds.apache.org/job/HBase-TRUNK-security/129/]) HBASE-5010 Pass region info in LoadBalancer.randomAssignment(List<ServerName> servers) (Anoop Sam John) (Revision 1297155) Result = FAILURE tedyu : Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/DefaultLoadBalancer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java )
[jira] [Updated] (HBASE-5010) Filter HFiles based on TTL
[ https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5010:
--
Comment: was deleted (was: Integrated in HBase-TRUNK #2672 (See [https://builds.apache.org/job/HBase-TRUNK/2672/]) HBASE-5010 Pass region info in LoadBalancer.randomAssignment(List<ServerName> servers) (Anoop Sam John) (Revision 1297155) Result = SUCCESS tedyu : Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/DefaultLoadBalancer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java )
[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-5399:
---
Status: Open (was: Patch Available)
Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399.v40.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch
[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-5399:
---
Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-5399:
---
Attachment: 5399.v40.patch
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223138#comment-13223138 ] Hadoop QA commented on HBASE-5399: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517212/5399.v40.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 30 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -129 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 155 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1114//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1114//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1114//console This message is automatically generated. Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.94.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399.v40.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch The link is often considered as an issue, for various reasons. 
One of them being that there is a limit on the number of connection that ZK can manage. Stack was suggesting as well to remove the link to master from HConnection. There are choices to be made considering the existing API (that we don't want to break). The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken. ZooKeeper is used for: - public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection = we have to deprecate this but keep it. - read get master address to create a master = now done with a temporary zookeeper connection - read root location = now done with a temporary zookeeper connection, but questionable. Used in public function locateRegion. To be reworked. - read cluster id = now done once with a temporary zookeeper connection. - check if base done is available = now done once with a zookeeper connection given as a parameter - isTableDisabled/isTableAvailable = public functions, now done with a temporary zookeeper connection. - Called internally from HBaseAdmin and HTable - getCurrentNrHRS(): public function to get the number of region servers and create a pool of thread = now done with a temporary zookeeper connection - Master is used for: - getMaster public getter, as for ZooKeeper = we have to deprecate this but keep it. - isMasterRunning(): public function, used internally by HMerge HBaseAdmin - getHTableDescriptor*: public functions offering access to the master. = we could make them using a temporary master connection as well. Main points are: - hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, non connected client will always be really slower, because it's a tcp connection, and establishing a tcp connection is slow. 
- having a link between ZK and all the clients seems to make sense for some use cases. However, it won't scale if a TCP connection is required for every client
- if we move the table descriptor part away from the client, we need to find a new place for it
- we will have the same issue with HBaseAdmin (for both ZK and Master); maybe we can put a timeout on the connection. That would make the whole system less deterministic, however.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see:
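The "temporary zookeeper connection" pattern that the description above keeps referring to can be sketched as follows. This is an illustration only: `TransientZk`, `readOnce`, and the returned value are invented names, not the HBase or ZooKeeper API. The point is the life cycle — connect, read the one value needed (master address, cluster id, root location), close — so no long-lived session is held against the ensemble's connection limit, at the cost of paying the TCP-connection setup on each lookup.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a short-lived ZooKeeper session; only the
// open/read/close life cycle matters for this sketch.
class TransientZk implements AutoCloseable {
    static final AtomicInteger OPEN_SESSIONS = new AtomicInteger();
    TransientZk() { OPEN_SESSIONS.incrementAndGet(); }        // "connect"
    String read(String znode) { return "value-of-" + znode; } // e.g. master address
    @Override public void close() { OPEN_SESSIONS.decrementAndGet(); }
}

class TempConnectionSketch {
    // Each lookup pays the connection cost but holds no session afterwards.
    static String readOnce(String znode) {
        try (TransientZk zk = new TransientZk()) {
            return zk.read(znode);
        }
    }
}
```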
[jira] [Updated] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laxman updated HBASE-5531: -- Attachment: HBASE-5531-trunk.patch Patch for trunk Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Attachments: HBASE-5531-trunk.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
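The fix described above amounts to bumping the Hadoop version referenced by the 23 profile in pom.xml. A sketch of the change follows; the profile id and property name are illustrative and the real pom.xml may differ:

```xml
<!-- Illustrative only: profile/property names may differ from the actual pom.xml -->
<profile>
  <id>hadoop-0.23</id>
  <properties>
    <!-- was 0.23.1-SNAPSHOT; 0.23.1 has been released, so its snapshot is gone -->
    <hadoop.version>0.23.2-SNAPSHOT</hadoop.version>
  </properties>
</profile>
```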
[jira] [Updated] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laxman updated HBASE-5531: -- Fix Version/s: 0.96.0 0.92.2 Status: Patch Available (was: Open) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laxman updated HBASE-5531: -- Attachment: HBASE-5531.patch Patch for 0.92 Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223190#comment-13223190 ] Hadoop QA commented on HBASE-5531: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517222/HBASE-5531.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 javadoc. The javadoc tool appears to have generated -129 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 154 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.io.hfile.TestLruBlockCache org.apache.hadoop.hbase.replication.TestReplication org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1115//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1115//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1115//console This message is automatically generated. 
Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223223#comment-13223223 ] Laxman commented on HBASE-5531: --- This patch involves build xml (pom.xml) changes only. Above -1s are irrelevant to the changes. Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223284#comment-13223284 ] Zhihong Yu commented on HBASE-5531: --- +1 on patch. Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5516) GZip leading to memory leak in 0.90. Fix similar to HBASE-5387 needed for 0.90.
[ https://issues.apache.org/jira/browse/HBASE-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laxman updated HBASE-5516: -- Attachment: HBASE-5516_2_0.90.patch Please review the patch and share your comments. GZip leading to memory leak in 0.90. Fix similar to HBASE-5387 needed for 0.90. Key: HBASE-5516 URL: https://issues.apache.org/jira/browse/HBASE-5516 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.90.7 Attachments: HBASE-5516_2_0.90.patch Usage of GZip is leading to resident memory leak in 0.90. We need to have something similar to HBASE-5387 in 0.90. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5532) get NPE during MajorCompactionChecker
[ https://issues.apache.org/jira/browse/HBASE-5532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223291#comment-13223291 ] Zhihong Yu commented on HBASE-5532: --- Which version of HBase were you using? get NPE during MajorCompactionChecker -- Key: HBASE-5532 URL: https://issues.apache.org/jira/browse/HBASE-5532 Project: HBase Issue Type: Bug Components: regionserver Reporter: terry zhang Attachments: HBASE-5532.patch We found the error log (NullPointerException) below on our online cluster:
2012-03-05 00:17:09,592 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker: Caught exception
java.lang.NullPointerException
        at org.apache.hadoop.hbase.regionserver.Store.isMajorCompaction(Store.java:878)
        at org.apache.hadoop.hbase.regionserver.Store.isMajorCompaction(Store.java:857)
        at org.apache.hadoop.hbase.regionserver.HRegion.isMajorCompaction(HRegion.java:3017)
        at org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker.chore(HRegionServer.java:1172)
        at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
After checking the code we found that although isMajorCompaction already checks at the beginning of the function whether the store files have a null reader, there is still some possibility that the reader is closed before the function returns (e.g. by a minor compaction). So we need to check the store file reader before we use it, to avoid this NPE. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
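The race described in HBASE-5532 above (the reader is null-checked at the top of isMajorCompaction but closed by a concurrent compaction before it is used) can be sketched generically. All class and method names below are hypothetical stand-ins, not the actual Store/StoreFile code; the point is to re-fetch and re-check the reader at each use rather than trusting an earlier null check:

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a store file whose reader can vanish at any time.
class SketchStoreFile {
    private volatile SketchReader reader;
    SketchStoreFile(SketchReader r) { this.reader = r; }
    SketchReader getReader() { return reader; }
    void closeReader() { reader = null; } // what a concurrent compaction may do
}

class SketchReader {
    final long entries;
    SketchReader(long entries) { this.entries = entries; }
}

class MajorCompactionCheckSketch {
    // Re-check the reader at each use; a single up-front null check is not enough.
    static Optional<Long> totalEntries(List<SketchStoreFile> files) {
        long sum = 0;
        for (SketchStoreFile f : files) {
            SketchReader r = f.getReader(); // grab once per use
            if (r == null) {
                return Optional.empty();    // bail out instead of hitting an NPE
            }
            sum += r.entries;
        }
        return Optional.of(sum);
    }
}
```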
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223312#comment-13223312 ] ramkrishna.s.vasudevan commented on HBASE-5531: --- @Ted If ok i can commit this today? Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223323#comment-13223323 ] Zhihong Yu commented on HBASE-5531: --- I think so. Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time
[ https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223343#comment-13223343 ] ramkrishna.s.vasudevan commented on HBASE-5494: --- Is this similar to https://issues.apache.org/jira/browse/HBASE-5373 ? Introduce a zk hosted table-wide read/write lock so only one table operation at a time -- Key: HBASE-5494 URL: https://issues.apache.org/jira/browse/HBASE-5494 Project: HBase Issue Type: Improvement Reporter: stack I saw this facility over in the accumulo code base. Currently we just try to sort out the mess when splits come in during an online schema edit; somehow we figure we can figure all possible region transition combinations and make the right call. We could try and narrow the number of combinations by taking out a zk table lock when doing table operations. For example, on split or merge, we could take a read-only lock meaning the table can't be disabled while these are running. We could then take a write only lock if we want to ensure the table doesn't change while disabling or enabling process is happening. Shouldn't be too hard to add. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
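The semantics proposed above (splits and merges share a read lock; disable/enable needs the exclusive write lock) are exactly those of a standard read/write lock. A minimal in-process sketch with java.util.concurrent follows; the actual proposal hosts the lock in ZooKeeper so it spans processes, which this sketch does not attempt:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class TableLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Split/merge: many may run together, but none while the table is being disabled.
    boolean tryStartSplit() { return lock.readLock().tryLock(); }
    void finishSplit()      { lock.readLock().unlock(); }

    // Disable/enable: exclusive; fails fast here while any split/merge is running.
    boolean tryStartDisable() { return lock.writeLock().tryLock(); }
    void finishDisable()      { lock.writeLock().unlock(); }
}
```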
[jira] [Resolved] (HBASE-5373) Table level lock to prevent the race of multiple table level operation
[ https://issues.apache.org/jira/browse/HBASE-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-5373. -- Resolution: Duplicate Resolving as duplicate. Liyin, you did it first so I should be resolving HBASE-5494 as a duplicate of this but HBASE-5494 has a little bit more going on. Hope you don't mind. Are you working on this? Table level lock to prevent the race of multiple table level operation -- Key: HBASE-5373 URL: https://issues.apache.org/jira/browse/HBASE-5373 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang A table level lock can guarantee that only one table operation would happen at one time for each table. The master should acquire and release these table locks correctly during the failover time. One proposal is to keep track of the lock and its corresponding operation in the zookeeper. If there is a master failover, the secondary should have a way to check whether these operations succeeded or not before releasing the lock. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time
[ https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223350#comment-13223350 ] stack commented on HBASE-5494: -- @Ram Yes sir. Thanks. Resolved hbase-5373 as duplicate of this. Introduce a zk hosted table-wide read/write lock so only one table operation at a time -- Key: HBASE-5494 URL: https://issues.apache.org/jira/browse/HBASE-5494 Project: HBase Issue Type: Improvement Reporter: stack I saw this facility over in the accumulo code base. Currently we just try to sort out the mess when splits come in during an online schema edit; somehow we figure we can figure all possible region transition combinations and make the right call. We could try and narrow the number of combinations by taking out a zk table lock when doing table operations. For example, on split or merge, we could take a read-only lock meaning the table can't be disabled while these are running. We could then take a write only lock if we want to ensure the table doesn't change while disabling or enabling process is happening. Shouldn't be too hard to add. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223379#comment-13223379 ] ramkrishna.s.vasudevan commented on HBASE-5531: --- To be committed to trunk, 0.94 and 0.92? Correct me if I am wrong. Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223401#comment-13223401 ] stack commented on HBASE-5531: -- +1 on patch and +1 on commit to all of the branches cited above. Thanks Ram. Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223422#comment-13223422 ] Phabricator commented on HBASE-5515: sc has commented on the revision HBASE-5515 [jira] Add a processRow API that supports atomic multiple reads and writes on a row. Thanks for the review, guys. I will update this soon. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowEndpoint.java:41 I see. Now I understand what you mean. Yes, there should really be only one class. I will make the change. Thanks! src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java:44 That looks better. I will make the change. src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:4791 Sorry for the mistake. I will fix it. REVISION DETAIL https://reviews.facebook.net/D2067 Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It will be nice to have a plugable API for this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5436) Right-size the map when reading attributes.
[ https://issues.apache.org/jira/browse/HBASE-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223446#comment-13223446 ] stack commented on HBASE-5436: -- Please commit to both Lars. Thanks. Right-size the map when reading attributes. --- Key: HBASE-5436 URL: https://issues.apache.org/jira/browse/HBASE-5436 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Benoit Sigoure Assignee: Benoit Sigoure Priority: Trivial Labels: performance Fix For: 0.94.0 Attachments: 0001-Right-size-the-map-when-reading-attributes.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223468#comment-13223468 ] ramkrishna.s.vasudevan commented on HBASE-5531: --- Committed to 0.92, 0.94 and trunk. Thanks for the patch Laxman. Thanks for the review Stack and Zhihong. Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223469#comment-13223469 ] ramkrishna.s.vasudevan commented on HBASE-5531: --- How to change assignee to Laxman ? Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-5531: -- Resolution: Fixed Fix Version/s: 0.94.0 Status: Resolved (was: Patch Available) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5528) Retry splitting log if failed in the process of ServerShutdownHandler, and abort master when retries exhausted
[ https://issues.apache.org/jira/browse/HBASE-5528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223475#comment-13223475 ] Prakash Khemani commented on HBASE-5528: I think the log-splitting retry logic is there in ServerShutdownHandler. In ServerShutdownHandler.process() the handler is requeued in case of error:
{code}
try {
  if (this.shouldSplitHlog) {
    LOG.info("Splitting logs for " + serverName);
    this.services.getMasterFileSystem().splitLog(serverName);
  } else {
    LOG.info("Skipping log splitting for " + serverName);
  }
} catch (IOException ioe) {
  this.services.getExecutorService().submit(this);
  this.deadServers.add(serverName);
  throw new IOException("failed log splitting for " + serverName + ", will retry", ioe);
}
{code}
Retry splitting log if failed in the process of ServerShutdownHandler, and abort master when retries exhausted -- Key: HBASE-5528 URL: https://issues.apache.org/jira/browse/HBASE-5528 Project: HBase Issue Type: Bug Reporter: chunhui shen Assignee: chunhui shen Attachments: hbase-5528.patch, hbase-5528v2.patch We will retry splitting the log if it failed in splitLogAfterStartup when the master starts. However, there is no retry for failed log splitting in the process of ServerShutdownHandler. Also, if we finally fail to split the log, we should abort the master even if the filesystem is ok, to prevent data loss. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
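A bounded-retry-then-abort shape for the log splitting discussed above might look like the following. Names and the retry policy are illustrative; the real handler resubmits itself to the executor service rather than looping in place:

```java
// Illustrative retry-then-abort shape for log splitting; not the actual
// ServerShutdownHandler code, which requeues itself on an executor instead.
class SplitRetrySketch {
    interface LogSplitter { void splitLog(String server) throws Exception; }

    // Returns true if splitting succeeded within maxRetries attempts;
    // a false return is where the master would abort to avoid data loss.
    static boolean splitWithRetries(LogSplitter fs, String server, int maxRetries) {
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            try {
                fs.splitLog(server);
                return true;
            } catch (Exception e) {
                // log and retry; the real code resubmits the handler here
            }
        }
        return false; // retries exhausted -> abort master
    }
}
```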
[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL
[ https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223490#comment-13223490 ] Mikhail Bautin commented on HBASE-5010: --- @Ram: I don't see any mentions of HBASE-5510 in this JIRA, except for your comment. What updates are you referring to? Filter HFiles based on TTL -- Key: HBASE-5010 URL: https://issues.apache.org/jira/browse/HBASE-5010 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Fix For: 0.94.0 Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, D909.2.patch, D909.3.patch, D909.4.patch, D909.5.patch, D909.6.patch In ScanWildcardColumnTracker we have
{code:java}
this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
...
private boolean isExpired(long timestamp) {
  return timestamp < oldestStamp;
}
{code}
but this time range filtering does not participate in HFile selection. In one real case this caused next() calls to time out because all KVs in a table got expired, but next() had to iterate over the whole table to find that out. We should be able to filter out those HFiles right away. I think a reasonable approach is to add a default timerange filter to every scan for a CF with a finite TTL and utilize existing filtering in StoreFile.Reader.passesTimerangeFilter. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223493#comment-13223493 ] stack commented on HBASE-5531: -- I added him (See 'Administration' in JIRA. You should have access. Once in administration screens, look for people along the left.. the rest should be plain... bug me if you can't figure it). Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL
[ https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223492#comment-13223492 ] ramkrishna.s.vasudevan commented on HBASE-5010: --- @Mikhail The commit-related updates that usually come up once a commit is done were appearing in this JIRA. But they were for HBASE-5510. Ted removed them as they were not related to this JIRA. Maybe that deleted part is what you are not able to view now. :) Sorry if the above comment confused you. Filter HFiles based on TTL -- Key: HBASE-5010 URL: https://issues.apache.org/jira/browse/HBASE-5010 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Fix For: 0.94.0 Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, D909.2.patch, D909.3.patch, D909.4.patch, D909.5.patch, D909.6.patch In ScanWildcardColumnTracker we have {code:java} this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl; ... private boolean isExpired(long timestamp) { return timestamp < oldestStamp; } {code} but this time range filtering does not participate in HFile selection. In one real case this caused next() calls to time out because all KVs in a table got expired, but next() had to iterate over the whole table to find that out. We should be able to filter out those HFiles right away. I think a reasonable approach is to add a default timerange filter to every scan for a CF with a finite TTL and utilize existing filtering in StoreFile.Reader.passesTimerangeFilter. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan reassigned HBASE-5531: - Assignee: ramkrishna.s.vasudevan Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Assignee: ramkrishna.s.vasudevan Labels: build Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223496#comment-13223496 ] ramkrishna.s.vasudevan commented on HBASE-5531: --- Got it :).. Thanks Stack. Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Assignee: Laxman Labels: build Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan reassigned HBASE-5531: - Assignee: Laxman (was: ramkrishna.s.vasudevan) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Assignee: Laxman Labels: build Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL
[ https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223499#comment-13223499 ] Mikhail Bautin commented on HBASE-5010: --- Actually here is the reason for those confusing updates. Ted seems to have specified HBASE-5010 instead of HBASE-5510 in the commit message. commit 5d773d9fa176cb056b993fdff8a2853f75315ec8 Author: tedyu tedyu@13f79535-47bb-0310-9956-ffa450edef68 Date: Mon Mar 5 10:41:03 2012 HBASE-5010 Pass region info in LoadBalancer.randomAssignment(List<ServerName> servers) (Anoop Sam git-svn-id: http://svn.apache.org/repos/asf/hbase/trunk@1297155 13f79535-47bb-0310-9956-ffa450edef68 Filter HFiles based on TTL -- Key: HBASE-5010 URL: https://issues.apache.org/jira/browse/HBASE-5010 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Fix For: 0.94.0 Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, D909.2.patch, D909.3.patch, D909.4.patch, D909.5.patch, D909.6.patch In ScanWildcardColumnTracker we have {code:java} this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl; ... private boolean isExpired(long timestamp) { return timestamp < oldestStamp; } {code} but this time range filtering does not participate in HFile selection. In one real case this caused next() calls to time out because all KVs in a table got expired, but next() had to iterate over the whole table to find that out. We should be able to filter out those HFiles right away. I think a reasonable approach is to add a default timerange filter to every scan for a CF with a finite TTL and utilize existing filtering in StoreFile.Reader.passesTimerangeFilter. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-5533) Add more metrics to HBase
Add more metrics to HBase - Key: HBASE-5533 URL: https://issues.apache.org/jira/browse/HBASE-5533 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Priority: Minor To debug/monitor production clusters, there are some more metrics I wish I had available. In particular: - Although the average FS latencies are useful, a 'histogram' of recent latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) would be more useful - Similar histograms of latencies on common operations (GET, PUT, DELETE) would be useful - Counting the number of accesses to each region to detect hotspotting - Exposing the current number of HLog files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
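The "histogram of recent latencies" idea above can be sketched as follows. This is a minimal illustration, assuming a bounded sample of recent latencies; LatencySample and its methods are illustrative names, not part of the proposed patch or of HBase's metrics API.

```java
import java.util.Arrays;

// Keep a bounded sample of recent latencies and report percentiles over it,
// e.g. "90% of reads completed in under 100ms".
public class LatencySample {
    private final long[] buf;
    private int count;

    LatencySample(int capacity) { buf = new long[capacity]; }

    void record(long latencyMs) {
        // Drop samples once the buffer is full, like a bounded buffer would.
        if (count < buf.length) buf[count++] = latencyMs;
    }

    long percentile(double p) {
        long[] sorted = Arrays.copyOf(buf, count);
        Arrays.sort(sorted);
        int idx = (int) Math.ceil(p * count) - 1;
        return sorted[Math.max(idx, 0)];
    }

    public static void main(String[] args) {
        LatencySample s = new LatencySample(100);
        for (long v = 1; v <= 100; v++) s.record(v);
        System.out.println(s.percentile(0.90)); // 90
        System.out.println(s.percentile(0.99)); // 99
    }
}
```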
[jira] [Updated] (HBASE-5533) Add more metrics to HBase
[ https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shaneal Manek updated HBASE-5533: - Attachment: hbase-5533-0.92.patch Add more metrics to HBase - Key: HBASE-5533 URL: https://issues.apache.org/jira/browse/HBASE-5533 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Priority: Minor Attachments: hbase-5533-0.92.patch To debug/monitor production clusters, there are some more metrics I wish I had available. In particular: - Although the average FS latencies are useful, a 'histogram' of recent latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) would be more useful - Similar histograms of latencies on common operations (GET, PUT, DELETE) would be useful - Counting the number of accesses to each region to detect hotspotting - Exposing the current number of HLog files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5533) Add more metrics to HBase
[ https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shaneal Manek updated HBASE-5533: - Status: Patch Available (was: Open) Added a first pass to trunk and the 0.92 branch Add more metrics to HBase - Key: HBASE-5533 URL: https://issues.apache.org/jira/browse/HBASE-5533 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Priority: Minor Attachments: hbase-5533-0.92.patch To debug/monitor production clusters, there are some more metrics I wish I had available. In particular: - Although the average FS latencies are useful, a 'histogram' of recent latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) would be more useful - Similar histograms of latencies on common operations (GET, PUT, DELETE) would be useful - Counting the number of accesses to each region to detect hotspotting - Exposing the current number of HLog files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5533) Add more metrics to HBase
[ https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223516#comment-13223516 ] Hadoop QA commented on HBASE-5533: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517268/hbase-5533-0.92.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1116//console This message is automatically generated. Add more metrics to HBase - Key: HBASE-5533 URL: https://issues.apache.org/jira/browse/HBASE-5533 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Priority: Minor Attachments: hbase-5533-0.92.patch To debug/monitor production clusters, there are some more metrics I wish I had available. In particular: - Although the average FS latencies are useful, a 'histogram' of recent latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) would be more useful - Similar histograms of latencies on common operations (GET, PUT, DELETE) would be useful - Counting the number of accesses to each region to detect hotspotting - Exposing the current number of HLog files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5516) GZip leading to memory leak in 0.90. Fix similar to HBASE-5387 needed for 0.90.
[ https://issues.apache.org/jira/browse/HBASE-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223522#comment-13223522 ] Zhihong Yu commented on HBASE-5516: --- Can test results be described here ? {code} + if (this.compressAlgo.equals(Compression.Algorithm.GZ) && blockBegin > 0) { +blockBegin -= HEADER_SIZE; + } {code} What if blockBegin is > 0 but less than HEADER_SIZE ? {code} + if (compressionBos == null) { +if (this.compressAlgo.equals(Compression.Algorithm.GZ)) { + createCompressionStream(); +} + } {code} The nested if statements can be condensed into one if statement. GZip leading to memory leak in 0.90. Fix similar to HBASE-5387 needed for 0.90. Key: HBASE-5516 URL: https://issues.apache.org/jira/browse/HBASE-5516 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.90.7 Attachments: HBASE-5516_2_0.90.patch Usage of GZip is leading to a resident memory leak in 0.90. We need to have something similar to HBASE-5387 in 0.90. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
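The review suggestion to condense the nested if statements can be illustrated as follows. The enum and class scaffolding here are stand-ins so the snippet is runnable, not the real HFile writer code; only the shape of the condition mirrors the quoted patch.

```java
public class CondenseIfSketch {
    enum Algorithm { GZ, NONE }

    static int createCalls = 0;
    static Object compressionBos = null;
    static Algorithm compressAlgo = Algorithm.GZ;

    // Condensed form of the nested ifs quoted in the review:
    //   if (compressionBos == null) { if (GZ) createCompressionStream(); }
    // becomes a single condition joined with &&.
    static void maybeCreateCompressionStream() {
        if (compressionBos == null && compressAlgo == Algorithm.GZ) {
            createCompressionStream();
        }
    }

    static void createCompressionStream() {
        createCalls++;
        compressionBos = new Object();
    }

    public static void main(String[] args) {
        maybeCreateCompressionStream();
        maybeCreateCompressionStream(); // no-op: the stream already exists
        System.out.println(createCalls); // 1
    }
}
```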
[jira] [Commented] (HBASE-5533) Add more metrics to HBase
[ https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223523#comment-13223523 ] stack commented on HBASE-5533: -- Did you mean the below: {code} -# hbase.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext -# hbase.period=10 -# hbase.fileName=/tmp/metrics_hbase.log +hbase.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext +hbase.period=10 +hbase.fileName=/tmp/metrics_hbase.log {code} Will there be a bunch of contention on these additions: {code} + static volatile BlockingQueue<Long> fsReadLatenciesNanos = new ArrayBlockingQueue<Long>(LATENCY_BUFFER_SIZE); {code} Could this fill the logs with thousands of repeated messages: {code} + if (!stored) { +LOG.warn("Dropping fs latency stat since buffer is full"); + } {code} Could we use the Cliff Click counters instead of AtomicLong? They are on the classpath IIRC: {code} + private final Map<String, AtomicLong> counts; {code} These additions would be great to have. Add more metrics to HBase - Key: HBASE-5533 URL: https://issues.apache.org/jira/browse/HBASE-5533 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Priority: Minor Attachments: hbase-5533-0.92.patch To debug/monitor production clusters, there are some more metrics I wish I had available. In particular: - Although the average FS latencies are useful, a 'histogram' of recent latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) would be more useful - Similar histograms of latencies on common operations (GET, PUT, DELETE) would be useful - Counting the number of accesses to each region to detect hotspotting - Exposing the current number of HLog files -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
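The bounded-buffer pattern under discussion can be sketched as follows. Counting drops and logging the count periodically is one way to address the concern about flooding the logs with a WARN per dropped sample; everything here besides the fsReadLatenciesNanos and LATENCY_BUFFER_SIZE names quoted from the patch is an illustrative assumption.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// A bounded queue of latency samples: offer() fails (rather than blocks)
// when the buffer is full, and the drop is counted instead of logged.
public class BoundedLatencyBuffer {
    static final int LATENCY_BUFFER_SIZE = 4;
    final BlockingQueue<Long> fsReadLatenciesNanos =
        new ArrayBlockingQueue<>(LATENCY_BUFFER_SIZE);
    final AtomicLong dropped = new AtomicLong();

    void record(long nanos) {
        if (!fsReadLatenciesNanos.offer(nanos)) {
            dropped.incrementAndGet(); // count instead of warning on every drop
        }
    }

    public static void main(String[] args) {
        BoundedLatencyBuffer b = new BoundedLatencyBuffer();
        for (long i = 0; i < 10; i++) b.record(i);
        System.out.println(b.fsReadLatenciesNanos.size()); // 4
        System.out.println(b.dropped.get());               // 6
    }
}
```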
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223530#comment-13223530 ] Hudson commented on HBASE-5531: --- Integrated in HBase-0.94 #16 (See [https://builds.apache.org/job/HBase-0.94/16/]) HBASE-5531 Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot (Laxman) (Revision 1297595) Result = SUCCESS ramkrishna : Files : * /hbase/branches/0.94/pom.xml Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Assignee: Laxman Labels: build Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-5399: - Fix Version/s: 0.96.0 Too big for 0.94. Agreed? Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.94.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399.v40.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch The link is often considered as an issue, for various reasons. One of them being that there is a limit on the number of connections that ZK can manage. Stack was suggesting as well to remove the link to master from HConnection. There are choices to be made considering the existing API (that we don't want to break). The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken. ZooKeeper is used for: - public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection => we have to deprecate this but keep it. - read get master address to create a master => now done with a temporary zookeeper connection - read root location => now done with a temporary zookeeper connection, but questionable. Used in public function locateRegion. To be reworked. - read cluster id => now done once with a temporary zookeeper connection. - check if base node is available => now done once with a zookeeper connection given as a parameter - isTableDisabled/isTableAvailable => public functions, now done with a temporary zookeeper connection. 
- Called internally from HBaseAdmin and HTable - getCurrentNrHRS(): public function to get the number of region servers and create a pool of threads => now done with a temporary zookeeper connection - Master is used for: - getMaster public getter, as for ZooKeeper => we have to deprecate this but keep it. - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin - getHTableDescriptor*: public functions offering access to the master => we could make them use a temporary master connection as well. Main points are: - the hbase class for ZooKeeper, ZooKeeperWatcher, is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, a non-connected client will always be really slower, because it's a tcp connection, and establishing a tcp connection is slow. - having a link between ZK and all the clients seems to make sense for some use cases. However, it won't scale if a TCP connection is required for every client - if we move the table descriptor part away from the client, we need to find a new place for it. - we will have the same issue with HBaseAdmin (for both ZK & Master); maybe we can put a timeout on the connection. That would make the whole system less deterministic however. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5529) MR test failures becuase MALLOC_ARENA_MAX is not set
[ https://issues.apache.org/jira/browse/HBASE-5529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223534#comment-13223534 ] Gregory Chanan commented on HBASE-5529: --- The 0.94 patch would be identical to trunk. Do you want me to reattach with a 0.94 name? Regarding your questions, see here: https://issues.apache.org/jira/browse/HADOOP-7154 That is a better explanation than I gave or the linked MR jira gives. I think that addresses all your questions, but let me know if you have more. MR test failures becuase MALLOC_ARENA_MAX is not set Key: HBASE-5529 URL: https://issues.apache.org/jira/browse/HBASE-5529 Project: HBase Issue Type: Bug Components: mapreduce, test Affects Versions: 0.92.0 Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Minor Attachments: HBASE-5529-to92.patch, HBASE-5529-trunk.patch When running unit tests on CentOS 6 I get a bunch of unit test failures in mapreduce-related tests due to: 2012-03-03 00:14:18,776 WARN [Container Monitor] monitor.ContainersMonitorImpl$MonitoringThread(436): Container [pid=21446,containerID=container_1330762435849_0002_01_01] is running beyond virtual memory limits. Current usage: 223.1mb of 2.0gb physical memory used; 6.9gb of 4.2gb virtual memory used. Killing container. Note: this also came up in the mapreduce project. See: https://issues.apache.org/jira/browse/MAPREDUCE-3933 Patch coming shortly -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5515: --- Attachment: HBASE-5515.D2067.15.patch sc updated the revision HBASE-5515 [jira] Add a processRow API that supports atomic multiple reads and writes on a row. Reviewers: tedyu, dhruba, JIRA Addressed Lars and Ted's comments @tedyu: I changed reference count to 31 and long count to 6. This is because we have added the following two things long rowProcessorTimeout RowProcessor.RowScanner rowScanner @lhofhansl: I have made most of the changes you suggested. Thanks! I tried to put RowProcessor into ProcessRowEndpoint. With that approach, we will need multiple CoprocessorProtocol. One for each implementation of RowProcessor. Because the implementation of the RowProcessor is now passed by the protocol class name instead of the Writable. For example, in our unit test, we have to do --- FriendsOfFriendsProtocol p = table.coprocessorProxy(FriendsOfFriendsProtocol.class, row); p.processRow(); --- And the unit test needs to provide both FriendsOfFriendsProtocol and FriendsOfFriendsEndpoint. I feel this is not very convenient. I still prefer to pass the RowProcessor information via a Writable. The other thought is that the user of RowProcessor is actually HRegion. But ProcessRowEndpoint is used by the client via coprocessor. I feel that it would be nice to have them separated instead of listing their methods together. What do you think? 
REVISION DETAIL https://reviews.facebook.net/D2067 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowEndpoint.java src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowProtocol.java src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowEndpoint.java Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It will be nice to have a plugable API for this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
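The processRow idea discussed above can be sketched as follows. The interface and method names here are assumptions for illustration only; the real API is whatever the patch under review defines, and a synchronized block stands in for HRegion's row lock.

```java
import java.util.HashMap;
import java.util.Map;

// A processor object encapsulates an atomic read-modify-write on one row:
// the region runs it while holding the row lock, so reads and writes in
// process() are seen as a single atomic operation.
public class ProcessRowSketch {
    interface RowProcessor<T> {
        T process(Map<String, Long> row); // runs under the row lock in the real design
    }

    static <T> T processRow(Map<String, Long> row, RowProcessor<T> p) {
        synchronized (row) { // stand-in for HRegion's per-row locking
            return p.process(row);
        }
    }

    public static void main(String[] args) {
        Map<String, Long> row = new HashMap<>();
        row.put("counter", 41L);
        // Atomic read-increment-write expressed as a RowProcessor.
        Long result = processRow(row, r -> r.merge("counter", 1L, Long::sum));
        System.out.println(result); // 42
    }
}
```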
[jira] [Commented] (HBASE-5526) Optional file permission settings
[ https://issues.apache.org/jira/browse/HBASE-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223540#comment-13223540 ] Jesse Yates commented on HBASE-5526: Side note for other interested parties - just using dfs.umaskmode from hadoop isn't sufficient here as we want to protect the data files, but still see the directory structure. In true posix style, hdfs just applies the umask to all the files _and_ directories, as opposed to differentiating between the two. Optional file permission settings - Key: HBASE-5526 URL: https://issues.apache.org/jira/browse/HBASE-5526 Project: HBase Issue Type: New Feature Components: regionserver Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.94.0 Attachments: java_HBASE-5526.patch Currently, all the files created by the HBase user are just written using the default file permissions granted by hdfs. However, it is often times advantageous to only allow a subset of the world to view the actual data written by hbase when scanning the raw hdfs files. This ticket covers setting permissions for files written to hdfs that are storing actual user data, as opposed to _all_ files written to hdfs as many of them contain non-identifiable metadata. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
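The file-versus-directory distinction Jesse describes can be sketched as follows, using local java.nio calls rather than the HDFS API. This is a minimal sketch assuming a POSIX filesystem; the paths and modes are illustrative, not what the patch sets.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Apply a restrictive mode to data files while keeping directories
// world-traversable, instead of one umask covering both.
public class PermSketch {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("hbase-data");
        Path file = Files.createTempFile(dir, "hfile", ".data");

        // Directories stay listable/traversable so the layout is visible...
        Files.setPosixFilePermissions(dir, PosixFilePermissions.fromString("rwxr-xr-x"));
        // ...while data files are readable by the owning user only.
        Files.setPosixFilePermissions(file, PosixFilePermissions.fromString("rw-------"));

        Set<PosixFilePermission> perms = Files.getPosixFilePermissions(file);
        System.out.println(PosixFilePermissions.toString(perms)); // rw------- on POSIX
    }
}
```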
[jira] [Updated] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5515: --- Attachment: HBASE-5515.D2067.16.patch sc updated the revision HBASE-5515 [jira] Add a processRow API that supports atomic multiple reads and writes on a row. Reviewers: tedyu, dhruba, JIRA Forgot to include BaseRowProcessor.java REVISION DETAIL https://reviews.facebook.net/D2067 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRowProcessor.java src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowEndpoint.java src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowProtocol.java src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowEndpoint.java Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.16.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It will be nice to have a plugable API for this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5522) hbase 0.92 test artifacts are missing from Maven central
[ https://issues.apache.org/jira/browse/HBASE-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5522: - Attachment: 5522.txt hbase 0.92 test artifacts are missing from Maven central Key: HBASE-5522 URL: https://issues.apache.org/jira/browse/HBASE-5522 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.0 Reporter: Roman Shaposhnik Attachments: 5522.txt Could someone with enough karma, please, publish the test artifacts for 0.92.0? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223550#comment-13223550 ] Hadoop QA commented on HBASE-5515: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517282/HBASE-5515.D2067.15.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause mvn compile goal to fail. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1117//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1117//console This message is automatically generated. Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.16.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It will be nice to have a plugable API for this. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
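The "processRow" idea above (a pluggable hook that reads and writes a single row atomically) can be sketched in plain Java. This is an illustrative mock, not the actual HBase API: `RowProcessor`, `MiniRegion`, and `processRow` are hypothetical names, and the row lock is simulated with `synchronized`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a pluggable "process row" hook: the processor sees a
// consistent snapshot of the row and returns mutations that are applied under
// the same row lock.
class ProcessRowSketch {
  interface RowProcessor {
    // Reads the current cells of the row and returns the cells to write back.
    Map<String, Long> process(Map<String, Long> currentCells);
  }

  static class MiniRegion {
    private final Map<String, Map<String, Long>> rows = new HashMap<>();

    // Atomic read-modify-write on a single row: lock, read, process, write.
    synchronized void processRow(String row, RowProcessor processor) {
      Map<String, Long> cells = rows.computeIfAbsent(row, r -> new HashMap<>());
      cells.putAll(processor.process(new HashMap<>(cells)));
    }

    synchronized Long get(String row, String column) {
      Map<String, Long> cells = rows.get(row);
      return cells == null ? null : cells.get(column);
    }
  }

  public static void main(String[] args) {
    MiniRegion region = new MiniRegion();
    // Example processor: increment column "a" and mirror the result into "b",
    // as one atomic step -- no other reader can see "a" updated without "b".
    region.processRow("row1", cells -> {
      long a = cells.getOrDefault("a", 0L) + 1;
      Map<String, Long> out = new HashMap<>();
      out.put("a", a);
      out.put("b", a);
      return out;
    });
    System.out.println(region.get("row1", "a") + "," + region.get("row1", "b"));
  }
}
```

The point of the pluggable interface is that the multi-read/multi-write logic lives in one callback executed under the row lock, instead of being hand-coded inside HRegion.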
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223564#comment-13223564 ] Phabricator commented on HBASE-5074: mbautin has commented on the revision [jira] [HBASE-5074] Support checksums in HBase block cache. INLINE COMMENTS src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:161-174 What I meant by repeated blob was everything but the last four bytes. We can create a string constant for that part in TestHFileBlock and reuse it here. REVISION DETAIL https://reviews.facebook.net/D1521 BRANCH svn support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, D1521.13.patch, D1521.13.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. 
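The fix described above is to store the checksum inline with the block payload, so a single read verifies the data without a second seek into a separate checksum file. A minimal sketch of the idea, assuming an illustrative layout (payload followed by a 4-byte CRC32) rather than the real HFile block format:

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Sketch of inline block checksums: one I/O covers both data and checksum.
class InlineChecksumSketch {

  // Append a CRC32 of the payload to the block before writing it out.
  static byte[] writeBlock(byte[] payload) {
    CRC32 crc = new CRC32();
    crc.update(payload, 0, payload.length);
    ByteBuffer buf = ByteBuffer.allocate(payload.length + 4);
    buf.put(payload).putInt((int) crc.getValue());
    return buf.array();
  }

  // Verify the trailing checksum; no separate checksum file is consulted.
  static boolean verifyBlock(byte[] block) {
    ByteBuffer buf = ByteBuffer.wrap(block);
    byte[] payload = new byte[block.length - 4];
    buf.get(payload);
    CRC32 crc = new CRC32();
    crc.update(payload, 0, payload.length);
    return buf.getInt() == (int) crc.getValue();
  }

  public static void main(String[] args) {
    byte[] block = writeBlock("hbase block payload".getBytes());
    System.out.println(verifyBlock(block)); // intact block passes
    block[0] ^= 1;                          // flip a bit
    System.out.println(verifyBlock(block)); // corruption is caught
  }
}
```

With HDFS-level checksums, each cache miss costs two disk iops (data file plus checksum file); folding the checksum into the block halves that to one.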
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223567#comment-13223567 ] Mikhail Bautin commented on HBASE-5074: --- @Dhruba: could you please rerun the failed tests locally, as well as check the test reports? org.apache.hadoop.hbase.coprocessor.TestMasterObserver org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestImportTsv support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, D1521.13.patch, D1521.13.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5347) GC free memory management in Level-1 Block Cache
[ https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223574#comment-13223574 ] Phabricator commented on HBASE-5347: mbautin has resigned from the revision [HBASE-5347] [jira] GC free memory management in Level-1 Block Cache. We are not actively pursuing this approach anymore due to non-reproducibility of GC issues. REVISION DETAIL https://reviews.facebook.net/D1635 GC free memory management in Level-1 Block Cache Key: HBASE-5347 URL: https://issues.apache.org/jira/browse/HBASE-5347 Project: HBase Issue Type: Improvement Reporter: Prakash Khemani Assignee: Prakash Khemani Attachments: D1635.5.patch On eviction of a block from the block-cache, instead of waiting for the garbage collecter to reuse its memory, reuse the block right away. This will require us to keep reference counts on the HFile blocks. Once we have the reference counts in place we can do our own simple blocks-out-of-slab allocation for the block-cache. This will help us with * reducing gc pressure, especially in the old generation * making it possible to have non-java-heap memory backing the HFile blocks -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
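The reference-counting scheme proposed above can be sketched as follows: a block's buffer returns to a free-slab pool as soon as its last reader releases it, instead of waiting for the garbage collector. Class names (`Slab`, `Block`) are illustrative, not actual HBase code.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of blocks-out-of-slab allocation with reference counts.
class RefCountedBlockSketch {
  static class Slab {
    final Deque<byte[]> free = new ArrayDeque<>();
    Slab(int blocks, int blockSize) {
      for (int i = 0; i < blocks; i++) free.push(new byte[blockSize]);
    }
  }

  static class Block {
    private final AtomicInteger refCount = new AtomicInteger(1);
    private final byte[] buf;
    private final Slab slab;
    Block(Slab slab) { this.slab = slab; this.buf = slab.free.pop(); }
    void retain() { refCount.incrementAndGet(); }
    // When the last reference drops, the buffer is reused right away --
    // no GC pressure, and the backing memory need not be on the Java heap.
    void release() {
      if (refCount.decrementAndGet() == 0) slab.free.push(buf);
    }
  }

  public static void main(String[] args) {
    Slab slab = new Slab(1, 64 * 1024);
    Block block = new Block(slab);        // cache holds one reference
    block.retain();                       // a scanner takes a second reference
    block.release();                      // scanner done
    System.out.println(slab.free.size()); // 0: cache still holds the block
    block.release();                      // block evicted from cache
    System.out.println(slab.free.size()); // 1: buffer is back in the pool
  }
}
```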
[jira] [Commented] (HBASE-5521) Move compression/decompression to an encoder specific encoding context
[ https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223578#comment-13223578 ] Phabricator commented on HBASE-5521: heyongqiang has commented on the revision HBASE-5521 [jira] Move compression/decompression to an encoder specific encoding context. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:323-327 Good suggestion. I will change it to use the class for checking. Yes, we need to enforce that constraint right now. Basically we want each encoder to use its own context object, which is returned by newDataBlockEncodingContext(). src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java:58 Will remove the original method and change all test cases to use the new API. src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java:53 Are you suggesting adding a new Encoding field to each encoder? I also think it is a good thing to do, but I am not sure why the current code does not do that. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:693 Good catch! This is a bug. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:811 Good catch here; will assign to null. Initially I did that because they are all defined as 'final'. Will remove the final. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1253 The default is used for non-data blocks. I cannot make them singletons, as they need to maintain some reusable objects internally. By reusable objects I mean buffers allocated once that can be reused across multiple operations. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1327-1328 Added back. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:194-195 Good catch.
Will add a class field to reuse it. REVISION DETAIL https://reviews.facebook.net/D2097 BRANCH svn Move compression/decompression to an encoder specific encoding context -- Key: HBASE-5521 URL: https://issues.apache.org/jira/browse/HBASE-5521 Project: HBase Issue Type: Improvement Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch As part of working on HBASE-5313, we want to add a new columnar encoder/decoder. It makes sense to move compression to be part of the encoder/decoder: 1) a scanner for a columnar-encoded block can do lazy decompression down to a specific part of a key-value object, and 2) it avoids an extra byte copy from the encoder to the hblock-writer. If there is no encoder specified for a writer, the HBlock.Writer will use a default compression context to do something very similar to today's code. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
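The comments above circle around one design point: each encoder owns a context object, returned by its own factory method, that holds buffers allocated once and reused across blocks, and compression runs inside the encoder rather than in the block writer. A minimal sketch under those assumptions (the interface and class names here are illustrative, not the patch's actual API):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;

// Sketch: encoder-specific encoding context with a reusable output buffer.
class EncodingContextSketch {
  // Per-encoder context: the buffer is allocated once and reused per block,
  // which is why these contexts cannot be shared singletons.
  static class EncodingContext {
    final ByteArrayOutputStream reusableBuffer = new ByteArrayOutputStream(64 * 1024);
  }

  interface DataBlockEncoder {
    EncodingContext newEncodingContext();
    byte[] encodeAndCompress(byte[] rawBlock, EncodingContext ctx);
  }

  // A trivial encoder: pass-through "encoding" plus DEFLATE compression,
  // done inside the encoder instead of in the block writer.
  static class PassThroughEncoder implements DataBlockEncoder {
    public EncodingContext newEncodingContext() { return new EncodingContext(); }
    public byte[] encodeAndCompress(byte[] rawBlock, EncodingContext ctx) {
      ctx.reusableBuffer.reset(); // reuse, do not reallocate
      try (DeflaterOutputStream out = new DeflaterOutputStream(ctx.reusableBuffer)) {
        out.write(rawBlock);
      } catch (IOException e) {
        throw new RuntimeException(e); // cannot happen with an in-memory buffer
      }
      return ctx.reusableBuffer.toByteArray();
    }
  }

  public static void main(String[] args) {
    DataBlockEncoder encoder = new PassThroughEncoder();
    EncodingContext ctx = encoder.newEncodingContext(); // one context per writer
    byte[] raw = new byte[8192]; // highly compressible zeros
    byte[] compressed = encoder.encodeAndCompress(raw, ctx);
    System.out.println(compressed.length < raw.length);
  }
}
```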
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223586#comment-13223586 ] dhruba borthakur commented on HBASE-5074: - I ran all four of them individually (manually), and all four of them pass. Looking at the Hudson test results, it appears that all the failures are related to some map-reduce problem, but not really sure the precise cause. But I think that these failures are somehow related to this patch, especially because the Hudson tests for HBASE-5399 just passed successfully. Will investigate more (but if you have any clues, please do let me know). support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, D1521.13.patch, D1521.13.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223590#comment-13223590 ] Phabricator commented on HBASE-5515: lhofhansl has commented on the revision HBASE-5515 [jira] Add a processRow API that supports atomic multiple reads and writes on a row. @Scott: FriendsOfFriendsProtocol would be known to the client and FriendsOfFriendsEndpoint to the server. You can always add a tiny client-side wrapper to make it more convenient. It's cool if you want to do it differently. But keep in mind that a lot of effort went into getting classloading for coprocessors right. They can be loaded dynamically from a jar on HDFS and per table, or statically and globally for all tables. For example, you could deploy new row-processing logic without touching the servers at all (by loading a jar from HDFS). That is actually something I do not like about filters: to use a new filter, it needs to be deployed at every regionserver and requires a rolling restart of the cluster. Coprocessors nicely solve these problems. With the current approach, the row-processing implementation would need to be deployed outside of the coprocessor framework and would always require a cluster restart.
REVISION DETAIL https://reviews.facebook.net/D2067 Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.16.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It will be nice to have a plugable API for this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5521) Move compression/decompression to an encoder specific encoding context
[ https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5521: --- Attachment: HBASE-5521.D2097.2.patch heyongqiang updated the revision HBASE-5521 [jira] Move compression/decompression to an encoder specific encoding context. Reviewers: JIRA, dhruba, tedyu, sc, mbautin address Ted and mbautin's comments. REVISION DETAIL https://reviews.facebook.net/D2097 AFFECTED FILES src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultEncodingContext.java src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDecodingContext.java src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultDecodingContext.java src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockEncodingContext.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/hfile/Compression.java src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java Move compression/decompression to an encoder specific encoding context -- Key: HBASE-5521 URL: https://issues.apache.org/jira/browse/HBASE-5521 Project: HBase Issue Type: Improvement 
Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.2.patch As part of working on HBASE-5313, we want to add a new columnar encoder/decoder. It makes sense to move compression to be part of encoder/decoder: 1) a scanner for a columnar encoded block can do lazy decompression to a specific part of a key value object 2) avoid an extra bytes copy from encoder to hblock-writer. If there is no encoder specified for a writer, the HBlock.Writer will use a default compression-context to do something very similar to today's code. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5509) MR based copier for copying HFiles (trunk version)
[ https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223598#comment-13223598 ] Karthik Ranganathan commented on HBASE-5509: @Zhihong Yu: We use this code as the primary means to back up HFiles inside FB. We have made a lot of improvements to the DFS copy underneath, and they have caused some bugs, but that's unrelated to this code. Not too many issues, besides tuning the number of mappers to use so that we don't overwhelm a running system. @Lars: You are correct about getStoreFileList() - it is passed from the command line and it is overloaded for a subset of or all CFs. Zhihong - the list versus a comma-separated string is a trivial point, since the list construction has to happen either in the RS or in the caller, so it should not make much of a difference practically. MR based copier for copying HFiles (trunk version) -- Key: HBASE-5509 URL: https://issues.apache.org/jira/browse/HBASE-5509 Project: HBase Issue Type: Sub-task Components: documentation, regionserver Reporter: Karthik Ranganathan Assignee: Lars Hofhansl Fix For: 0.94.0, 0.96.0 Attachments: 5509-v2.txt, 5509.txt This copier is a modification of the distcp tool in HDFS. It does the following: 1. List out all the regions in the HBase cluster for the required table 2. Write the above out to a file 3. Each mapper 3.1 lists all the HFiles for a given region by querying the regionserver 3.2 copies all the HFiles 3.3 outputs success if the copy succeeded, failure otherwise. Failed regions are retried in another loop 4. Mappers are placed on nodes which have maximum locality for a given region to speed up copying -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
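The copier's control flow (steps 1-4 in the description above) can be sketched as follows, with the per-region copy simulated by a callback; the names (`RegionCopier`, `copyWithRetry`) are illustrative, not the actual tool's API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the copier's retry structure: each "mapper" copies one region's
// HFiles and reports success or failure; failed regions get a second pass.
class HFileCopierSketch {
  interface RegionCopier { boolean copyRegion(String region); }

  // Returns the regions that still failed after the retry pass.
  static List<String> copyWithRetry(List<String> regions, RegionCopier copier) {
    List<String> failed = new ArrayList<>();
    for (String region : regions) {       // first pass: one mapper per region
      if (!copier.copyRegion(region)) failed.add(region);
    }
    List<String> stillFailed = new ArrayList<>();
    for (String region : failed) {        // retry loop for failed regions
      if (!copier.copyRegion(region)) stillFailed.add(region);
    }
    return stillFailed;
  }

  public static void main(String[] args) {
    // Simulated copier: region "r2" fails once, then succeeds on retry.
    List<String> failOnce = new ArrayList<>(Arrays.asList("r2"));
    RegionCopier copier = region -> !failOnce.remove(region);
    List<String> result = copyWithRetry(Arrays.asList("r1", "r2", "r3"), copier);
    System.out.println(result.isEmpty()); // true: the retry pass recovered r2
  }
}
```

In the real tool the first pass runs as MR mappers scheduled for HDFS locality (step 4); the sketch keeps only the success/failure bookkeeping.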
[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223601#comment-13223601 ] Hadoop QA commented on HBASE-5515: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517287/HBASE-5515.D2067.16.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -128 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 155 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1118//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1118//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1118//console This message is automatically generated. 
Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.16.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It will be nice to have a plugable API for this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5509) MR based copier for copying HFiles (trunk version)
[ https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223605#comment-13223605 ] Zhihong Yu commented on HBASE-5509: --- I agree about the point w.r.t. getStoreFileList() MR based copier for copying HFiles (trunk version) -- Key: HBASE-5509 URL: https://issues.apache.org/jira/browse/HBASE-5509 Project: HBase Issue Type: Sub-task Components: documentation, regionserver Reporter: Karthik Ranganathan Assignee: Lars Hofhansl Fix For: 0.94.0, 0.96.0 Attachments: 5509-v2.txt, 5509.txt This copier is a modification of the distcp tool in HDFS. It does the following: 1. List out all the regions in the HBase cluster for the required table 2. Write the above out to a file 3. Each mapper 3.1 lists all the HFiles for a given region by querying the regionserver 3.2 copies all the HFiles 3.3 outputs success if the copy succeeded, failure otherwise. Failed regions are retried in another loop 4. Mappers are placed on nodes which have maximum locality for a given region to speed up copying -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223606#comment-13223606 ] Zhihong Yu commented on HBASE-5399: --- {code} +// We put that all the possible setting to make it fails asap {code} should read 'put all the possible settings to make it fail'. Please remove extra empty lines in checkHBaseAvailable(). In HConnectionManager.java, can the java imports be moved back to top of file ? {code} +// We have a single lock for master zk to prevents deadlocks. Having {code} should read 'to prevent'. Please use spaces around '+' below: {code} + return hconnection +hashCode(); {code} Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.94.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399.v40.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch The link is often considered as an issue, for various reasons. One of them being that there is a limit on the number of connection that ZK can manage. Stack was suggesting as well to remove the link to master from HConnection. There are choices to be made considering the existing API (that we don't want to break). The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken. ZooKeeper is used for: - public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection = we have to deprecate this but keep it. 
- read get master address to create a master = now done with a temporary zookeeper connection - read root location = now done with a temporary zookeeper connection, but questionable. Used in public function locateRegion. To be reworked. - read cluster id = now done once with a temporary zookeeper connection. - check if base done is available = now done once with a zookeeper connection given as a parameter - isTableDisabled/isTableAvailable = public functions, now done with a temporary zookeeper connection. - Called internally from HBaseAdmin and HTable - getCurrentNrHRS(): public function to get the number of region servers and create a pool of thread = now done with a temporary zookeeper connection - Master is used for: - getMaster public getter, as for ZooKeeper = we have to deprecate this but keep it. - isMasterRunning(): public function, used internally by HMerge HBaseAdmin - getHTableDescriptor*: public functions offering access to the master. = we could make them using a temporary master connection as well. Main points are: - hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, non connected client will always be really slower, because it's a tcp connection, and establishing a tcp connection is slow. - having a link between ZK and all the client seems to make sense for some Use Cases. However, it won't scale if a TCP connection is required for every client - if we move the table descriptor part away from the client, we need to find a new place for it. - we will have the same issue if HBaseAdmin (for both ZK Master), may be we can put a timeout on the connection. That would make the whole system less deterministic however. -- This message is automatically generated by JIRA. 
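Several items in the list above follow the same pattern: replace a long-lived ZooKeeper session per client with a temporary connection that is opened, read, and closed in one call. A sketch of that pattern, with a stand-in `FakeZk` class instead of a real ZooKeeper client:

```java
// Sketch of the "temporary connection" pattern: short-lived reads (master
// address, cluster id, root location) hold a session only for the call.
class TemporaryConnectionSketch {
  static class FakeZk implements AutoCloseable {
    static int openSessions = 0;          // counts live sessions for the demo
    FakeZk() { openSessions++; }
    String read(String znode) { return "value-of:" + znode; }
    @Override public void close() { openSessions--; }
  }

  // One-shot read: the session exists only for the duration of the call,
  // so idle clients consume no ZooKeeper connections.
  static String readOnce(String znode) {
    try (FakeZk zk = new FakeZk()) {
      return zk.read(znode);
    }
  }

  public static void main(String[] args) {
    String masterAddr = readOnce("/hbase/master");
    System.out.println(masterAddr);          // value-of:/hbase/master
    System.out.println(FakeZk.openSessions); // 0: no lingering session
  }
}
```

The trade-off named in the description shows up here: every `readOnce` pays TCP connection setup, so an unconnected client is slower per call, in exchange for not counting against ZooKeeper's connection limit.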
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223615#comment-13223615 ] stack commented on HBASE-5399: -- @LarsH I think its too radical a change in client behavior for 0.94. If we target it for 0.96, it'll be a ripple only compared to rpc changes; it won't be noticed. Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.94.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399.v40.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch The link is often considered as an issue, for various reasons. One of them being that there is a limit on the number of connection that ZK can manage. Stack was suggesting as well to remove the link to master from HConnection. There are choices to be made considering the existing API (that we don't want to break). The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken. ZooKeeper is used for: - public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection = we have to deprecate this but keep it. - read get master address to create a master = now done with a temporary zookeeper connection - read root location = now done with a temporary zookeeper connection, but questionable. Used in public function locateRegion. To be reworked. - read cluster id = now done once with a temporary zookeeper connection. 
- check if base done is available = now done once with a zookeeper connection given as a parameter - isTableDisabled/isTableAvailable = public functions, now done with a temporary zookeeper connection. - Called internally from HBaseAdmin and HTable - getCurrentNrHRS(): public function to get the number of region servers and create a pool of thread = now done with a temporary zookeeper connection - Master is used for: - getMaster public getter, as for ZooKeeper = we have to deprecate this but keep it. - isMasterRunning(): public function, used internally by HMerge HBaseAdmin - getHTableDescriptor*: public functions offering access to the master. = we could make them using a temporary master connection as well. Main points are: - hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, non connected client will always be really slower, because it's a tcp connection, and establishing a tcp connection is slow. - having a link between ZK and all the client seems to make sense for some Use Cases. However, it won't scale if a TCP connection is required for every client - if we move the table descriptor part away from the client, we need to find a new place for it. - we will have the same issue if HBaseAdmin (for both ZK Master), may be we can put a timeout on the connection. That would make the whole system less deterministic however. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5521) Move compression/decompression to an encoder specific encoding context
[ https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223616#comment-13223616 ] Phabricator commented on HBASE-5521: mbautin has commented on the revision HBASE-5521 [jira] Move compression/decompression to an encoder specific encoding context. Yongqiang: thanks for addressing the comments. I will take another pass through the new version of the diff. Could you please use a consistent abbreviation for context? Either Ctx or Cxt, but not both. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java:53 HFileDataBlockEncoder is actually aware of on-disk and in-cache encoding. The on-disk encoding may be disabled when in-cache encoding is enabled, but not the other way around. The current code apparently did not need to explicitly store the encoding type in EncodedDataBlock (which is mostly used for testing and benchmarking), and I am wondering why you need to store it now. REVISION DETAIL https://reviews.facebook.net/D2097 Move compression/decompression to an encoder specific encoding context -- Key: HBASE-5521 URL: https://issues.apache.org/jira/browse/HBASE-5521 Project: HBase Issue Type: Improvement Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.2.patch As part of working on HBASE-5313, we want to add a new columnar encoder/decoder. It makes sense to move compression to be part of the encoder/decoder: 1) a scanner for a columnar encoded block can do lazy decompression down to a specific part of a key value object; 2) it avoids an extra byte-array copy from the encoder to the hblock-writer. If no encoder is specified for a writer, HBlock.Writer will use a default compression context to do something very similar to today's code. -- This message is automatically generated by JIRA.
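The idea in the description (letting the encoder own compression, so the block writer hands bytes to a context instead of driving a codec itself) can be sketched with a toy context. This is a hypothetical illustration using java.util.zip; the class and method names are invented and are not the actual HBase encoding-context API:

```java
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Toy "compression context" owned by an encoder: the block writer calls
// compress()/decompress() on the context and never touches a codec directly.
class EncodingContextSketch {

    // Compress a raw encoded block; the context owns the codec choice.
    static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        byte[] buf = new byte[raw.length * 2 + 64]; // headroom for incompressible input
        int n = deflater.deflate(buf);
        deflater.end();
        return Arrays.copyOf(buf, n);
    }

    // Decompress back to the known uncompressed length.
    static byte[] decompress(byte[] compressed, int rawLen) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] out = new byte[rawLen];
        try {
            inflater.inflate(out);
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt block", e);
        } finally {
            inflater.end();
        }
        return out;
    }
}
```

A writer with no encoder configured could fall back to a default instance of such a context, which matches the "default compression-context" behavior the description mentions.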
[jira] [Resolved] (HBASE-5522) hbase 0.92 test artifacts are missing from Maven central
[ https://issues.apache.org/jira/browse/HBASE-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-5522. -- Resolution: Fixed Fix Version/s: 0.94.0 0.92.1 Assignee: stack Confirmed that test.jar shows in repo. A sighting happened out on the mailing list. Committed this patch to 0.92, 0.94, and trunk. hbase 0.92 test artifacts are missing from Maven central Key: HBASE-5522 URL: https://issues.apache.org/jira/browse/HBASE-5522 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.0 Reporter: Roman Shaposhnik Assignee: stack Fix For: 0.92.1, 0.94.0 Attachments: 5522.txt Could someone with enough karma, please, publish the test artifacts for 0.92.0? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223631#comment-13223631 ] Hudson commented on HBASE-5531: --- Integrated in HBase-0.92 #316 (See [https://builds.apache.org/job/HBase-0.92/316/]) HBASE-5531 Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot (Laxman) (Revision 1297589) Result = FAILURE ramkrishna : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/pom.xml Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Assignee: Laxman Labels: build Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch The current profile still points to 0.23.1-SNAPSHOT. The build is failing because 0.23.1 has been released and its snapshot is no longer available. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA.
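The fix amounts to bumping the snapshot version inside the hadoop-0.23 build profile. A minimal sketch of the relevant pom.xml fragment follows; the profile id and property name here are illustrative, and the actual HBase pom may structure this differently:

```xml
<!-- Hypothetical sketch of the hadoop 0.23 profile bump. -->
<profile>
  <id>hadoop-0.23</id>
  <properties>
    <!-- 0.23.1-SNAPSHOT disappeared from the snapshot repo when 0.23.1
         was released; point at the next snapshot instead. -->
    <hadoop.version>0.23.2-SNAPSHOT</hadoop.version>
  </properties>
</profile>
```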
[jira] [Updated] (HBASE-5519) Incorrect warning in splitlogmanager
[ https://issues.apache.org/jira/browse/HBASE-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5519: - Attachment: 5519.92.txt Version for 0.92 branch Incorrect warning in splitlogmanager Key: HBASE-5519 URL: https://issues.apache.org/jira/browse/HBASE-5519 Project: HBase Issue Type: Improvement Reporter: Prakash Khemani Attachments: 0001-HBASE-5519-Incorrect-warning-in-splitlogmanager.patch, 5519.92.txt Because of recently added behavior, where the splitlogmanager timeout thread gets data from the zk node just to check that the zk node is there, we might have multiple watches firing without the task znode expiring. Remove the poor warning message. (Internally, there was an assert that failed in Mikhail's tests.) -- This message is automatically generated by JIRA.
[jira] [Resolved] (HBASE-5519) Incorrect warning in splitlogmanager
[ https://issues.apache.org/jira/browse/HBASE-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-5519. -- Resolution: Fixed Assignee: Prakash Khemani Hadoop Flags: Reviewed Committed to 0.92, 0.94 and to trunk. This is a usability thing. Incorrect warning in splitlogmanager Key: HBASE-5519 URL: https://issues.apache.org/jira/browse/HBASE-5519 -- This message is automatically generated by JIRA.
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223657#comment-13223657 ] nkeywal commented on HBASE-5399: @all: Ok, I will provide an updated patch with the comments taken into account. I also understood the issue behind TestZooKeeper. My fix currently breaks other unit tests, but if I manage to make it work I will include it. For 0.94 vs. 0.96, ok as well; however, it would make sense to deprecate some methods in 0.94, especially getMaster and getZooKeeper in HConnection: people will get more time to react this way. Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 -- This message is automatically generated by JIRA.
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223660#comment-13223660 ] stack commented on HBASE-5399: -- @nkeywal yes, agree, good to deprecate in 0.94 rather than 0.96 so more time to move off the old methods Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 -- This message is automatically generated by JIRA.
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223662#comment-13223662 ] stack commented on HBASE-5399: -- ... so it seems like there needs to be a separate patch for 0.94? Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 -- This message is automatically generated by JIRA.
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223664#comment-13223664 ] nkeywal commented on HBASE-5399: I can do a patch with only the deprecated stuff, or it can be added in a raw commit, as you like. Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 -- This message is automatically generated by JIRA.
[jira] [Commented] (HBASE-4542) add filter info to slow query logging
[ https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223670#comment-13223670 ] Phabricator commented on HBASE-4542: mbautin has commented on the revision [jira] [HBASE-4542] Add filter info to slow query logging. It looks like this never got committed into trunk. I will rebase the patch on the trunk changes. REVISION DETAIL https://reviews.facebook.net/D1539 add filter info to slow query logging - Key: HBASE-4542 URL: https://issues.apache.org/jira/browse/HBASE-4542 Project: HBase Issue Type: Improvement Affects Versions: 0.89.20100924 Reporter: Kannan Muthukkaruppan Assignee: Madhuwanti Vaidya Attachments: 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, D1263.2.patch, D1539.1.patch The slow query log doesn't report the filters in effect. For example:
{code}
(operationTooSlow): \
{processingtimems:3468,client:10.138.43.206:40035,timeRange:[0,9223372036854775807],\
starttimems:1317772005821,responsesize:42411, \
class:HRegionServer,table:myTable,families:{CF1:[ALL]},\
row:6c3b8efa132f0219b7621ed1e5c8c70b,queuetimems:0,\
method:get,totalColumns:1,maxVersions:1,storeLimit:-1}
{code}
The above would suggest that all columns of myTable:CF1 are being requested for the given row. But in reality there could be filters in effect (such as ColumnPrefixFilter, ColumnRangeFilter, TimestampsFilter, etc.). We should enhance the slow query log to capture and report this information. -- This message is automatically generated by JIRA.
[jira] [Commented] (HBASE-4542) add filter info to slow query logging
[ https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223685#comment-13223685 ] Phabricator commented on HBASE-4542: zhiqiu has commented on the revision [jira] [HBASE-4542] Add filter info to slow query logging. @mbautin Thanks a lot! REVISION DETAIL https://reviews.facebook.net/D1539 add filter info to slow query logging - Key: HBASE-4542 URL: https://issues.apache.org/jira/browse/HBASE-4542 -- This message is automatically generated by JIRA.
[jira] [Updated] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5515: -- Comment: was deleted (was: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517282/HBASE-5515.D2067.15.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause mvn compile goal to fail. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1117//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1117//console This message is automatically generated.) Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.16.patch, HBASE-5515.D2067.17.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It would be nice to have a pluggable API for this. -- This message is automatically generated by JIRA.
[jira] [Updated] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5515: --- Attachment: HBASE-5515.D2067.17.patch sc updated the revision HBASE-5515 [jira] Add a processRow API that supports atomic multiple reads and writes on a row. Reviewers: tedyu, dhruba, JIRA @Lars: They can be loaded dynamically from a jar on HDFS and per table, or statically and globally for all tables. This is really cool! I made a slight change to HRegion.processRow() so that it takes a RowProcessor interface (instead of the BaseRowProcessor class). Now people can still define their own CoprocessorProtocol that implements RowProcessor and have the ability to dynamically load this class as you described. I wrote a unit test, TestProcessRowWithCustomMadeProtocol.java, to verify that use case. But if this is cool with you, I would still like to keep ProcessRowProtocol and ProcessRowEndpoint because they are a convenient entry point for most users.
REVISION DETAIL https://reviews.facebook.net/D2067 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowEndpoint.java src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowProtocol.java src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowEndpoint.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowWithCustomMadeProtocol.java Add a processRow API that supports atomic multiple reads and writes on a row -- Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 -- This message is automatically generated by JIRA.
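The pluggable atomic read-modify-write hook being discussed can be sketched in miniature. The names here (RowProcessorSketch, ToyRegion) are invented for illustration, and the locking is collapsed to a single monitor; the real HRegion implementation is far more involved:

```java
import java.util.HashMap;
import java.util.Map;

// Toy version of a pluggable row processor: callers hand the region a
// callback that reads and writes one row, and the region runs it atomically.
interface RowProcessorSketch<T> {
    T process(Map<String, Long> row); // mutates the row in place, returns a result
}

class ToyRegion {
    private final Map<String, Map<String, Long>> rows = new HashMap<>();

    // A single monitor stands in for the per-row locking a real region
    // would use; the point is that process() runs as one atomic unit.
    public synchronized <T> T processRow(String rowKey, RowProcessorSketch<T> p) {
        Map<String, Long> row = rows.computeIfAbsent(rowKey, k -> new HashMap<>());
        return p.process(row);
    }
}
```

A caller can then express, say, an atomic counter increment as a lambda: `region.processRow("r1", row -> row.merge("cf:count", 1L, Long::sum))`, with no read-then-write race between clients.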
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223705#comment-13223705 ] stack commented on HBASE-5074: -- @Dhruba Try resubmitting your patch too. We regularly see three of these mr tests fail. Fixed in hadoop 1.0.2 apparently. support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, D1521.13.patch, D1521.13.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
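The description's point, that a read should fetch data and checksum together instead of paying a second disk iop for a separate checksum file, can be illustrated with a toy block layout that inlines a CRC next to the payload. This is a sketch of the general technique only, not the actual HFile or HDFS on-disk format:

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Toy inline-checksum block: [8-byte CRC32][payload]. One read returns
// both, so verification needs no second seek to a checksum file.
class InlineChecksumSketch {

    static byte[] writeBlock(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        ByteBuffer buf = ByteBuffer.allocate(data.length + 8);
        buf.putLong(crc.getValue()); // checksum stored alongside the data
        buf.put(data);
        return buf.array();
    }

    static byte[] readBlock(byte[] block) {
        ByteBuffer buf = ByteBuffer.wrap(block);
        long stored = buf.getLong();
        byte[] data = new byte[block.length - 8];
        buf.get(data);
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        if (crc.getValue() != stored) {
            throw new IllegalStateException("checksum mismatch");
        }
        return data;
    }
}
```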
[jira] [Commented] (HBASE-5522) hbase 0.92 test artifacts are missing from Maven central
[ https://issues.apache.org/jira/browse/HBASE-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223710#comment-13223710 ] Hudson commented on HBASE-5522: --- Integrated in HBase-0.94 #17 (See [https://builds.apache.org/job/HBase-0.94/17/]) HBASE-5522 hbase 0.92 test artifacts are missing from Maven central (Revision 1297682) Result = FAILURE stack : Files : * /hbase/branches/0.94/pom.xml hbase 0.92 test artifacts are missing from Maven central Key: HBASE-5522 URL: https://issues.apache.org/jira/browse/HBASE-5522 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.0 Reporter: Roman Shaposhnik Assignee: stack Fix For: 0.92.1, 0.94.0 Attachments: 5522.txt Could someone with enough karma, please, publish the test artifacts for 0.92.0?
[jira] [Commented] (HBASE-5519) Incorrect warning in splitlogmanager
[ https://issues.apache.org/jira/browse/HBASE-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223711#comment-13223711 ] Hudson commented on HBASE-5519: --- Integrated in HBase-0.94 #17 (See [https://builds.apache.org/job/HBase-0.94/17/]) HBASE-5519 Incorrect warning in splitlogmanager (Revision 1297708) Result = FAILURE stack : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java Incorrect warning in splitlogmanager Key: HBASE-5519 URL: https://issues.apache.org/jira/browse/HBASE-5519 Project: HBase Issue Type: Improvement Reporter: Prakash Khemani Assignee: Prakash Khemani Attachments: 0001-HBASE-5519-Incorrect-warning-in-splitlogmanager.patch, 5519.92.txt Because of recently added behavior - where the splitlogmanager timeout thread gets data from the zk node just to check that the zk node is there - we might have multiple watches firing without the task znode expiring. Remove the poor warning message. (Internally, there was an assert that failed in Mikhail's tests.)
[jira] [Updated] (HBASE-5533) Add more metrics to HBase
[ https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shaneal Manek updated HBASE-5533: - Attachment: hbase5533-0.92-v2.patch Re hadoop-metrics.properties: You are right - I didn't mean to include that change. Re BlockingQueue contention: I don't expect much contention since there is only one (non-blocking) 'offer()' per FS I/O (figure that, at the absolute worst with a large array of fast SSDs, we're still only doing a few tens of thousands of IOs/second). I've attached a microbenchmark showing the current technique works fine for up to at least a few million offers()/second under heavy contention (with 99th percentile latencies under 0.001 milliseconds). If you still suspect this might be troublesome, I'd be happy to stripe the accesses to reduce contention (we don't need a strict ordering). I think that's probably unnecessary complexity though. Re logging: Good point - changed it to not log more than once per 30 seconds. Re AtomicLong counters: Didn't know about the Cliff Click counters - I switched to using them. Add more metrics to HBase - Key: HBASE-5533 URL: https://issues.apache.org/jira/browse/HBASE-5533 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Priority: Minor Attachments: hbase-5533-0.92.patch, hbase5533-0.92-v2.patch To debug/monitor production clusters, there are some more metrics I wish I had available. In particular: - Although the average FS latencies are useful, a 'histogram' of recent latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) would be more useful - Similar histograms of latencies on common operations (GET, PUT, DELETE) would be useful - Counting the number of accesses to each region to detect hotspotting - Exposing the current number of HLog files
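The "histogram of recent latencies" metric requested in HBASE-5533 could look roughly like the sketch below: a bounded ring buffer of samples with nearest-rank percentiles computed on demand. This is only an illustration of the idea; the patch may implement it quite differently.

```java
import java.util.Arrays;

// Illustrative sketch of a recent-latency histogram: keep the last N samples
// in a ring buffer and report percentiles over them. Not the patch's code.
class LatencyHistogramSketch {
    private final long[] samples;
    private int count = 0;

    LatencyHistogramSketch(int capacity) {
        samples = new long[capacity];
    }

    // Record one latency; once full, overwrite the oldest slot (ring buffer).
    void update(long latencyMs) {
        samples[count % samples.length] = latencyMs;
        count++;
    }

    // Nearest-rank percentile (e.g. p = 90.0) over the recorded samples.
    long percentile(double p) {
        int n = Math.min(count, samples.length);
        if (n == 0) {
            return 0;
        }
        long[] sorted = Arrays.copyOf(samples, n);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * n);
        return sorted[Math.max(0, rank - 1)];
    }
}
```

With this shape a regionserver could report "90% of reads completed in under `percentile(90)` ms" rather than only an average, which is exactly the distinction the issue draws.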
[jira] [Updated] (HBASE-5533) Add more metrics to HBase
[ https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shaneal Manek updated HBASE-5533: - Attachment: BlockingQueueContention.java This is just the micro-benchmark to evaluate how ArrayBlockingQueues perform under contention. It seems to be good enough for our purposes. Add more metrics to HBase - Key: HBASE-5533 URL: https://issues.apache.org/jira/browse/HBASE-5533 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Priority: Minor Attachments: BlockingQueueContention.java, hbase-5533-0.92.patch, hbase5533-0.92-v2.patch To debug/monitor production clusters, there are some more metrics I wish I had available. In particular: - Although the average FS latencies are useful, a 'histogram' of recent latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) would be more useful - Similar histograms of latencies on common operations (GET, PUT, DELETE) would be useful - Counting the number of accesses to each region to detect hotspotting - Exposing the current number of HLog files
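The attachment itself is not reproduced here, but a contention microbenchmark of this shape would have several writer threads hammering a shared bounded queue with non-blocking offer() calls while a drainer empties it. Everything below, class and method names included, is a guess at the idea, not the attached BlockingQueueContention.java:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch of an offer() contention microbenchmark.
class OfferContentionSketch {
    // Returns elapsed nanoseconds for `threads` writers each performing
    // `offersPerThread` non-blocking offer() calls against a shared queue.
    static long run(int threads, int offersPerThread) {
        BlockingQueue<Long> queue = new ArrayBlockingQueue<>(1024);
        // A daemon drainer empties the queue so offer() usually finds space.
        Thread drainer = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                queue.poll();
            }
        });
        drainer.setDaemon(true);
        drainer.start();

        CountDownLatch done = new CountDownLatch(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            new Thread(() -> {
                for (int i = 0; i < offersPerThread; i++) {
                    queue.offer(System.nanoTime()); // drops on a full queue
                }
                done.countDown();
            }).start();
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        drainer.interrupt();
        return System.nanoTime() - start;
    }
}
```

Because offer() never blocks, each FS I/O pays at most one CAS-style queue operation, which is why the comment above expects the cost to stay negligible even at tens of thousands of I/Os per second.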
[jira] [Commented] (HBASE-5521) Move compression/decompression to an encoder specific encoding context
[ https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223723#comment-13223723 ] Phabricator commented on HBASE-5521: heyongqiang has commented on the revision HBASE-5521 [jira] Move compression/decompression to an encoder specific encoding context. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java:53 I think here it is DataBlockEncoder (prefix, diff etc), and not HFileDataBlockEncoder. From an encoding you can get the encoder, but from an encoder you cannot get the encoding. The class field encoding is not needed, will remove it. REVISION DETAIL https://reviews.facebook.net/D2097 Move compression/decompression to an encoder specific encoding context -- Key: HBASE-5521 URL: https://issues.apache.org/jira/browse/HBASE-5521 Project: HBase Issue Type: Improvement Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.2.patch As part of working on HBASE-5313, we want to add a new columnar encoder/decoder. It makes sense to move compression to be part of encoder/decoder: 1) a scanner for a columnar encoded block can do lazy decompression to a specific part of a key value object 2) avoid an extra bytes copy from encoder to hblock-writer. If there is no encoder specified for a writer, the HBlock.Writer will use a default compression-context to do something very similar to today's code.
[jira] [Commented] (HBASE-5533) Add more metrics to HBase
[ https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223727#comment-13223727 ] Hadoop QA commented on HBASE-5533: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517316/BlockingQueueContention.java against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1120//console This message is automatically generated. Add more metrics to HBase - Key: HBASE-5533 URL: https://issues.apache.org/jira/browse/HBASE-5533 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Priority: Minor Attachments: BlockingQueueContention.java, hbase-5533-0.92.patch, hbase5533-0.92-v2.patch To debug/monitor production clusters, there are some more metrics I wish I had available. In particular: - Although the average FS latencies are useful, a 'histogram' of recent latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) would be more useful - Similar histograms of latencies on common operations (GET, PUT, DELETE) would be useful - Counting the number of accesses to each region to detect hotspotting - Exposing the current number of HLog files
[jira] [Updated] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4608: -- Comment: was deleted (was: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517160/4608v17.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -127 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 155 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.wal.TestWALReplayCompressed org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1103//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1103//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1103//console This message is automatically generated.) HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Fix For: 0.94.0 Attachments: 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. 
Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog.
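The dictionary idea can be illustrated with a minimal sketch: repeated strings such as table names and column family names map to small integer indices, so after the first occurrence the WAL stream can carry the index instead of the full value. This is only a toy model of the plan above, not the HBASE-4608 implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of WAL dictionary compression: not the HBASE-4608 code.
class WalDictionarySketch {
    private final Map<String, Integer> toIndex = new HashMap<>();
    private final Map<Integer, String> toValue = new HashMap<>();

    // Returns the existing index for a value, or assigns the next free one.
    int encode(String value) {
        Integer idx = toIndex.get(value);
        if (idx == null) {
            idx = toIndex.size();
            toIndex.put(value, idx);
            toValue.put(idx, value);
        }
        return idx;
    }

    // Maps an index back to the original value on the read path.
    String decode(int index) {
        return toValue.get(index);
    }
}
```

Since every edit in a WAL repeats the same table name and column family, replacing those strings with a couple of bytes of index shrinks each append, which is exactly where replication across datanodes pays per byte.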
[jira] [Updated] (HBASE-5521) Move compression/decompression to an encoder specific encoding context
[ https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5521: --- Attachment: HBASE-5521.D2097.3.patch heyongqiang updated the revision HBASE-5521 [jira] Move compression/decompression to an encoder specific encoding context. Reviewers: JIRA, dhruba, tedyu, sc, mbautin add mbautin's comments, and changed all Cxt to Ctx REVISION DETAIL https://reviews.facebook.net/D2097 AFFECTED FILES src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultEncodingContext.java src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDecodingContext.java src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultDecodingContext.java src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockEncodingContext.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java 
src/main/java/org/apache/hadoop/hbase/io/hfile/Compression.java src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java Move compression/decompression to an encoder specific encoding context -- Key: HBASE-5521 URL: https://issues.apache.org/jira/browse/HBASE-5521 Project: HBase Issue Type: Improvement Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.2.patch, HBASE-5521.D2097.3.patch As part of working on HBASE-5313, we want to add a new columnar encoder/decoder. It makes sense to move compression to be part of encoder/decoder: 1) a scanner for a columnar encoded block can do lazy decompression to a specific part of a key value object 2) avoid an extra bytes copy from encoder to hblock-writer. If there is no encoder specified for a writer, the HBlock.Writer will use a default compression-context to do something very similar to today's code.
[jira] [Updated] (HBASE-4542) add filter info to slow query logging
[ https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-4542: -- Attachment: Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch Rebasing patch on trunk add filter info to slow query logging - Key: HBASE-4542 URL: https://issues.apache.org/jira/browse/HBASE-4542 Project: HBase Issue Type: Improvement Affects Versions: 0.89.20100924 Reporter: Kannan Muthukkaruppan Assignee: Madhuwanti Vaidya Attachments: 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, D1263.2.patch, D1539.1.patch Slow query log doesn't report filters in effect. For example: {code} (operationTooSlow): \ {processingtimems:3468,client:10.138.43.206:40035,timeRange: [0,9223372036854775807],\ starttimems:1317772005821,responsesize:42411, \ class:HRegionServer,table:myTable,families:{CF1:ALL]},\ row:6c3b8efa132f0219b7621ed1e5c8c70b,queuetimems:0,\ method:get,totalColumns:1,maxVersions:1,storeLimit:-1} {code} The above would suggest that all columns of myTable:CF1 are being requested for the given row. But in reality there could be filters in effect (such as ColumnPrefixFilter, ColumnRangeFilter, TimestampsFilter() etc.). We should enhance the slow query log to capture and report this information.
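What the requested change amounts to can be sketched as follows; the method name (toSlowLogMap) and the "filter" key are assumptions for illustration, not HBase's actual logging code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hedged sketch of the HBASE-4542 idea: alongside the existing fields in the
// (operationTooSlow) JSON map, record a human-readable filter summary.
class SlowLogSketch {
    static Map<String, Object> toSlowLogMap(String table, String row,
                                            long processingTimeMs,
                                            String filterSummary) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("class", "HRegionServer");
        m.put("table", table);
        m.put("row", row);
        m.put("processingtimems", processingTimeMs);
        // The new piece: without this field, the log suggests all columns were
        // read even when e.g. a ColumnPrefixFilter restricted the request.
        if (filterSummary != null) {
            m.put("filter", filterSummary);
        }
        return m;
    }
}
```

With a "filter" entry present, a reader of the slow query log can tell a full-family get from one narrowed by ColumnPrefixFilter or TimestampsFilter at a glance.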
[jira] [Updated] (HBASE-5521) Move compression/decompression to an encoder specific encoding context
[ https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5521: -- Hadoop Flags: Reviewed Status: Patch Available (was: Open) Move compression/decompression to an encoder specific encoding context -- Key: HBASE-5521 URL: https://issues.apache.org/jira/browse/HBASE-5521 Project: HBase Issue Type: Improvement Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.2.patch, HBASE-5521.D2097.3.patch As part of working on HBASE-5313, we want to add a new columnar encoder/decoder. It makes sense to move compression to be part of encoder/decoder: 1) a scanner for a columnar encoded block can do lazy decompression to a specific part of a key value object 2) avoid an extra bytes copy from encoder to hblock-writer. If there is no encoder specified for a writer, the HBlock.Writer will use a default compression-context to do something very similar to today's code.
[jira] [Updated] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5515: --- Attachment: HBASE-5515.D2067.18.patch sc updated the revision HBASE-5515 [jira] Add a processRow API that supports atomic multiple reads and writes on a row. Reviewers: tedyu, dhruba, JIRA Remove some repeated codes in unit tests REVISION DETAIL https://reviews.facebook.net/D2067 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowEndpoint.java src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowProtocol.java src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowEndpoint.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowWithCustomMadeProtocol.java Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.16.patch, HBASE-5515.D2067.17.patch, HBASE-5515.D2067.18.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It would be nice to have a pluggable API for this.
[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223750#comment-13223750 ] Hadoop QA commented on HBASE-5515: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517313/HBASE-5515.D2067.17.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -129 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 155 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1119//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1119//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1119//console This message is automatically generated. 
Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.16.patch, HBASE-5515.D2067.17.patch, HBASE-5515.D2067.18.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It would be nice to have a pluggable API for this.
[jira] [Commented] (HBASE-4542) add filter info to slow query logging
[ https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223755#comment-13223755 ] Mikhail Bautin commented on HBASE-4542: --- All unit tests passed locally. Waiting for Hadoop QA before committing. add filter info to slow query logging - Key: HBASE-4542 URL: https://issues.apache.org/jira/browse/HBASE-4542 Project: HBase Issue Type: Improvement Affects Versions: 0.89.20100924 Reporter: Kannan Muthukkaruppan Assignee: Madhuwanti Vaidya Attachments: 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, D1263.2.patch, D1539.1.patch Slow query log doesn't report filters in effect. For example: {code} (operationTooSlow): \ {processingtimems:3468,client:10.138.43.206:40035,timeRange: [0,9223372036854775807],\ starttimems:1317772005821,responsesize:42411, \ class:HRegionServer,table:myTable,families:{CF1:ALL]},\ row:6c3b8efa132f0219b7621ed1e5c8c70b,queuetimems:0,\ method:get,totalColumns:1,maxVersions:1,storeLimit:-1} {code} The above would suggest that all columns of myTable:CF1 are being requested for the given row. But in reality there could be filters in effect (such as ColumnPrefixFilter, ColumnRangeFilter, TimestampsFilter() etc.). We should enhance the slow query log to capture and report this information.
[jira] [Assigned] (HBASE-5213) hbase master stop does not bring down backup masters
[ https://issues.apache.org/jira/browse/HBASE-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregory Chanan reassigned HBASE-5213: - Assignee: Gregory Chanan hbase master stop does not bring down backup masters -- Key: HBASE-5213 URL: https://issues.apache.org/jira/browse/HBASE-5213 Project: HBase Issue Type: Bug Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Minor Typing hbase master stop produces the following message: stop Start cluster shutdown; Master signals RegionServer shutdown It seems like backup masters should be considered part of the cluster, but they are not brought down by hbase master stop. stop-hbase.sh does correctly bring down the backup masters. The same behavior is observed when a client app makes use of the client API HBaseAdmin.shutdown() http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#shutdown() -- this isn't too surprising since I think hbase master stop just calls this API. It seems like HBASE-1448 addressed this; perhaps there was a regression?
[jira] [Commented] (HBASE-4542) add filter info to slow query logging
[ https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223785#comment-13223785 ] Hadoop QA commented on HBASE-4542: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517322/Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -129 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 154 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestImportTsv Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1121//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1121//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1121//console This message is automatically generated. add filter info to slow query logging - Key: HBASE-4542 URL: https://issues.apache.org/jira/browse/HBASE-4542 Project: HBase Issue Type: Improvement Affects Versions: 0.89.20100924 Reporter: Kannan Muthukkaruppan Assignee: Madhuwanti Vaidya Attachments: 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, D1263.2.patch, D1539.1.patch Slow query log doesn't report filters in effect. 
For example: {code} (operationTooSlow): \ {processingtimems:3468,client:10.138.43.206:40035,timeRange: [0,9223372036854775807],\ starttimems:1317772005821,responsesize:42411, \ class:HRegionServer,table:myTable,families:{CF1:ALL]},\ row:6c3b8efa132f0219b7621ed1e5c8c70b,queuetimems:0,\ method:get,totalColumns:1,maxVersions:1,storeLimit:-1} {code} The above would suggest that all columns of myTable:CF1 are being requested for the given row. But in reality there could be filters in effect (such as ColumnPrefixFilter, ColumnRangeFilter, TimestampsFilter() etc.). We should enhance the slow query log to capture and report this information.
[jira] [Created] (HBASE-5534) HBase shell's return value is almost always 0
HBase shell's return value is almost always 0 - Key: HBASE-5534 URL: https://issues.apache.org/jira/browse/HBASE-5534 Project: HBase Issue Type: Improvement Reporter: Alex Newman Assignee: Alex Newman So I was trying to write some simple scripts to verify client connections to HBase using the shell and I noticed that the HBase shell always returns 0 even when it can't connect to an HBase server. I'm not sure if this is the best option. What would be neat is if you had some capability to run commands like hbase shell --command='disable table;\ndrop table;' and it would error out if any of the commands fail to succeed. echo disable table | hbase shell could continue to work as it does now.
[jira] [Commented] (HBASE-5521) Move compression/decompression to an encoder specific encoding context
[ https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223795#comment-13223795 ] Hadoop QA commented on HBASE-5521: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517319/HBASE-5521.D2097.3.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 12 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -125 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 158 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.io.encoding.TestDataBlockEncoders Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1122//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1122//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1122//console This message is automatically generated. Move compression/decompression to an encoder specific encoding context -- Key: HBASE-5521 URL: https://issues.apache.org/jira/browse/HBASE-5521 Project: HBase Issue Type: Improvement Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.2.patch, HBASE-5521.D2097.3.patch As part of working on HBASE-5313, we want to add a new columnar encoder/decoder. It makes sense to move compression to be part of encoder/decoder: 1) a scanner for a columnar encoded block can do lazy decompression to a specific part of a key value object 2) avoid an extra bytes copy from encoder to hblock-writer. 
If there is no encoder specified for a writer, the HBlock.Writer will use a default compression context to do something very similar to today's code.
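The idea in the description can be sketched roughly as follows. This is an illustrative Java sketch only, not the API from the patch: the names `EncodingContextSketch` and `EncodingContext` are hypothetical, and DEFLATE stands in for whatever codec the block is actually configured with. The point is that the encoder owns the compression state, so the block writer hands bytes to the context once instead of compressing (and copying) them itself.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

// Hypothetical sketch: these names are not the actual HBase classes.
public class EncodingContextSketch {

    /** An encoder-owned context holding compression state, so the block
     *  writer receives the final compressed block straight from the
     *  encoder with no intermediate copy of its own. */
    static class EncodingContext {
        private final Deflater deflater = new Deflater(Deflater.BEST_SPEED);

        byte[] compress(byte[] encodedBlock) {
            deflater.reset();                 // context is reusable per block
            deflater.setInput(encodedBlock);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            while (!deflater.finished()) {    // drain until the codec is done
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) {
        EncodingContext ctx = new EncodingContext();
        byte[] block = "row1/cf:qual/value ".repeat(200).getBytes();
        byte[] compressed = ctx.compress(block);
        System.out.println("raw=" + block.length + " compressed=" + compressed.length);
    }
}
```

A real decoder would hold a matching decompression context, which is what lets a columnar scanner decompress lazily down to just the slice of the block it needs.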
[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223800#comment-13223800 ]

Hadoop QA commented on HBASE-5515:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12517326/HBASE-5515.D2067.18.patch
  against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 4 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -129 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 155 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1123//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1123//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1123//console

This message is automatically generated.
Add a processRow API that supports atomic multiple reads and writes on a row
----------------------------------------------------------------------------

Key: HBASE-5515
URL: https://issues.apache.org/jira/browse/HBASE-5515
Project: HBase
Issue Type: New Feature
Reporter: Scott Chen
Assignee: Scott Chen
Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.16.patch, HBASE-5515.D2067.17.patch, HBASE-5515.D2067.18.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch

We have modified HRegion.java internally to do some atomic row processing. It would be nice to have a pluggable API for this.
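The kind of pluggable hook the issue asks for can be sketched like this. All names here (`ProcessRowSketch`, `processRow`) are hypothetical, and a single `ReentrantLock` plus an in-memory map stand in for HBase's real per-row locks and store. The idea is only that a caller-supplied processor runs its reads and writes while the row lock is held, so the whole read-modify-write step appears atomic to other callers on the same row.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Hypothetical sketch of a processRow-style pluggable API; not HBase code.
public class ProcessRowSketch {
    private final Map<String, Map<String, Long>> table = new HashMap<>();
    private final ReentrantLock rowLock = new ReentrantLock(); // one lock per row in a real region

    /** Runs the processor with the row lock held, so its reads and
     *  writes on the row's cells appear atomic to concurrent callers. */
    public <T> T processRow(String row, Function<Map<String, Long>, T> processor) {
        rowLock.lock();
        try {
            Map<String, Long> cells = table.computeIfAbsent(row, r -> new HashMap<>());
            return processor.apply(cells);
        } finally {
            rowLock.unlock();
        }
    }

    public static void main(String[] args) {
        ProcessRowSketch region = new ProcessRowSketch();
        // Read-modify-write two counters in one atomic step.
        long total = region.processRow("row1", cells -> {
            long a = cells.merge("cf:a", 1L, Long::sum);
            long b = cells.merge("cf:b", 2L, Long::sum);
            return a + b;
        });
        System.out.println("total=" + total); // 3 on the first call
    }
}
```

With a lock-free get API, two separate increment calls could interleave; wrapping both in one processor closure is what makes the multi-cell update atomic.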
[jira] [Updated] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-4608:
------------------------------

Attachment: 4608-v19.txt

Patch v19 from review board.

HLog Compression
----------------

Key: HBASE-4608
URL: https://issues.apache.org/jira/browse/HBASE-4608
Project: HBase
Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi
Fix For: 0.94.0
Attachments: 4608-v19.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt

The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. The current plan involves using a dictionary to compress the table name, region id, column family name, and possibly other bits of repeated data. The HLog format may also be changed in other ways to produce a smaller HLog.
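The dictionary idea in the description can be sketched as follows. This is an assumption-laden illustration, not the HLog code: `WalDictionarySketch` and its methods are made-up names. Repeated values such as the table name or column family are written in full only once; every later occurrence is replaced by a small integer index, and the reader rebuilds the same dictionary in the same order as it replays the log.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of dictionary compression for repeated WAL fields.
public class WalDictionarySketch {
    private final Map<String, Integer> toIndex = new HashMap<>();
    private final List<String> toValue = new ArrayList<>();

    /** First sighting stores the value and returns a fresh index; later
     *  sightings reuse the existing index, so only the small integer
     *  needs to go on the wire. */
    public int writeEntry(String value) {
        Integer idx = toIndex.get(value);
        if (idx != null) {
            return idx;               // emit just the index, not the bytes
        }
        int newIdx = toValue.size();  // emit the full value plus this index
        toIndex.put(value, newIdx);
        toValue.add(value);
        return newIdx;
    }

    /** The reader, having seen entries in the same order, resolves an
     *  index back to the original value. */
    public String readEntry(int index) {
        return toValue.get(index);
    }

    public static void main(String[] args) {
        WalDictionarySketch dict = new WalDictionarySketch();
        int t1 = dict.writeEntry("usertable"); // full bytes written once
        int t2 = dict.writeEntry("usertable"); // only the index afterwards
        System.out.println(t1 == t2);          // true: same dictionary slot
    }
}
```

Since a WAL entry repeats the table name, region id, and family on every append, replacing each with a one- or two-byte index shrinks the bytes that must be replicated across datanodes, which is the bottleneck the issue describes.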