[jira] [Updated] (HBASE-6049) Serializing List containing null elements will cause NullPointerException in HbaseObjectWritable.writeObject()

2012-05-21 Thread Maryann Xue (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maryann Xue updated HBASE-6049:
---

Attachment: HBASE-6049-v2.patch

@Zhihong I updated the patch with modifications to the test case. How does this 
look?

 Serializing List containing null elements will cause NullPointerException 
 in HbaseObjectWritable.writeObject()
 

 Key: HBASE-6049
 URL: https://issues.apache.org/jira/browse/HBASE-6049
 Project: HBase
  Issue Type: Bug
  Components: io
Affects Versions: 0.94.0
Reporter: Maryann Xue
 Attachments: HBASE-6049-v2.patch, HBASE-6049.patch


 One error case is in the coprocessor AggregationClient: the median() 
 function handles an empty region by returning a List whose first 
 element is null. The NPE then occurs in the RPC response stage, and the 
 response never gets sent.
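The failure mode and a possible fix can be sketched outside HBase. This is a hypothetical, minimal illustration of the presence-flag pattern for serializing lists that may contain nulls; NullSafeListWriter and its methods are made-up names, not the actual HbaseObjectWritable code:

```java
// Hypothetical sketch (not the real HbaseObjectWritable): write a presence
// flag before each element so null entries survive the round trip instead
// of triggering a NullPointerException during serialization.
import java.io.*;
import java.util.*;

class NullSafeListWriter {
    static void writeList(DataOutput out, List<String> list) throws IOException {
        out.writeInt(list.size());
        for (String e : list) {
            out.writeBoolean(e != null);      // presence flag
            if (e != null) out.writeUTF(e);   // only non-null payloads are written
        }
    }

    static List<String> readList(DataInput in) throws IOException {
        int n = in.readInt();
        List<String> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            list.add(in.readBoolean() ? in.readUTF() : null);
        }
        return list;
    }
}
```

With this pattern, a list whose first element is null (the shape median() returns for an empty region) round-trips cleanly instead of throwing.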

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

2012-05-21 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-6059:


Attachment: HBASE-6059-testcase.patch

I have written a test case to reproduce the issue.

 Replaying recovered edits would make deleted data exist again
 -

 Key: HBASE-6059
 URL: https://issues.apache.org/jira/browse/HBASE-6059
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6059-testcase.patch


 When we replay recovered edits, we use the minSeqId of the Store. This may 
 cause deleted data to appear again.
 Let's see how it happens. Suppose a region with two families (cf1, cf2):
 1. Put one row to the region (put r1,cf1:q1,v1).
 2. Move the region from server A to server B.
 3. Delete the data put in step 1 (delete r1).
 4. Flush this region.
 5. Run a major compaction on this region.
 6. Move the region from server B to server A.
 7. Abort server A.
 8. After the region comes online, we can get the deleted data (r1,cf1:q1,v1) 
 back.
 (When we replay recovered edits, we use the minSeqId of the Store; because 
 cf2 has no store files, its seqId is 0, so the put in the edit log is 
 replayed into the region.)





[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

2012-05-21 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-6059:


Attachment: HBASE-6059.patch

In the solution patch, I use a Map<byte[], Long> maxSeqIdInStores to save each 
store's maxSeqId.
So, when replaying edit logs, we skip the edits for each store according 
to its own maxSeqId.
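The per-store skip described above can be sketched as follows. This is a simplified illustration with made-up names, not the actual HRegion replay code; family names are plain strings here instead of byte[]:

```java
// Illustrative sketch of per-store replay filtering: each store remembers
// its own maxSeqId, and a recovered edit is replayed only if it is newer
// than what that store has already persisted.
import java.util.Map;

class PerStoreReplay {
    // maxSeqIdInStores: family -> highest seqId already persisted in that store
    static boolean shouldReplay(Map<String, Long> maxSeqIdInStores,
                                String family, long editSeqId) {
        long maxSeqId = maxSeqIdInStores.getOrDefault(family, -1L);
        return editSeqId > maxSeqId;  // skip edits the store already contains
    }
}
```

In the scenario from the description, cf1's store would carry the seqId of the flushed delete, so the old put (with a smaller seqId) is skipped even though cf2 has no store files.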

 Replaying recovered edits would make deleted data exist again
 -

 Key: HBASE-6059
 URL: https://issues.apache.org/jira/browse/HBASE-6059
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch







[jira] [Commented] (HBASE-6049) Serializing List containing null elements will cause NullPointerException in HbaseObjectWritable.writeObject()

2012-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280038#comment-13280038
 ] 

Hadoop QA commented on HBASE-6049:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528398/HBASE-6049-v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 33 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.io.TestHbaseObjectWritable

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1943//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1943//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1943//console

This message is automatically generated.

 Serializing List containing null elements will cause NullPointerException 
 in HbaseObjectWritable.writeObject()
 

 Key: HBASE-6049
 URL: https://issues.apache.org/jira/browse/HBASE-6049
 Project: HBase
  Issue Type: Bug
  Components: io
Affects Versions: 0.94.0
Reporter: Maryann Xue
 Attachments: HBASE-6049-v2.patch, HBASE-6049.patch







[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

2012-05-21 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280077#comment-13280077
 ] 

ramkrishna.s.vasudevan commented on HBASE-6059:
---

@Chunhui
This is a damn good one. But I still see one problem here, similar to the 
one you reported. Please correct me if I am wrong.
In the same test case, at the point where you delete the row 'r1', suppose I 
also delete the row 'r2':
{code}
del = new Delete(Bytes.toBytes(r));
htable.delete(del);
resultScanner = htable.getScanner(new Scan());
count = 0;
while (resultScanner.next() != null) {
  count++;
}
{code}
Now the seqId from the store files will be 0, since there is nothing left to 
write after the major compaction, so the same problem still occurs. I 
simulated this with the same test case that you added.
Maybe we need some other way to know that an edit has been deleted by a 
major compaction? Because as far as I can see, without major compaction 
there is no issue at all.

 Replaying recovered edits would make deleted data exist again
 -

 Key: HBASE-6059
 URL: https://issues.apache.org/jira/browse/HBASE-6059
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch







[jira] [Comment Edited] (HBASE-6059) Replaying recovered edits would make deleted data exist again

2012-05-21 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280077#comment-13280077
 ] 

ramkrishna.s.vasudevan edited comment on HBASE-6059 at 5/21/12 10:59 AM:
-

@Chunhui
This is a damn good one. But I still see one problem here, similar to the 
one you reported. Please correct me if I am wrong.
In the same test case, at the point where you delete the row 'r1', suppose I 
also delete the row 'r':
{code}
del = new Delete(Bytes.toBytes(r));
htable.delete(del);
resultScanner = htable.getScanner(new Scan());
count = 0;
while (resultScanner.next() != null) {
  count++;
}
{code}
Now the seqId from the store files will be 0, since there is nothing left to 
write after the major compaction, so the same problem still occurs. I 
simulated this with the same test case that you added.
Maybe we need some other way to know that an edit has been deleted by a 
major compaction? Because as far as I can see, without major compaction 
there is no issue at all.

 Replaying recovered edits would make deleted data exist again
 -

 Key: HBASE-6059
 URL: https://issues.apache.org/jira/browse/HBASE-6059
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch







[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

2012-05-21 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280101#comment-13280101
 ] 

chunhui shen commented on HBASE-6059:
-

@ram
Yes, I have also considered the case where all the entries in the store file 
are deleted and we don't write any new store file.
But could we generate an empty store file with its metadata alone? Let me 
give it a try first.
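The idea of an empty store file that still carries metadata can be sketched like this. It is a toy format with made-up names, not HFile's real layout:

```java
// Toy illustration: even when a major compaction deletes every cell, an
// "empty" store file can still record metadata such as the max sequence id,
// so edit-log replay knows which edits are already obsolete.
import java.io.*;

class EmptyStoreFile {
    static void writeMeta(DataOutput out, long maxSeqId) throws IOException {
        out.writeInt(0);          // zero data entries in this file
        out.writeLong(maxSeqId);  // metadata survives even with no cells
    }

    static long readMaxSeqId(DataInput in) throws IOException {
        in.readInt();             // skip the (empty) data section
        return in.readLong();
    }
}
```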

 Replaying recovered edits would make deleted data exist again
 -

 Key: HBASE-6059
 URL: https://issues.apache.org/jira/browse/HBASE-6059
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch







[jira] [Commented] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280157#comment-13280157
 ] 

Zhihong Yu commented on HBASE-5757:
---

@Jan:
Neither patch applies to trunk as of today.
Can you attach a patch for trunk and name it accordingly?

Thanks

 TableInputFormat should handle as many errors as possible
 -

 Key: HBASE-5757
 URL: https://issues.apache.org/jira/browse/HBASE-5757
 Project: HBase
  Issue Type: Bug
  Components: mapred, mapreduce
Affects Versions: 0.90.6
Reporter: Jan Lukavsky
 Attachments: HBASE-5757.patch, HBASE-5757.patch


 Prior to HBASE-4196 there was different handling of IOExceptions thrown from 
 the scanner in the mapred and mapreduce APIs. The patch for HBASE-4196 
 unified this handling so that if an exception is caught, a reconnect is 
 attempted (without bothering the mapred client). After that, HBASE-4269 
 changed this behavior back, but in both the mapred and mapreduce APIs. The 
 question is: is there any reason not to handle all errors that the input 
 format can handle? In other words, why not try to reissue the request after 
 *any* IOException? I see the following disadvantages of the current approach:
  * the client may see exceptions like LeaseException and 
 ScannerTimeoutException if it fails to process all fetched data within the 
 timeout
  * to avoid ScannerTimeoutException the client must raise 
 hbase.regionserver.lease.period
  * timeouts for tasks are already configured in mapred.task.timeout, so this 
 seems a bit redundant, because typically one needs to update both of these 
 parameters
  * I don't see any way to get rid of LeaseException (this is configured on 
 the server side)
 I think all of these issues would be gone if the DoNotRetryIOException were 
 not rethrown. -On the other hand, handling errors in the InputFormat has 
 the disadvantage that it may hide some inefficiency from the user. E.g. if I 
 have a very big scanner.caching and I manage to process only a few rows 
 within the timeout, I will end up with a single row being fetched many times 
 (and will not be explicitly notified about this). Could we solve this 
 problem by adding some counter to the InputFormat?-
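The retry-on-any-IOException idea can be sketched with simplified stand-ins for the scanner types. RetryingReader, ScannerFactory, and Scanner are hypothetical names, not the actual TableRecordReader API:

```java
// Illustrative sketch: on any IOException from the scanner, reopen it at
// the last successfully returned row instead of surfacing the error to
// the mapred client.
import java.io.IOException;

class RetryingReader {
    interface Scanner { String next() throws IOException; }
    interface ScannerFactory { Scanner open(String startRow) throws IOException; }

    private final ScannerFactory factory;
    private Scanner scanner;
    private String lastRow = "";

    RetryingReader(ScannerFactory factory) throws IOException {
        this.factory = factory;
        this.scanner = factory.open(lastRow);
    }

    String nextRow() throws IOException {
        try {
            return remember(scanner.next());
        } catch (IOException e) {
            // e.g. a lease or scanner timeout: reopen and retry once
            scanner = factory.open(lastRow);
            return remember(scanner.next());
        }
    }

    private String remember(String row) {
        if (row != null) lastRow = row;
        return row;
    }
}
```

Whether the retry should be bounded, and whether a counter should expose repeated refetches (as the struck-through paragraph asks), remain open design questions.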





[jira] [Updated] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Jan Lukavsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Lukavsky updated HBASE-5757:


Attachment: HBASE-5757-trunk-r1341041.patch

There was a conflicting commit from the HBASE-6004 patch. I merged this patch 
accordingly; the new one should apply to revision 1341041.

 TableInputFormat should handle as many errors as possible
 -

 Key: HBASE-5757
 URL: https://issues.apache.org/jira/browse/HBASE-5757
 Project: HBase
  Issue Type: Bug
  Components: mapred, mapreduce
Affects Versions: 0.90.6
Reporter: Jan Lukavsky
 Attachments: HBASE-5757-trunk-r1341041.patch, HBASE-5757.patch, 
 HBASE-5757.patch







[jira] [Updated] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Zhihong Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5757:
--

Status: Patch Available  (was: Open)

 TableInputFormat should handle as many errors as possible
 -

 Key: HBASE-5757
 URL: https://issues.apache.org/jira/browse/HBASE-5757
 Project: HBase
  Issue Type: Bug
  Components: mapred, mapreduce
Affects Versions: 0.90.6
Reporter: Jan Lukavsky
 Attachments: HBASE-5757-trunk-r1341041.patch, HBASE-5757.patch, 
 HBASE-5757.patch







[jira] [Commented] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280240#comment-13280240
 ] 

Hadoop QA commented on HBASE-5757:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12528434/HBASE-5757-trunk-r1341041.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 33 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.coprocessor.TestClassLoading
  org.apache.hadoop.hbase.replication.TestReplication
  
org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
  org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
  org.apache.hadoop.hbase.replication.TestMasterReplication

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1944//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1944//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1944//console

This message is automatically generated.

 TableInputFormat should handle as many errors as possible
 -

 Key: HBASE-5757
 URL: https://issues.apache.org/jira/browse/HBASE-5757
 Project: HBase
  Issue Type: Bug
  Components: mapred, mapreduce
Affects Versions: 0.90.6
Reporter: Jan Lukavsky
 Attachments: HBASE-5757-trunk-r1341041.patch, HBASE-5757.patch, 
 HBASE-5757.patch







[jira] [Commented] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280249#comment-13280249
 ] 

Zhihong Yu commented on HBASE-5757:
---

I ran the following two tests and they passed with the latest patch:
{code}
mt -Dtest=TestClassLoading
mt -Dtest=TestSplitTransactionOnCluster
{code}
The replication tests have been failing and are not related to this change.

Minor comments:
{code}
+// try to handle exceptions all possible exceptions by restarting
{code}
The first 'exceptions ' should be removed.

 TableInputFormat should handle as many errors as possible
 -

 Key: HBASE-5757
 URL: https://issues.apache.org/jira/browse/HBASE-5757
 Project: HBase
  Issue Type: Bug
  Components: mapred, mapreduce
Affects Versions: 0.90.6
Reporter: Jan Lukavsky
 Attachments: HBASE-5757-trunk-r1341041.patch, HBASE-5757.patch, 
 HBASE-5757.patch







[jira] [Resolved] (HBASE-5882) Process RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor

2012-05-21 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5882.
---

Resolution: Fixed

Committed the patch. Hence resolving this.

 Process RIT on master restart can try assigning the region if the region is 
 found on a dead server instead of waiting for Timeout Monitor
 -

 Key: HBASE-5882
 URL: https://issues.apache.org/jira/browse/HBASE-5882
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.6, 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: Ashutosh Jindal
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5882_v5.patch, HBASE-5882_v6.patch, 
 hbase_5882.patch, hbase_5882_V2.patch, hbase_5882_V3.patch, 
 hbase_5882_V4.patch


 Currently, on master restart, when processing regions in transition (RIT), 
 any region found on a dead server avoids a new assignment so that the 
 timeout monitor can take care of it.
 This case is more prominent if the node is found in the RS_ZK_REGION_OPENING 
 state. I think we can handle this by triggering a new assignment with a new 
 plan.





[jira] [Commented] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280285#comment-13280285
 ] 

Hadoop QA commented on HBASE-5757:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528448/5757-trunk-v2.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 33 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.coprocessor.TestClassLoading
  org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
  org.apache.hadoop.hbase.regionserver.wal.TestHLog
  org.apache.hadoop.hbase.replication.TestMasterReplication

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1945//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1945//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1945//console

This message is automatically generated.

 TableInputFormat should handle as many errors as possible
 -

 Key: HBASE-5757
 URL: https://issues.apache.org/jira/browse/HBASE-5757
 Project: HBase
  Issue Type: Bug
  Components: mapred, mapreduce
Affects Versions: 0.90.6
Reporter: Jan Lukavsky
 Attachments: 5757-trunk-v2.txt, HBASE-5757-trunk-r1341041.patch, 
 HBASE-5757.patch, HBASE-5757.patch


 Prior to HBASE-4196 the mapred and mapreduce APIs handled IOExceptions 
 thrown from the scanner differently. The patch for HBASE-4196 unified this 
 handling so that if an exception is caught, a reconnect is attempted (without 
 bothering the mapred client). After that, HBASE-4269 changed this behavior 
 back, in both the mapred and mapreduce APIs. The question is: is there any 
 reason not to handle all errors that the input format can handle? In other 
 words, why not reissue the request after *any* IOException? I see the 
 following disadvantages of the current approach:
  * the client may see exceptions like LeaseException and 
 ScannerTimeoutException if it fails to process all fetched data within the 
 timeout
  * to avoid ScannerTimeoutException the client must raise 
 hbase.regionserver.lease.period
  * timeouts for tasks are already configured in mapred.task.timeout, so this 
 seems a bit redundant, because typically one needs to update both of these 
 parameters
  * I don't see any way to get rid of LeaseException (this is configured on 
 the server side)
 I think all of these issues would be gone if the DoNotRetryIOException were 
 not rethrown. -On the other hand, handling errors in the InputFormat has the 
 disadvantage that it may hide some inefficiency from the user. E.g., if I 
 have a very big scanner.caching and I manage to process only a few rows 
 within the timeout, I will end up with a single row being fetched many times 
 (and will not be explicitly notified about this). Could we solve this problem 
 by adding some counter to the InputFormat?-





[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

2012-05-21 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280295#comment-13280295
 ] 

Zhihong Yu commented on HBASE-6059:
---

If majorCompaction is false, we still need to check !kvs.isEmpty(), right?

 Replaying recovered edits would make deleted data exist again
 -

 Key: HBASE-6059
 URL: https://issues.apache.org/jira/browse/HBASE-6059
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch


 When we replay recovered edits we use the minSeqId of the Store, which may 
 cause deleted data to appear again.
 Let's see how it happens. Suppose a region with two families (cf1, cf2):
 1. put one row into the region (put r1,cf1:q1,v1)
 2. move the region from server A to server B
 3. delete the data put in step 1 (delete r1)
 4. flush this region
 5. run a major compaction on this region
 6. move the region from server B to server A
 7. abort server A
 8. after the region comes back online, we can get the deleted data 
 (r1,cf1:q1,v1) again
 (When we replay recovered edits we use the minSeqId of the Store; because cf2 
 has no store files, its seqId is 0, so the edit log of the put is replayed 
 into the region.)





[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-05-21 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280311#comment-13280311
 ] 

Jean-Daniel Cryans commented on HBASE-5778:
---

I don't see how, in theory, the seek can be a problem when tailing a log from 
the start, since we read the whole file. The only case that needs to be 
handled differently is when a region server has to replicate a log that 
another RS started working on but died. In that case we can just read the file 
up to the last seek position without replicating anything.

 Turn on WAL compression by default
 --

 Key: HBASE-5778
 URL: https://issues.apache.org/jira/browse/HBASE-5778
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Lars Hofhansl
Priority: Blocker
 Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch


 I ran some tests to verify whether WAL compression should be turned on by 
 default.
 For a use case where it's not very useful (values two orders of magnitude 
 bigger than the keys), the insert time wasn't different and the CPU usage was 
 15% higher (150% CPU usage vs 130% when not compressing the WAL).
 When values are smaller than the keys, I saw a 38% improvement in insert run 
 time, and CPU usage was 33% higher (600% CPU usage vs 450%). I'm not sure WAL 
 compression accounts for all the additional CPU usage; it might just be that 
 we're able to insert faster and so spend more time in the MemStore per second 
 (because our MemStores are bad when they contain tens of thousands of 
 values).
 Those are two extremes, but it shows that for the price of some CPU we can 
 save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
 CPUs.
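Most of the saving in the key-dominated case comes from dictionary-style encoding of key components that repeat across consecutive WAL entries (row keys, family and qualifier names). A toy Python illustration of the idea only; this is not the actual HBase WAL compression format:

```python
class DictEncoder:
    """Toy dictionary encoder: the first time a byte string is seen it is
    emitted in full; every later occurrence is replaced by a small index.
    Repeated keys therefore shrink dramatically, large unique values don't."""

    def __init__(self):
        self.index = {}

    def encode(self, value):
        if value in self.index:
            return ("ref", self.index[value])   # cheap back-reference
        self.index[value] = len(self.index)
        return ("literal", value)               # first occurrence, full bytes
```

This is why the win is large when keys dominate the entry size and negligible when values are two orders of magnitude bigger than the keys.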





[jira] [Commented] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280315#comment-13280315
 ] 

Jonathan Hsieh commented on HBASE-5757:
---

Zhihong, thanks for pinging me about this.  Jan, thanks for being patient with 
me on this.

The changes look good. The patch applies to 0.94 and trunk. I believe the 
request was for getting this into 0.90 -- I'll look into backporting this 
behavior to that version.



 TableInputFormat should handle as many errors as possible
 -

 Key: HBASE-5757
 URL: https://issues.apache.org/jira/browse/HBASE-5757
 Project: HBase
  Issue Type: Bug
  Components: mapred, mapreduce
Affects Versions: 0.90.6
Reporter: Jan Lukavsky
 Attachments: 5757-trunk-v2.txt, HBASE-5757-trunk-r1341041.patch, 
 HBASE-5757.patch, HBASE-5757.patch







[jira] [Commented] (HBASE-5882) Process RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor

2012-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280316#comment-13280316
 ] 

Hudson commented on HBASE-5882:
---

Integrated in HBase-TRUNK #2910 (See 
[https://builds.apache.org/job/HBase-TRUNK/2910/])
HBASE-5882 Process RIT on master restart can try assigning the region if 
the region is found on a dead server instead of waiting for Timeout Monitor 
(Ashutosh) (Revision 1341110)

 Result = FAILURE
ramkrishna : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java


 Process RIT on master restart can try assigning the region if the region is 
 found on a dead server instead of waiting for Timeout Monitor
 -

 Key: HBASE-5882
 URL: https://issues.apache.org/jira/browse/HBASE-5882
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.6, 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: Ashutosh Jindal
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5882_v5.patch, HBASE-5882_v6.patch, 
 hbase_5882.patch, hbase_5882_V2.patch, hbase_5882_V3.patch, 
 hbase_5882_V4.patch







[jira] [Assigned] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh reassigned HBASE-5757:
-

Assignee: Jan Lukavsky

 TableInputFormat should handle as many errors as possible
 -

 Key: HBASE-5757
 URL: https://issues.apache.org/jira/browse/HBASE-5757
 Project: HBase
  Issue Type: Bug
  Components: mapred, mapreduce
Affects Versions: 0.90.6
Reporter: Jan Lukavsky
Assignee: Jan Lukavsky
 Attachments: 5757-trunk-v2.txt, HBASE-5757-trunk-r1341041.patch, 
 HBASE-5757.patch, HBASE-5757.patch







[jira] [Commented] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280318#comment-13280318
 ] 

Zhihong Yu commented on HBASE-5757:
---

TestHLog failure was caused by:
{code}
java.net.BindException: Problem binding to localhost/127.0.0.1:41331 : Address 
already in use
at org.apache.hadoop.ipc.Server.bind(Server.java:227)
at org.apache.hadoop.ipc.Server$Listener.init(Server.java:301)
{code}
I ran it locally and it passed.
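This class of test flakiness (a fixed test port colliding with another process) is commonly avoided by binding to port 0 and letting the OS choose a free ephemeral port. A generic Python sketch of the pattern, not HBase test code:

```python
import socket

def bind_ephemeral(host="127.0.0.1"):
    """Bind to port 0 so the kernel assigns a currently free port; return
    the bound socket and the port it actually got. This sidesteps
    'Address already in use' races from hard-coded test ports."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((host, 0))
    port = s.getsockname()[1]
    return s, port
```

The server under test is then started on the returned port instead of a constant like 41331.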

 TableInputFormat should handle as many errors as possible
 -

 Key: HBASE-5757
 URL: https://issues.apache.org/jira/browse/HBASE-5757
 Project: HBase
  Issue Type: Bug
  Components: mapred, mapreduce
Affects Versions: 0.90.6
Reporter: Jan Lukavsky
Assignee: Jan Lukavsky
 Attachments: 5757-trunk-v2.txt, HBASE-5757-trunk-r1341041.patch, 
 HBASE-5757.patch, HBASE-5757.patch







[jira] [Commented] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280328#comment-13280328
 ] 

Hudson commented on HBASE-5757:
---

Integrated in HBase-TRUNK #2911 (See 
[https://builds.apache.org/job/HBase-TRUNK/2911/])
HBASE-5757 TableInputFormat should handle as many errors as possible (Jan 
Lukavsky) (Revision 1341132)

 Result = FAILURE
jmhsieh : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapred/TableRecordReaderImpl.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapred/TestTableInputFormat.java


 TableInputFormat should handle as many errors as possible
 -

 Key: HBASE-5757
 URL: https://issues.apache.org/jira/browse/HBASE-5757
 Project: HBase
  Issue Type: Bug
  Components: mapred, mapreduce
Affects Versions: 0.90.6
Reporter: Jan Lukavsky
Assignee: Jan Lukavsky
 Fix For: 0.96.0, 0.94.1

 Attachments: 5757-trunk-v2.txt, HBASE-5757-trunk-r1341041.patch, 
 HBASE-5757.patch, HBASE-5757.patch







[jira] [Created] (HBASE-6060) Regions in OPENING state from failed regionservers take a long time to recover

2012-05-21 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-6060:


 Summary: Regions in OPENING state from failed regionservers 
take a long time to recover
 Key: HBASE-6060
 URL: https://issues.apache.org/jira/browse/HBASE-6060
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Reporter: Enis Soztutar
Assignee: Enis Soztutar


We have seen a pattern in tests where regions are stuck in the OPENING state 
for a very long time when the region server that is opening the region fails. 
My understanding of the process: 

 - The master asks an RS to open the region. If the RS is offline, a new plan 
is generated (a new RS is chosen). RegionState is set to PENDING_OPEN (only in 
master memory; zk still shows OFFLINE). See HRegionServer.openRegion(), 
HMaster.assign().
 - The RegionServer starts opening the region and changes the state in the 
znode. But that znode is not ephemeral. (See ZkAssign.)
 - The RS transitions the zk node from OFFLINE to OPENING. See 
OpenRegionHandler.process().
 - The RS then opens the region and changes the znode from OPENING to OPENED.
 - When the RS is killed between the OPENING and OPENED states, zk shows the 
OPENING state and the master just waits for the RS to change the region state; 
but since the RS is down, that won't happen.
 - There is an AssignmentManager.TimeoutMonitor, which guards against exactly 
these kinds of conditions. It periodically checks (every 10 sec by default) 
the regions in transition to see whether they have timed out 
(hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 
min, which explains what you and I are seeing.
 - The ServerShutdownHandler in the master does not reassign regions in the 
OPENING state, although it handles other states.

Lowering that threshold in the configuration is one option, but I still think 
we can do better. 

Will investigate more. 
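The gap described above can be modeled as: regions on a dead server are reassigned promptly in most states, but OPENING regions are skipped and only rescued once the (30-minute by default) timeout monitor notices them. A schematic Python sketch with hypothetical names, not the AssignmentManager code:

```python
DEFAULT_TIMEOUT = 30 * 60  # seconds; hbase.master.assignment.timeoutmonitor.timeout

def regions_to_reassign(rit, dead_servers, now,
                        timeout=DEFAULT_TIMEOUT, handle_opening=False):
    """rit: region -> (state, server, since_epoch_seconds).
    Regions on dead servers are reassigned immediately, except OPENING ones
    (unless handle_opening is set); those fall through to the timeout check,
    which is what makes them take up to `timeout` seconds to recover."""
    out = []
    for region, (state, server, since) in rit.items():
        if server in dead_servers and (state != "OPENING" or handle_opening):
            out.append(region)                 # shutdown-handler path
        elif now - since >= timeout:
            out.append(region)                 # timeout-monitor path
    return out
```

Setting the hypothetical handle_opening flag corresponds to teaching the shutdown handler about OPENING regions rather than lowering the timeout.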





[jira] [Commented] (HBASE-5970) Improve the AssignmentManager#updateTimer and speed up handling opened event

2012-05-21 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280367#comment-13280367
 ] 

nkeywal commented on HBASE-5970:


Hi,

Could you share the logs of the tests? I would be interested to have a look at 
them.
The javadoc for updateTimers says it's not used for bulk assignment; is there a 
mix of 'bulk assigned' regions and other regions?
I also see in the description that the time was measured once with 
'retainAssignment=true' and once without. Are the results comparable in both 
cases?

Thank you!

 Improve the AssignmentManager#updateTimer and speed up handling opened event
 

 Key: HBASE-5970
 URL: https://issues.apache.org/jira/browse/HBASE-5970
 Project: HBase
  Issue Type: Improvement
  Components: master
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: 5970v3.patch, HBASE-5970.patch, HBASE-5970v2.patch, 
 HBASE-5970v3.patch


 We found that handling of the opened event is very slow in an environment 
 with lots of regions.
 The problem is the slow AssignmentManager#updateTimer.
 We ran a test bulk-assigning 10w (i.e. 100k) regions; the whole bulk 
 assignment took about 1 hour:
 2012-05-06 20:31:49,201 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 10 
 region(s) round-robin across 5 server(s)
 2012-05-06 21:26:32,103 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning done
 I think we could improve AssignmentManager#updateTimer by making a separate 
 thread do this work.
 After the improvement, it took only 4.5 mins:
 2012-05-07 11:03:36,581 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 10 
 region(s) across 5 server(s), retainAssignment=true 
 2012-05-07 11:07:57,073 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning done 
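The fix described (a dedicated thread doing the updateTimer work instead of doing it inline in the opened-event handler) can be sketched as a queue-draining worker. This is an illustrative Python sketch with made-up names, not the actual AssignmentManager change:

```python
import queue
import threading

class AsyncTimerUpdater:
    """Sketch: opened-event handlers enqueue the timer update and return
    immediately; a single background thread drains the queue, so the slow
    bookkeeping no longer serializes bulk assignment."""

    def __init__(self):
        self.q = queue.Queue()
        self.updated = []                       # stands in for timer bookkeeping
        self.worker = threading.Thread(target=self._drain, daemon=True)
        self.worker.start()

    def on_region_opened(self, region):
        self.q.put(region)                      # cheap; handler is not blocked

    def _drain(self):
        while True:
            region = self.q.get()
            if region is None:                  # shutdown sentinel
                break
            self.updated.append(region)         # the formerly-inline slow work

    def close(self):
        self.q.put(None)
        self.worker.join()
```

The opened-event path now costs one queue put rather than the full timer update, which is the effect the 1 hour vs 4.5 minutes numbers above illustrate.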





[jira] [Commented] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280385#comment-13280385
 ] 

Hudson commented on HBASE-5757:
---

Integrated in HBase-0.94 #205 (See 
[https://builds.apache.org/job/HBase-0.94/205/])
HBASE-5757 TableInputFormat should handle as many errors as possible (Jan 
Lukavsky) (Revision 1341133)

 Result = FAILURE
jmhsieh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapred/TableRecordReaderImpl.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapred/TestTableInputFormat.java


 TableInputFormat should handle as many errors as possible
 -

 Key: HBASE-5757
 URL: https://issues.apache.org/jira/browse/HBASE-5757
 Project: HBase
  Issue Type: Bug
  Components: mapred, mapreduce
Affects Versions: 0.90.6
Reporter: Jan Lukavsky
Assignee: Jan Lukavsky
 Fix For: 0.96.0, 0.94.1

 Attachments: 5757-trunk-v2.txt, HBASE-5757-trunk-r1341041.patch, 
 HBASE-5757.patch, HBASE-5757.patch







[jira] [Commented] (HBASE-1749) If RS loses lease, we used to restart by default; reinstitute

2012-05-21 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280387#comment-13280387
 ] 

nkeywal commented on HBASE-1749:


Yes, because of HBASE-5844 and HBASE-5939, we now:
- delete the znode immediately when we exit
- restart after a non-planned stop.

This is safer than trying to reinstitute a region server in the same JVM, as 
it removes any memory or static-variable effects. In both cases we trigger a 
reassignment of the regions, however.


 If RS loses lease, we used to restart by default; reinstitute
 --

 Key: HBASE-1749
 URL: https://issues.apache.org/jira/browse/HBASE-1749
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: nkeywal







[jira] [Work started] (HBASE-1749) If RS loses lease, we used to restart by default; reinstitute

2012-05-21 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-1749 started by nkeywal.

 If RS loses lease, we used to restart by default; reinstitute
 --

 Key: HBASE-1749
 URL: https://issues.apache.org/jira/browse/HBASE-1749
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: nkeywal







[jira] [Resolved] (HBASE-1749) If RS loses lease, we used to restart by default; reinstitute

2012-05-21 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal resolved HBASE-1749.


Resolution: Duplicate

 If RS loses lease, we used to restart by default; reinstitute
 --

 Key: HBASE-1749
 URL: https://issues.apache.org/jira/browse/HBASE-1749
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: nkeywal







[jira] [Updated] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5757:
--

Attachment: hbase-5757-92.patch

hbase-5757-92.patch is for the 0.92 and 0.90 versions. The underlying metrics 
have changed, so it does not update metrics the way the 0.94 or trunk/0.96 
patches do. It does, however, include the updated tests that demonstrate the 
updated semantics.

 TableInputFormat should handle as many errors as possible
 -

 Key: HBASE-5757
 URL: https://issues.apache.org/jira/browse/HBASE-5757
 Project: HBase
  Issue Type: Bug
  Components: mapred, mapreduce
Affects Versions: 0.90.6
Reporter: Jan Lukavsky
Assignee: Jan Lukavsky
 Fix For: 0.96.0, 0.94.1

 Attachments: 5757-trunk-v2.txt, HBASE-5757-trunk-r1341041.patch, 
 HBASE-5757.patch, HBASE-5757.patch, hbase-5757-92.patch


 Prior to HBASE-4196 there was different handling of IOExceptions thrown from 
 the scanner in the mapred and mapreduce APIs. The patch to HBASE-4196 unified this 
 handling so that if an exception is caught, a reconnect is attempted (without 
 bothering the mapred client). After that, HBASE-4269 changed this behavior 
 back, but in both the mapred and mapreduce APIs. The question is, is there any 
 reason not to handle all errors that the input format can handle? In other 
 words, why not try to reissue the request after *any* IOException? I see the 
 following disadvantages of the current approach:
  * the client may see exceptions like LeaseException and 
 ScannerTimeoutException if it fails to process all fetched data within the timeout
  * to avoid ScannerTimeoutException the client must raise 
 hbase.regionserver.lease.period
  * timeouts for tasks are already configured in mapred.task.timeout, so this 
 seems a bit redundant to me, because typically one needs to update both these 
 parameters
  * I don't see any possibility to get rid of LeaseException (this is 
 configured on the server side)
 I think all of these issues would be gone if the DoNotRetryIOException were 
 not rethrown. -On the other hand, handling errors in the InputFormat has the 
 disadvantage that it may hide some inefficiency from the user. E.g. if I have 
 a very big scanner.caching, and I manage to process only a few rows within the 
 timeout, I will end up with a single row being fetched many times (and will not 
 be explicitly notified about this). Could we solve this problem by adding some 
 counter to the InputFormat?-
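
 The retry-after-any-IOException idea can be sketched, purely illustratively 
 and with JDK-only types rather than the actual TableRecordReader code, as a 
 small wrapper that restarts the scanner once instead of rethrowing 
 LeaseException/ScannerTimeoutException to the mapred client:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical sketch, not HBase's actual record-reader code: run the read,
// and on any IOException re-create the scanner (e.g. at the last seen row)
// and retry once, so transient lease/timeout errors never reach the job.
public class RetryingReader {
    public static <T> T readWithRestart(Callable<T> read, Runnable restart)
            throws Exception {
        try {
            return read.call();
        } catch (IOException e) {
            restart.run();      // stand-in for restarting the scanner
            return read.call(); // second failure propagates to the caller
        }
    }
}
```

 A real implementation would also track the last successfully returned row so 
 the restarted scanner resumes just past it, avoiding duplicate rows.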





[jira] [Commented] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280398#comment-13280398
 ] 

Jonathan Hsieh commented on HBASE-5757:
---

Zhihong, Jan, if the 0.92/0.90 versions look good to you I will commit.





[jira] [Commented] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280412#comment-13280412
 ] 

Hadoop QA commented on HBASE-5757:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528472/hbase-5757-92.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1946//console

This message is automatically generated.





[jira] [Created] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-6061:
--

 Summary: Fix ACL Admin Table inconsistent permission check
 Key: HBASE-6061
 URL: https://issues.apache.org/jira/browse/HBASE-6061
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.94.0, 0.92.1, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 0.92.2, 0.96.0, 0.94.1


The requirePermission() check for admin operations on a table is currently 
inconsistent.

A table owner with CREATE rights (meaning the owner created that table) 
can enable/disable and delete the table, but needs ADMIN rights to 
add/remove/modify a column.
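
The intended rule can be illustrated with a minimal, hypothetical helper (the 
enum and method names are invented for this sketch and are not HBase's actual 
AccessController API): one place decides whether a user counts as a table 
admin, so enable/disable/delete and column operations can't drift apart.

```java
import java.util.Set;

// Illustrative only: the "owner with CREATE acts as table admin" rule,
// pulled into a single helper used by every table-admin code path.
public class TablePermissionCheck {
    public enum Action { READ, WRITE, CREATE, ADMIN }

    /** True if the user may perform table-admin operations on the table. */
    public static boolean canAdministerTable(String user, String owner,
                                             Set<Action> granted) {
        if (granted.contains(Action.ADMIN)) {
            return true;  // an explicit ADMIN grant always suffices
        }
        // The owner who created the table (holds CREATE) is treated as admin.
        return user.equals(owner) && granted.contains(Action.CREATE);
    }
}
```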





[jira] [Updated] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-6061:
---

Attachment: HBASE-6061-v0.patch





[jira] [Commented] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280433#comment-13280433
 ] 

Zhihong Yu commented on HBASE-5757:
---

TestTableInputFormat passed in 0.92 with 0.92 patch.

+1 from me.





[jira] [Commented] (HBASE-6036) Add Cluster-level PB-based calls to HMasterInterface (minus file-format related calls)

2012-05-21 Thread Gregory Chanan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280436#comment-13280436
 ] 

Gregory Chanan commented on HBASE-6036:
---

These replication tests fail even without this patch applied, so I think this 
is good to go.

 Add Cluster-level PB-based calls to HMasterInterface (minus file-format 
 related calls)
 --

 Key: HBASE-6036
 URL: https://issues.apache.org/jira/browse/HBASE-6036
 Project: HBase
  Issue Type: Task
  Components: ipc, master, migration
Reporter: Gregory Chanan
Assignee: Gregory Chanan
 Fix For: 0.96.0

 Attachments: HBASE-6036-v2.patch, HBASE-6036.patch


 This should be a subtask of HBASE-5445, but since that is a subtask, I can't 
 also make this a subtask (apparently).
 Convert the cluster-level calls that do not touch the file-format related 
 calls (see HBASE-5453).  These are:
 IsMasterRunning
 Shutdown
 StopMaster
 Balance
 LoadBalancerIs (was synchronousBalanceSwitch/balanceSwitch)





[jira] [Commented] (HBASE-6043) Add Increment Coalescing in thrift.

2012-05-21 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280440#comment-13280440
 ] 

Elliott Clark commented on HBASE-6043:
--

Not sure why Phabricator isn't posting diffs but the review is up at 
https://reviews.facebook.net/D3315.

 Add Increment Coalescing in thrift.
 ---

 Key: HBASE-6043
 URL: https://issues.apache.org/jira/browse/HBASE-6043
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark

 Since the Thrift server uses the client API, reducing the number of RPCs 
 greatly speeds up increments.
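
 The coalescing idea can be sketched with plain JDK types (this is an 
 illustration of the approach, not the attached patch): increments for the 
 same (row, column) key are accumulated in memory and periodically drained so 
 that N Thrift increment calls cost a single batched RPC.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified sketch of increment coalescing. A real flusher would run on a
// timer/threshold and send the drained batch as one increment RPC; this
// version just demonstrates the accumulate-then-drain structure.
public class IncrementCoalescer {
    private final ConcurrentHashMap<String, Long> pending =
            new ConcurrentHashMap<>();

    /** Queue an increment instead of issuing an RPC immediately. */
    public void queueIncrement(String rowColumn, long amount) {
        pending.merge(rowColumn, amount, Long::sum);
    }

    /** Drain the pending counts; the caller sends them as one batch. */
    public Map<String, Long> flush() {
        Map<String, Long> batch = new ConcurrentHashMap<>(pending);
        pending.clear();  // note: a production version must drain atomically
        return batch;
    }
}
```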





[jira] [Updated] (HBASE-6043) Add Increment Coalescing in thrift.

2012-05-21 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-6043:
-

Attachment: HBASE-6043-0.patch





[jira] [Updated] (HBASE-6043) Add Increment Coalescing in thrift.

2012-05-21 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-6043:
-

Status: Patch Available  (was: Open)





[jira] [Commented] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280442#comment-13280442
 ] 

Zhihong Yu commented on HBASE-6061:
---

Minor comment:
{code}
+   * If current user is the table owner, and has CREATE permission is a table 
admin,
{code}
', and has CREATE permission is a table admin' should read ' and has CREATE 
permission, then he/she has table admin permission.' (wrap if the line is too long)





[jira] [Commented] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280448#comment-13280448
 ] 

Andrew Purtell commented on HBASE-6061:
---

+1. Yes, this is better: since the direction here is to let the creator take any 
action on the table, pulling the logic up into a small helper method is cleaner, 
fixes the issue, and will avoid errors going forward.





[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

2012-05-21 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280453#comment-13280453
 ] 

Andrew Purtell commented on HBASE-6060:
---

The TimeoutMonitor timeout was increased to 30 minutes in HBASE-4126.

 Regions's in OPENING state from failed regionservers takes a long time to 
 recover
 -

 Key: HBASE-6060
 URL: https://issues.apache.org/jira/browse/HBASE-6060
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Reporter: Enis Soztutar
Assignee: Enis Soztutar

 We have seen a pattern in tests where regions are stuck in the OPENING state 
 for a very long time when the region server that is opening the region fails. 
 My understanding of the process: 
  
  - The master calls the rs to open the region. If the rs is offline, a new plan is 
 generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in 
 master memory; zk still shows OFFLINE). See HRegionServer.openRegion(), 
 HMaster.assign()
  - The RegionServer starts opening the region and changes the state in the znode, 
 but that znode is not ephemeral. (see ZkAssign)
  - The rs transitions the zk node from OFFLINE to OPENING. See 
 OpenRegionHandler.process()
  - The rs then opens the region and changes the znode from OPENING to OPENED.
  - When the rs is killed between the OPENING and OPENED states, zk shows the 
 OPENING state, and the master just waits for the rs to change the region state; 
 since the rs is down, that won't happen. 
  - There is an AssignmentManager.TimeoutMonitor, which guards against exactly 
 these kinds of conditions. It periodically checks (every 10 sec by 
 default) the regions in transition to see whether they timed out 
 (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 min, 
 which explains what you and I are seeing. 
  - ServerShutdownHandler in the Master does not reassign regions in the OPENING 
 state, although it handles other states. 
 Lowering that threshold in the configuration is one option, but I still 
 think we can do better. 
 Will investigate more. 
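
 For reference, the threshold the description mentions lives in hbase-site.xml; 
 a stopgap under the current behavior would look like the following (the value 
 here is only illustrative, and lowering it has side effects of its own):

```xml
<!-- hbase-site.xml: lower the assignment timeout from the 30-minute
     default introduced by HBASE-4126, so regions stuck in OPENING after
     a regionserver death are retried sooner. Illustrative value only. -->
<property>
  <name>hbase.master.assignment.timeoutmonitor.timeout</name>
  <value>180000</value> <!-- 3 minutes, in milliseconds -->
</property>
```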





[jira] [Commented] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280459#comment-13280459
 ] 

Matteo Bertozzi commented on HBASE-6061:


Not related, but maybe we can squeeze it into this one... preCheckAndPut() and 
preCheckAndDelete() check for READ when they also want to WRITE. 
For me, checking for the WRITE permission is the right thing. What do you say 
about that? Keep READ, replace with WRITE, or open a new jira?





[jira] [Updated] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-6061:
---

Attachment: HBASE-6061-v1.patch





[jira] [Updated] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-6061:
---

Attachment: (was: HBASE-6061-v1.patch)





[jira] [Updated] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-6061:
---

Attachment: HBASE-6061-v1.patch





[jira] [Commented] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280466#comment-13280466
 ] 

Andrew Purtell commented on HBASE-6061:
---

bq. Not related but maybe we can squeeze into this one... preCheckAndPut() and 
preCheckAndDelete() checks for READ when they also want to WRITE

Yes, new jira.





[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

2012-05-21 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280469#comment-13280469
 ] 

Enis Soztutar commented on HBASE-6060:
--

Thanks Andrew for the pointer. Agreed that lowering the timeout can have deeper 
impacts. We should fix the issue properly instead. 





[jira] [Created] (HBASE-6062) preCheckAndPut/Delete() checks for READ when also a WRITE is performed

2012-05-21 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-6062:
--

 Summary: preCheckAndPut/Delete() checks for READ when also a WRITE 
is performed
 Key: HBASE-6062
 URL: https://issues.apache.org/jira/browse/HBASE-6062
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.94.0, 0.92.1, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 0.92.2, 0.96.0, 0.94.1


preCheckAndPut() and preCheckAndDelete() check for READ when they also want to 
WRITE... 
For me, checking for WRITE permission is the right thing. 
What do you say about that? Keep READ, or replace with WRITE?
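The inconsistency can be illustrated with a minimal sketch. The method name preCheckAndPut mirrors the coprocessor hook under discussion, but everything else here (AclSketch, the grant map) is hypothetical, not the AccessController implementation.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;

enum TablePermission { READ, WRITE }

// Illustrative ACL check: user -> granted permissions.
class AclSketch {
    private final Map<String, EnumSet<TablePermission>> grants = new HashMap<>();

    void grant(String user, TablePermission p) {
        grants.computeIfAbsent(user, u -> EnumSet.noneOf(TablePermission.class)).add(p);
    }

    boolean has(String user, TablePermission p) {
        EnumSet<TablePermission> ps = grants.get(user);
        return ps != null && ps.contains(p);
    }

    // checkAndPut both reads the current cell and writes a new value, so the
    // proposal is to require WRITE here instead of READ.
    boolean preCheckAndPut(String user) {
        return has(user, TablePermission.WRITE);
    }
}
```

Under the proposed rule, a READ-only user can no longer mutate data through checkAndPut, which is the hole the issue describes.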





[jira] [Updated] (HBASE-6062) preCheckAndPut/Delete() checks for READ when also a WRITE is performed

2012-05-21 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-6062:
---

Attachment: HBASE-6062-v0.patch

 preCheckAndPut/Delete() checks for READ when also a WRITE is performed
 --

 Key: HBASE-6062
 URL: https://issues.apache.org/jira/browse/HBASE-6062
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl, security
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6062-v0.patch


 preCheckAndPut() and preCheckAndDelete() check for READ when they also want 
 to WRITE... 
 For me, checking for WRITE permission is the right thing. 
 What do you say about that? Keep READ, or replace with WRITE?





[jira] [Updated] (HBASE-6062) preCheckAndPut/Delete() checks for READ when also a WRITE is performed

2012-05-21 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-6062:
---

Status: Patch Available  (was: Open)

 preCheckAndPut/Delete() checks for READ when also a WRITE is performed
 --

 Key: HBASE-6062
 URL: https://issues.apache.org/jira/browse/HBASE-6062
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.94.0, 0.92.1, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl, security
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6062-v0.patch


 preCheckAndPut() and preCheckAndDelete() check for READ when they also want 
 to WRITE... 
 For me, checking for WRITE permission is the right thing. 
 What do you say about that? Keep READ, or replace with WRITE?





[jira] [Updated] (HBASE-6044) copytable: remove rs.* parameters

2012-05-21 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-6044:
--

Attachment: hbase-6044-92.patch

Minor tweak for 0.92.

 copytable: remove rs.* parameters
 -

 Key: HBASE-6044
 URL: https://issues.apache.org/jira/browse/HBASE-6044
 Project: HBase
  Issue Type: New Feature
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: hbase-6044-92.patch, hbase-6044-v2.patch, 
 hbase-6044-v3.patch, hbase-6044-v4.patch, hbase-6044.patch


 In discussion of HBASE-6013 it was suggested that we remove these arguments 
 from 0.92+ (but keep in 0.90)





[jira] [Updated] (HBASE-6044) copytable: remove rs.* parameters

2012-05-21 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-6044:
--

   Resolution: Fixed
Fix Version/s: 0.94.1
   0.96.0
   0.92.2
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to 0.92/0.94/0.96-trunk.  Thanks for the review, Stack!

 copytable: remove rs.* parameters
 -

 Key: HBASE-6044
 URL: https://issues.apache.org/jira/browse/HBASE-6044
 Project: HBase
  Issue Type: New Feature
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: hbase-6044-92.patch, hbase-6044-v2.patch, 
 hbase-6044-v3.patch, hbase-6044-v4.patch, hbase-6044.patch


 In discussion of HBASE-6013 it was suggested that we remove these arguments 
 from 0.92+ (but keep in 0.90)





[jira] [Updated] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5757:
--

   Resolution: Fixed
Fix Version/s: 0.92.2
   0.90.7
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed the 0.92 version to the 0.92/0.90 branches.  Thanks for the review, 
Ted, and thanks for the patches, Jan!

 TableInputFormat should handle as many errors as possible
 -

 Key: HBASE-5757
 URL: https://issues.apache.org/jira/browse/HBASE-5757
 Project: HBase
  Issue Type: Bug
  Components: mapred, mapreduce
Affects Versions: 0.90.6
Reporter: Jan Lukavsky
Assignee: Jan Lukavsky
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: 5757-trunk-v2.txt, HBASE-5757-trunk-r1341041.patch, 
 HBASE-5757.patch, HBASE-5757.patch, hbase-5757-92.patch


 Prior to HBASE-4196 there was different handling of IOExceptions thrown from 
 the scanner in the mapred and mapreduce APIs. The patch to HBASE-4196 unified 
 this handling so that if an exception is caught, a reconnect is attempted 
 (without bothering the mapred client). After that, HBASE-4269 changed this 
 behavior back, in both the mapred and mapreduce APIs. The question is: is 
 there any reason not to handle all errors that the input format can handle? 
 In other words, why not try to reissue the request after *any* IOException? I 
 see the following disadvantages of the current approach: 
  * the client may see exceptions like LeaseException and 
 ScannerTimeoutException if it fails to process all fetched data within the 
 timeout 
  * to avoid ScannerTimeoutException the client must raise 
 hbase.regionserver.lease.period 
  * a timeout for tasks is already configured in mapred.task.timeout, so this 
 seems a bit redundant, because typically one needs to update both of these 
 parameters 
  * I don't see any possibility of getting rid of LeaseException (this is 
 configured on the server side) 
 I think all of these issues would be gone if the DoNotRetryIOException were 
 not rethrown. -On the other hand, handling errors in the InputFormat has the 
 disadvantage that it may hide some inefficiency from the user. E.g. if I have 
 a very big scanner.caching, and I manage to process only a few rows within 
 the timeout, I will end up with a single row being fetched many times (and 
 will not be explicitly notified about this). Could we solve this problem by 
 adding some counter to the InputFormat?-
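The "reissue the request after any IOException" idea can be sketched as a retry wrapper around the scanner call. This is a standalone illustration (the class name RetryingScanner and the Callable-based shape are assumptions, not TableInputFormat's actual code); production code would likely still exempt truly fatal errors.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

class RetryingScanner {
    // Reissue the request after any IOException, up to maxAttempts times,
    // instead of surfacing scanner errors to the mapred client.
    static <T> T withRetries(Callable<T> request, int maxAttempts) throws Exception {
        IOException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return request.call();
            } catch (IOException e) {
                last = e; // e.g. a lease/scanner timeout: reopen and retry
            }
        }
        throw last; // give up after maxAttempts, rethrowing the last failure
    }
}
```

The record reader would wrap each next() call this way, reopening the scanner at the last-seen row before retrying; that is what hides lease expirations from the client, at the cost of possibly re-fetching rows.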





[jira] [Commented] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280508#comment-13280508
 ] 

Hadoop QA commented on HBASE-6061:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528475/HBASE-6061-v0.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 33 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
  org.apache.hadoop.hbase.replication.TestMasterReplication

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1947//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1947//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1947//console

This message is automatically generated.

 Fix ACL Admin Table inconsistent permission check
 ---

 Key: HBASE-6061
 URL: https://issues.apache.org/jira/browse/HBASE-6061
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl, security
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6061-v0.patch, HBASE-6061-v1.patch


 The requirePermission() check for admin operations on a table is currently 
 inconsistent.
 A table owner with CREATE rights (meaning the owner has created that table) 
 can enable/disable and delete the table, but needs ADMIN rights to 
 add/remove/modify a column.





[jira] [Commented] (HBASE-6041) NullPointerException prevents the master from starting up

2012-05-21 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280510#comment-13280510
 ] 

Zhihong Yu commented on HBASE-6041:
---

Patch looks good.
Do all tests pass ?

 NullPointerException prevents the master from starting up
 -

 Key: HBASE-6041
 URL: https://issues.apache.org/jira/browse/HBASE-6041
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.90.7

 Attachments: hbase-6041.patch


 This is 0.90 only.
 2012-05-04 14:27:57,913 FATAL org.apache.hadoop.hbase.master.HMaster: 
 Unhandled exception. Starting shutdown.
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hbase.master.AssignmentManager.regionOnline(AssignmentManager.java:731)
   at 
 org.apache.hadoop.hbase.master.AssignmentManager.processFailover(AssignmentManager.java:215)
   at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:419)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:293)
 2012-05-04 14:27:57,914 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
 2012-05-04 14:27:57,915 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
 server on 1433





[jira] [Commented] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280516#comment-13280516
 ] 

Zhihong Yu commented on HBASE-6061:
---

@Matteo:
Do you mind providing patch for 0.92 / 0.94 ?
The directory structure has changed.

 Fix ACL Admin Table inconsistent permission check
 ---

 Key: HBASE-6061
 URL: https://issues.apache.org/jira/browse/HBASE-6061
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl, security
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6061-v0.patch, HBASE-6061-v1.patch


 The requirePermission() check for admin operations on a table is currently 
 inconsistent.
 A table owner with CREATE rights (meaning the owner has created that table) 
 can enable/disable and delete the table, but needs ADMIN rights to 
 add/remove/modify a column.





[jira] [Commented] (HBASE-6033) Adding some function to check if a table/region is in compaction

2012-05-21 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280518#comment-13280518
 ] 

Jimmy Xiang commented on HBASE-6033:


Here is the review request:

https://reviews.apache.org/r/5167/

 Adding some function to check if a table/region is in compaction
 ---

 Key: HBASE-6033
 URL: https://issues.apache.org/jira/browse/HBASE-6033
 Project: HBase
  Issue Type: New Feature
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: table_ui.png


 This feature will be helpful to find out if a major compaction is going on.
 We can show if it is in any minor compaction too.





[jira] [Updated] (HBASE-6057) Change some tests categories to optimize build time

2012-05-21 Thread Jean-Daniel Cryans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans updated HBASE-6057:
--

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Ran the small tests with the patch, works ok. Committed.

 Change some tests categories to optimize build time
 ---

 Key: HBASE-6057
 URL: https://issues.apache.org/jira/browse/HBASE-6057
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 6057.v1.patch


 Some tests categorized as small take more than 15s; it's better if they are 
 executed in parallel with the medium tests.
 Some medium tests last less than 2s; it's better to have them executed with 
 the small tests: we save a fork.





[jira] [Updated] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-6061:
---

Attachment: HBASE-6061-0.92.patch

Attached the 0.92 patch, also good for 0.94

 Fix ACL Admin Table inconsistent permission check
 ---

 Key: HBASE-6061
 URL: https://issues.apache.org/jira/browse/HBASE-6061
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl, security
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6061-0.92.patch, HBASE-6061-v0.patch, 
 HBASE-6061-v1.patch


 The requirePermission() check for admin operations on a table is currently 
 inconsistent.
 A table owner with CREATE rights (meaning the owner has created that table) 
 can enable/disable and delete the table, but needs ADMIN rights to 
 add/remove/modify a column.





[jira] [Commented] (HBASE-6062) preCheckAndPut/Delete() checks for READ when also a WRITE is performed

2012-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280534#comment-13280534
 ] 

Hadoop QA commented on HBASE-6062:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528497/HBASE-6062-v0.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 33 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.coprocessor.TestMasterObserver
  org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
  org.apache.hadoop.hbase.replication.TestMasterReplication

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1949//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1949//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1949//console

This message is automatically generated.

 preCheckAndPut/Delete() checks for READ when also a WRITE is performed
 --

 Key: HBASE-6062
 URL: https://issues.apache.org/jira/browse/HBASE-6062
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl, security
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6062-v0.patch


 preCheckAndPut() and preCheckAndDelete() check for READ when they also want 
 to WRITE... 
 For me, checking for WRITE permission is the right thing. 
 What do you say about that? Keep READ, or replace with WRITE?





[jira] [Commented] (HBASE-6044) copytable: remove rs.* parameters

2012-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280535#comment-13280535
 ] 

Hudson commented on HBASE-6044:
---

Integrated in HBase-TRUNK #2912 (See 
[https://builds.apache.org/job/HBase-TRUNK/2912/])
HBASE-6044 copytable: remove rs.* parameters (Revision 1341200)

 Result = FAILURE
jmhsieh : 
Files : 
* /hbase/trunk/src/docbkx/ops_mgt.xml
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/CopyTable.java


 copytable: remove rs.* parameters
 -

 Key: HBASE-6044
 URL: https://issues.apache.org/jira/browse/HBASE-6044
 Project: HBase
  Issue Type: New Feature
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: hbase-6044-92.patch, hbase-6044-v2.patch, 
 hbase-6044-v3.patch, hbase-6044-v4.patch, hbase-6044.patch


 In discussion of HBASE-6013 it was suggested that we remove these arguments 
 from 0.92+ (but keep in 0.90)





[jira] [Commented] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280540#comment-13280540
 ] 

Hadoop QA commented on HBASE-6061:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528491/HBASE-6061-v1.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 33 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
  org.apache.hadoop.hbase.replication.TestMasterReplication
  org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1950//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1950//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1950//console

This message is automatically generated.

 Fix ACL Admin Table inconsistent permission check
 ---

 Key: HBASE-6061
 URL: https://issues.apache.org/jira/browse/HBASE-6061
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl, security
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6061-0.92.patch, HBASE-6061-v0.patch, 
 HBASE-6061-v1.patch


 The requirePermission() check for admin operations on a table is currently 
 inconsistent.
 A table owner with CREATE rights (meaning the owner has created that table) 
 can enable/disable and delete the table, but needs ADMIN rights to 
 add/remove/modify a column.





[jira] [Commented] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280583#comment-13280583
 ] 

Hadoop QA commented on HBASE-6061:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528508/HBASE-6061-0.92.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 33 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestReplication
  
org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
  org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
  org.apache.hadoop.hbase.replication.TestMasterReplication

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1951//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1951//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1951//console

This message is automatically generated.

 Fix ACL Admin Table inconsistent permission check
 ---

 Key: HBASE-6061
 URL: https://issues.apache.org/jira/browse/HBASE-6061
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl, security
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6061-0.92.patch, HBASE-6061-v0.patch, 
 HBASE-6061-v1.patch


 The requirePermission() check for admin operations on a table is currently 
 inconsistent.
 A table owner with CREATE rights (meaning the owner has created that table) 
 can enable/disable and delete the table, but needs ADMIN rights to 
 add/remove/modify a column.





[jira] [Commented] (HBASE-6062) preCheckAndPut/Delete() checks for READ when also a WRITE is performed

2012-05-21 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280594#comment-13280594
 ] 

Andrew Purtell commented on HBASE-6062:
---

Patch looks good but please make sure TestAccessController includes tests for 
the change. 

 preCheckAndPut/Delete() checks for READ when also a WRITE is performed
 --

 Key: HBASE-6062
 URL: https://issues.apache.org/jira/browse/HBASE-6062
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl, security
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6062-v0.patch


 preCheckAndPut() and preCheckAndDelete() check for READ when they also want 
 to WRITE... 
 For me, checking for WRITE permission is the right thing. 
 What do you say about that? Keep READ, or replace with WRITE?





[jira] [Commented] (HBASE-6041) NullPointerException prevents the master from starting up

2012-05-21 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280595#comment-13280595
 ] 

Jimmy Xiang commented on HBASE-6041:


Yes, all tests pass. Thanks.

 NullPointerException prevents the master from starting up
 -

 Key: HBASE-6041
 URL: https://issues.apache.org/jira/browse/HBASE-6041
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.90.7

 Attachments: hbase-6041.patch


 This is 0.90 only.
 2012-05-04 14:27:57,913 FATAL org.apache.hadoop.hbase.master.HMaster: 
 Unhandled exception. Starting shutdown.
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hbase.master.AssignmentManager.regionOnline(AssignmentManager.java:731)
   at 
 org.apache.hadoop.hbase.master.AssignmentManager.processFailover(AssignmentManager.java:215)
   at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:419)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:293)
 2012-05-04 14:27:57,914 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
 2012-05-04 14:27:57,915 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
 server on 1433





[jira] [Commented] (HBASE-6041) NullPointerException prevents the master from starting up

2012-05-21 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280600#comment-13280600
 ] 

Zhihong Yu commented on HBASE-6041:
---

Integrated to 0.90 branch.

Thanks for the patch, Jimmy.

 NullPointerException prevents the master from starting up
 -

 Key: HBASE-6041
 URL: https://issues.apache.org/jira/browse/HBASE-6041
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.90.7

 Attachments: hbase-6041.patch


 This is 0.90 only.
 2012-05-04 14:27:57,913 FATAL org.apache.hadoop.hbase.master.HMaster: 
 Unhandled exception. Starting shutdown.
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hbase.master.AssignmentManager.regionOnline(AssignmentManager.java:731)
   at 
 org.apache.hadoop.hbase.master.AssignmentManager.processFailover(AssignmentManager.java:215)
   at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:419)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:293)
 2012-05-04 14:27:57,914 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
 2012-05-04 14:27:57,915 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
 server on 1433





[jira] [Updated] (HBASE-6043) Add Increment Coalescing in thrift.

2012-05-21 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-6043:
-

Attachment: HBASE-6043-1.patch

 Add Increment Coalescing in thrift.
 ---

 Key: HBASE-6043
 URL: https://issues.apache.org/jira/browse/HBASE-6043
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
 Attachments: HBASE-6043-0.patch, HBASE-6043-1.patch


 Since the Thrift server uses the client API, reducing the number of RPCs 
 greatly speeds up increments.
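
The coalescing idea can be sketched as follows — a simplified, hypothetical model in Python (the actual patch is Java inside the Thrift server, and all names here are invented for illustration): increments to the same (table, row, column) are summed locally and flushed as one RPC per distinct cell.

```python
from collections import defaultdict

class IncrementCoalescer:
    """Toy model of increment coalescing (hypothetical name): pending
    increments to the same (table, row, column) are summed locally and
    flushed later, one RPC per distinct cell."""

    def __init__(self, send_increment):
        self.pending = defaultdict(int)       # (table, row, column) -> amount
        self.send_increment = send_increment  # stands in for the HBase client call

    def increment(self, table, row, column, amount=1):
        self.pending[(table, row, column)] += amount

    def flush(self):
        # One RPC per distinct cell instead of one RPC per increment.
        rpcs = 0
        for (table, row, column), amount in self.pending.items():
            self.send_increment(table, row, column, amount)
            rpcs += 1
        self.pending.clear()
        return rpcs

sent = []
c = IncrementCoalescer(lambda t, r, q, a: sent.append((t, r, q, a)))
for _ in range(1000):
    c.increment("t1", "row1", "f:ctr")
print(c.flush())  # prints 1: a thousand increments became a single RPC
```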

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5979) Non-pread DFSInputStreams should be associated with scanners, not HFile.Readers

2012-05-21 Thread Kannan Muthukkaruppan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280634#comment-13280634
 ] 

Kannan Muthukkaruppan commented on HBASE-5979:
--

Todd: If we always use positional reads, we don't get the benefit of HDFS 
sending the rest of the HDFS block, correct? So I didn't quite catch your recent 
suggestion. Did you mean: issue positional reads, but explicitly read a much 
larger chunk (in the Scan case) than just the current block?

 Non-pread DFSInputStreams should be associated with scanners, not 
 HFile.Readers
 ---

 Key: HBASE-5979
 URL: https://issues.apache.org/jira/browse/HBASE-5979
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Reporter: Todd Lipcon

 Currently, every HFile.Reader has a single DFSInputStream, which it uses to 
 service all gets and scans. For gets, we use the positional read API (aka 
 pread) and for scans we use a synchronized block to seek, then read. The 
 advantage of pread is that it doesn't hold any locks, so multiple gets can 
 proceed at the same time. The advantage of seek+read for scans is that the 
 datanode starts to send the entire rest of the HDFS block, rather than just 
 the single hfile block necessary. So, in a single thread, pread is faster for 
 gets, and seek+read is faster for scans since you get a strong pipelining 
 effect.
 However, in a multi-threaded case where there are multiple scans (including 
 scans which are actually part of compactions), the seek+read strategy falls 
 apart, since only one scanner may be reading at a time. Additionally, a large 
 amount of wasted IO is generated on the datanode side, and we get none of the 
 earlier-mentioned advantages.
 In one test, I switched scans to always use pread, and saw a 5x improvement 
 in throughput of the YCSB scan-only workload, since it previously was 
 completely blocked by contention on the DFSIS lock.
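
The contention described above can be illustrated with a toy model (Python for brevity; the names are invented and are not HBase/HDFS APIs): seek+read mutates a shared cursor and therefore needs a lock, while pread carries its offset in the call and touches no shared state.

```python
import threading

class SharedStream:
    """Toy stand-in (invented name) for the single DFSInputStream shared by
    every reader of an HFile: seek+read mutates a shared cursor, so callers
    must serialize on a lock; pread carries its offset in the call."""

    def __init__(self, data):
        self.data = data
        self.pos = 0
        self.lock = threading.Lock()

    def seek_and_read(self, pos, length):
        with self.lock:                    # only one scanner at a time
            self.pos = pos
            chunk = self.data[self.pos:self.pos + length]
            self.pos += len(chunk)
            return chunk

    def pread(self, pos, length):
        # Positional read: no shared state touched, no lock needed,
        # so concurrent gets proceed in parallel.
        return self.data[pos:pos + length]

s = SharedStream(bytes(range(256)))
assert s.pread(10, 4) == bytes([10, 11, 12, 13])   # cursor untouched
assert s.pos == 0
assert s.seek_and_read(10, 4) == bytes([10, 11, 12, 13])
assert s.pos == 14
```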

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5979) Non-pread DFSInputStreams should be associated with scanners, not HFile.Readers

2012-05-21 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280640#comment-13280640
 ] 

Todd Lipcon commented on HBASE-5979:


Hey Kannan,

Sorry, let me elaborate on that suggestion:

The idea is to make a new FSReader implementation, which only has one API. That 
API would look like the current positional read call (i.e. take a position and 
a length).

Internally, it would have a pool of cached DFSInputStreams, and remember the 
position for each of them. Each of the input streams would be referencing the 
same file. When a read request comes in, it is matched against the pooled 
streams: if it is within N bytes forward from the current position of one of 
the streams, then a seek and read would be issued, synchronized on that stream. 
Otherwise, any random stream would be chosen and a positional read would be 
issued. Separately, we can track the last N positional reads: if we detect a 
sequential pattern in the position reads, we can take one of the pooled input 
streams and seek to the next predicted offset, so that future reads get the 
sequential benefit.
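
A rough sketch of this pooled-reader idea (Python, invented names; a simplification of what an actual Java FSReader implementation would do): each pooled stream remembers its position; a request starting within N bytes ahead of some stream's position is served as seek+read on that stream, anything else falls back to pread. The predictive seeding of streams from recent pread patterns is omitted.

```python
class PooledReader:
    """Toy sketch (invented name) of the proposed FSReader: a pool of
    streams over the same file, each remembering its last position. A read
    starting within max_gap bytes ahead of a pooled stream's cursor reuses
    that stream (seek+read); anything else becomes a positional read."""

    def __init__(self, data, pool_size=3, max_gap=64 * 1024):
        self.data = data
        self.max_gap = max_gap
        self.positions = [0] * pool_size   # remembered cursor per pooled stream

    def read(self, pos, length):
        for i, p in enumerate(self.positions):
            if 0 <= pos - p <= self.max_gap:
                # Looks sequential: reuse this stream and advance its cursor.
                self.positions[i] = pos + length
                return self.data[pos:pos + length], "seek+read"
        # Looks random: serve it as pread, leaving every cursor in place.
        return self.data[pos:pos + length], "pread"

r = PooledReader(bytes(1 << 20))
assert r.read(0, 100)[1] == "seek+read"       # matches a stream at offset 0
assert r.read(100, 100)[1] == "seek+read"     # continues the same stream
assert r.read(900_000, 100)[1] == "pread"     # far ahead of every cursor
```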

 Non-pread DFSInputStreams should be associated with scanners, not 
 HFile.Readers
 ---

 Key: HBASE-5979
 URL: https://issues.apache.org/jira/browse/HBASE-5979
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Reporter: Todd Lipcon

 Currently, every HFile.Reader has a single DFSInputStream, which it uses to 
 service all gets and scans. For gets, we use the positional read API (aka 
 pread) and for scans we use a synchronized block to seek, then read. The 
 advantage of pread is that it doesn't hold any locks, so multiple gets can 
 proceed at the same time. The advantage of seek+read for scans is that the 
 datanode starts to send the entire rest of the HDFS block, rather than just 
 the single hfile block necessary. So, in a single thread, pread is faster for 
 gets, and seek+read is faster for scans since you get a strong pipelining 
 effect.
 However, in a multi-threaded case where there are multiple scans (including 
 scans which are actually part of compactions), the seek+read strategy falls 
 apart, since only one scanner may be reading at a time. Additionally, a large 
 amount of wasted IO is generated on the datanode side, and we get none of the 
 earlier-mentioned advantages.
 In one test, I switched scans to always use pread, and saw a 5x improvement 
 in throughput of the YCSB scan-only workload, since it previously was 
 completely blocked by contention on the DFSIS lock.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HBASE-4686) [89-fb] Fix per-store metrics aggregation

2012-05-21 Thread Mikhail Bautin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin resolved HBASE-4686.
---

Resolution: Fixed

This has already been committed to trunk.

 [89-fb] Fix per-store metrics aggregation 
 --

 Key: HBASE-4686
 URL: https://issues.apache.org/jira/browse/HBASE-4686
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: D87.1.patch, D87.2.patch, D87.3.patch, D87.4.patch, 
 HBASE-4686-TestRegionServerMetics-and-Store-metric-a-20111027134023-cc718144.patch,
  
 HBASE-4686-jira-89-fb-Fix-per-store-metrics-aggregat-20111027152723-05bea421.patch


 In r1182034 per-Store metrics were broken, because the aggregation of 
 StoreFile metrics over all stores in a region was replaced by overwriting them 
 on every update. We saw these metrics drop by a factor of numRegions on a 
 production cluster -- thanks to Kannan for noticing this!  We need to fix the 
 metrics and add a unit test to ensure regressions like this don't happen in 
 the future.
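
The difference between aggregating and overwriting can be shown with a minimal sketch (Python for brevity; the metric name and function names are hypothetical):

```python
def aggregate_store_metrics(stores):
    """Intended behavior: sum StoreFile metrics across all stores in a region."""
    totals = {}
    for store in stores:
        for name, value in store.items():
            totals[name] = totals.get(name, 0) + value
    return totals

def overwrite_store_metrics(stores):
    """The regression: each store overwrites the previous value, so the final
    number is only the last store's share of the true total."""
    totals = {}
    for store in stores:
        for name, value in store.items():
            totals[name] = value
    return totals

stores = [{"storefileSizeMB": 100}, {"storefileSizeMB": 100}, {"storefileSizeMB": 100}]
assert aggregate_store_metrics(stores)["storefileSizeMB"] == 300
assert overwrite_store_metrics(stores)["storefileSizeMB"] == 100  # 1/3 of the truth
```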

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6043) Add Increment Coalescing in thrift.

2012-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280643#comment-13280643
 ] 

Hadoop QA commented on HBASE-6043:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528531/HBASE-6043-1.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 35 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
  org.apache.hadoop.hbase.replication.TestMasterReplication

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1952//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1952//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1952//console

This message is automatically generated.

 Add Increment Coalescing in thrift.
 ---

 Key: HBASE-6043
 URL: https://issues.apache.org/jira/browse/HBASE-6043
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
 Attachments: HBASE-6043-0.patch, HBASE-6043-1.patch, 
 HBASE-6043-2.patch


 Since the Thrift server uses the client API, reducing the number of RPCs 
 greatly speeds up increments.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6063) Replication related failures on trunk after HBASE-5453

2012-05-21 Thread Gregory Chanan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gregory Chanan updated HBASE-6063:
--

Status: Patch Available  (was: Open)

 Replication related failures on trunk after HBASE-5453
 --

 Key: HBASE-6063
 URL: https://issues.apache.org/jira/browse/HBASE-6063
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Gregory Chanan
Assignee: Gregory Chanan
 Attachments: HBASE-6063.patch


 HBASE-5453 added this line:
 {code}
 return ClusterId.parseFrom(data).toString();
 {code}
 in function:
 public static String readClusterIdZNode(ZooKeeperWatcher watcher)
 but ClusterId does not implement toString(), so you get log messages like:
 2012-05-21 16:46:31,256 ERROR 
 [RegionServer:0;cloudera-vm,60456,1337643971995-EventThread] 
 zookeeper.ClientCnxn$EventThread(523): Error while calling watcher 
 java.lang.IllegalArgumentException: Invalid UUID string: 
 org.apache.hadoop.hbase.ClusterId@5563d208
   at java.util.UUID.fromString(UUID.java:204)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.init(ReplicationSource.java:192)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.getReplicationSource(ReplicationSourceManager.java:328)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.addSource(ReplicationSourceManager.java:206)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$PeersWatcher.nodeChildrenChanged(ReplicationSourceManager.java:505)
   at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:300)
   at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:521)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497)
 2012-05-21 16:46:31,256 ERROR 
 [RegionServer:0;cloudera-vm,50926,1337643981835-EventThread] 
 zookeeper.ClientCnxn$EventThread(523): Error while calling watcher 
 and replication fails because the ClusterId does not match what is expected.  
 Patch coming soon.
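
The failure mode is the default Object.toString() output ('ClassName@hex') being handed to UUID.fromString(). A Python analogue — assuming nothing beyond the log above; the default repr plays the role of Java's default toString():

```python
import uuid

class ClusterId:
    """Python analogue of the bug: without a proper string conversion, the
    default repr (like Java's default Object.toString()) yields
    'ClassName@...'-style text that UUID parsing rejects."""
    def __init__(self, uid):
        self.uid = uid

cid = ClusterId(uuid.uuid4())
bad = repr(cid)          # e.g. '<__main__.ClusterId object at 0x7f...>'
try:
    uuid.UUID(bad)
    parsed = True
except ValueError:       # mirrors java.lang.IllegalArgumentException
    parsed = False
assert parsed is False

good = str(cid.uid)      # what a real toString()/serialization should emit
assert uuid.UUID(good) == cid.uid
```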

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6044) copytable: remove rs.* parameters

2012-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280649#comment-13280649
 ] 

Hudson commented on HBASE-6044:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #13 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/13/])
HBASE-6044 copytable: remove rs.* parameters (Revision 1341200)

 Result = FAILURE
jmhsieh : 
Files : 
* /hbase/trunk/src/docbkx/ops_mgt.xml
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/CopyTable.java


 copytable: remove rs.* parameters
 -

 Key: HBASE-6044
 URL: https://issues.apache.org/jira/browse/HBASE-6044
 Project: HBase
  Issue Type: New Feature
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: hbase-6044-92.patch, hbase-6044-v2.patch, 
 hbase-6044-v3.patch, hbase-6044-v4.patch, hbase-6044.patch


 In discussion of HBASE-6013 it was suggested that we remove these arguments 
 from 0.92+ (but keep them in 0.90).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6057) Change some tests categories to optimize build time

2012-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280651#comment-13280651
 ] 

Hudson commented on HBASE-6057:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #13 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/13/])
HBASE-6057  Change some tests categories to optimize build time (nkeywal 
via JD) (Revision 1341211)

 Result = FAILURE
jdcryans : 
Files : 
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/ipc/TestPBOnWritableRpc.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestDefaultLoadBalancer.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/metrics/TestMetricsMBeanBase.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/monitoring/TestMemoryBoundedLogMessageBuffer.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/monitoring/TestTaskMonitor.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRollingNoCluster.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestPoolMap.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/zookeeper/TestHQuorumPeer.java


 Change some tests categories to optimize build time
 ---

 Key: HBASE-6057
 URL: https://issues.apache.org/jira/browse/HBASE-6057
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 6057.v1.patch


 Some tests categorized as small take more than 15s: it's better if they are 
 executed in parallel with the medium tests.
 Some medium tests last less than 2s: it's better to have them executed with 
 the small tests: we save a fork.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280650#comment-13280650
 ] 

Hudson commented on HBASE-5757:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #13 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/13/])
HBASE-5757 TableInputFormat should handle as many errors as possible (Jan 
Lukavsky) (Revision 1341132)

 Result = FAILURE
jmhsieh : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapred/TableRecordReaderImpl.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapred/TestTableInputFormat.java


 TableInputFormat should handle as many errors as possible
 -

 Key: HBASE-5757
 URL: https://issues.apache.org/jira/browse/HBASE-5757
 Project: HBase
  Issue Type: Bug
  Components: mapred, mapreduce
Affects Versions: 0.90.6
Reporter: Jan Lukavsky
Assignee: Jan Lukavsky
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: 5757-trunk-v2.txt, HBASE-5757-trunk-r1341041.patch, 
 HBASE-5757.patch, HBASE-5757.patch, hbase-5757-92.patch


 Prior to HBASE-4196 there was different handling of IOExceptions thrown from 
 scanner in mapred and mapreduce API. The patch to HBASE-4196 unified this 
 handling so that if exception is caught a reconnect is attempted (without 
 bothering the mapred client). After that, HBASE-4269 changed this behavior 
 back, but in both mapred and mapreduce APIs. The question is, is there any 
 reason not to handle all errors that the input format can handle? In other 
 words, why not try to reissue the request after *any* IOException? I see the 
 following disadvantages of current approach
  * the client may see exceptions like LeaseException and 
 ScannerTimeoutException if he fails to process all fetched data in timeout
  * to avoid ScannerTimeoutException the client must raise 
 hbase.regionserver.lease.period
  * timeouts for tasks are already configured in mapred.task.timeout, so this 
 seems to me a bit redundant, because typically one needs to update both these 
 parameters
  * I don't see any possibility to get rid of LeaseException (this is 
 configured on server side)
 I think all of these issues would be gone, if the DoNotRetryIOException would 
 not be rethrown. -On the other hand, handling errors in InputFormat has 
 disadvantage, that it may hide from the user some inefficiency. Eg. if I have 
 very big scanner.caching, and I manage to process only a few rows in timeout, 
 I will end up with single row being fetched many times (and will not be 
 explicitly notified about this). Could we solve this problem by adding some 
 counter to the InputFormat?-
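
The approach committed here — restart the scan just after the last successfully returned row instead of surfacing the exception — can be sketched as follows (Python, invented names; the real Java code in TableRecordReaderImpl additionally rethrows DoNotRetryIOException and bounds retries):

```python
class RetryingRecordReader:
    """Sketch (invented name) of restart-after-last-good-row: on a scanner
    IOError, reopen the scan just past the last row already returned instead
    of failing the map task."""

    def __init__(self, open_scanner, start_row=""):
        self.open_scanner = open_scanner   # callable: start_row -> row iterator
        self.last_row = start_row
        self.scanner = open_scanner(start_row)

    def next_row(self):
        try:
            row = next(self.scanner)
        except IOError:
            # Transparent single retry from the last successfully read row.
            self.scanner = self.open_scanner(self.last_row)
            row = next(self.scanner)
        except StopIteration:
            return None
        self.last_row = row
        return row

rows = ["a", "b", "c", "d"]
state = {"failed": False}

def open_scanner(start):
    def gen():
        for r in rows:
            if r <= start:
                continue                  # resume strictly after start row
            if r == "c" and not state["failed"]:
                state["failed"] = True
                raise IOError("simulated scanner lease expiry")
            yield r
    return gen()

reader = RetryingRecordReader(open_scanner)
out = []
while (row := reader.next_row()) is not None:
    out.append(row)
assert out == ["a", "b", "c", "d"]   # the mid-scan failure is invisible
```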

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5882) Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor

2012-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280653#comment-13280653
 ] 

Hudson commented on HBASE-5882:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #13 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/13/])
HBASE-5882 Prcoess RIT on master restart can try assigning the region if 
the region is found on a dead server instead of waiting for Timeout Monitor 
(Ashutosh) (Revision 1341110)

 Result = FAILURE
ramkrishna : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java


 Prcoess RIT on master restart can try assigning the region if the region is 
 found on a dead server instead of waiting for Timeout Monitor
 -

 Key: HBASE-5882
 URL: https://issues.apache.org/jira/browse/HBASE-5882
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.6, 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: Ashutosh Jindal
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5882_v5.patch, HBASE-5882_v6.patch, 
 hbase_5882.patch, hbase_5882_V2.patch, hbase_5882_V3.patch, 
 hbase_5882_V4.patch


 Currently on master restart, if processRIT finds a region on a dead server, it 
 avoids a new assignment so that the timeout monitor can take care of it.
 This case is more prominent if the node is found in RS_ZK_REGION_OPENING 
 state. I think we can handle this by triggering a new assignment with a new 
 plan.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280652#comment-13280652
 ] 

Hudson commented on HBASE-6061:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #13 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/13/])
HBASE-6061 Fix ACL Admin Table inconsistent permission check (Matteo 
Bertozzi) (Revision 1341265)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java


 Fix ACL Admin Table inconsistent permission check
 ---

 Key: HBASE-6061
 URL: https://issues.apache.org/jira/browse/HBASE-6061
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl, security
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6061-0.92.patch, HBASE-6061-v0.patch, 
 HBASE-6061-v1.patch


 The requirePermission() check for admin operations on a table is currently 
 inconsistent.
 A table owner with CREATE rights (meaning the owner created that table) 
 can enable/disable and delete the table, but needs ADMIN rights to 
 add/remove/modify a column.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-6064) Add timestamp to Mutation Thrift API

2012-05-21 Thread Mikhail Bautin (JIRA)
Mikhail Bautin created HBASE-6064:
-

 Summary: Add timestamp to Mutation Thrift API
 Key: HBASE-6064
 URL: https://issues.apache.org/jira/browse/HBASE-6064
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin


We need to be able to specify per-mutation timestamps in the HBase Thrift API. 
If the timestamp is not specified, the timestamp passed to the Thrift API 
method itself (mutateRowTs/mutateRowsTs) should be used.
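
The proposed fallback can be stated as a one-line rule (Python sketch; the helper name is hypothetical, not part of the Thrift API):

```python
def effective_timestamp(mutation_ts, call_ts):
    """Hypothetical helper: a per-mutation timestamp, when present, overrides
    the timestamp passed to mutateRowTs/mutateRowsTs; None models 'not set'."""
    return call_ts if mutation_ts is None else mutation_ts

assert effective_timestamp(None, 1337000000000) == 1337000000000
assert effective_timestamp(1337000000042, 1337000000000) == 1337000000042
```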

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6043) Add Increment Coalescing in thrift.

2012-05-21 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280657#comment-13280657
 ] 

Elliott Clark commented on HBASE-6043:
--

Looks like those tests are failing on trunk right now.

 Add Increment Coalescing in thrift.
 ---

 Key: HBASE-6043
 URL: https://issues.apache.org/jira/browse/HBASE-6043
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
 Attachments: HBASE-6043-0.patch, HBASE-6043-1.patch, 
 HBASE-6043-2.patch


 Since the Thrift server uses the client API, reducing the number of RPCs 
 greatly speeds up increments.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6063) Replication related failures on trunk after HBASE-5453

2012-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280673#comment-13280673
 ] 

Hadoop QA commented on HBASE-6063:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528539/HBASE-6063.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 33 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1954//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1954//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1954//console

This message is automatically generated.

 Replication related failures on trunk after HBASE-5453
 --

 Key: HBASE-6063
 URL: https://issues.apache.org/jira/browse/HBASE-6063
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Gregory Chanan
Assignee: Gregory Chanan
 Attachments: HBASE-6063.patch


 HBASE-5453 added this line:
 {code}
 return ClusterId.parseFrom(data).toString();
 {code}
 in function:
 public static String readClusterIdZNode(ZooKeeperWatcher watcher)
 but ClusterId does not implement toString(), so you get log messages like:
 2012-05-21 16:46:31,256 ERROR 
 [RegionServer:0;cloudera-vm,60456,1337643971995-EventThread] 
 zookeeper.ClientCnxn$EventThread(523): Error while calling watcher 
 java.lang.IllegalArgumentException: Invalid UUID string: 
 org.apache.hadoop.hbase.ClusterId@5563d208
   at java.util.UUID.fromString(UUID.java:204)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.init(ReplicationSource.java:192)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.getReplicationSource(ReplicationSourceManager.java:328)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.addSource(ReplicationSourceManager.java:206)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$PeersWatcher.nodeChildrenChanged(ReplicationSourceManager.java:505)
   at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:300)
   at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:521)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497)
 2012-05-21 16:46:31,256 ERROR 
 [RegionServer:0;cloudera-vm,50926,1337643981835-EventThread] 
 zookeeper.ClientCnxn$EventThread(523): Error while calling watcher 
 and replication fails because the ClusterId does not match what is expected.  
 Patch coming soon.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6043) Add Increment Coalescing in thrift.

2012-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280676#comment-13280676
 ] 

Hadoop QA commented on HBASE-6043:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528534/HBASE-6043-2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestMasterReplication
  org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
  org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol
  org.apache.hadoop.hbase.replication.TestReplication

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1953//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1953//console

This message is automatically generated.

 Add Increment Coalescing in thrift.
 ---

 Key: HBASE-6043
 URL: https://issues.apache.org/jira/browse/HBASE-6043
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
 Attachments: HBASE-6043-0.patch, HBASE-6043-1.patch, 
 HBASE-6043-2.patch


 Since the Thrift server uses the client API, reducing the number of RPCs 
 greatly speeds up increments.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280680#comment-13280680
 ] 

Hudson commented on HBASE-6061:
---

Integrated in HBase-TRUNK #2914 (See 
[https://builds.apache.org/job/HBase-TRUNK/2914/])
HBASE-6061 Fix ACL Admin Table inconsistent permission check (Matteo 
Bertozzi) (Revision 1341265)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java


 Fix ACL Admin Table inconsistent permission check
 ---

 Key: HBASE-6061
 URL: https://issues.apache.org/jira/browse/HBASE-6061
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl, security
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6061-0.92.patch, HBASE-6061-v0.patch, 
 HBASE-6061-v1.patch


 the requirePermission() check for admin operation on a table is currently 
 inconsistent.
 Table Owner with CREATE rights (that means, the owner has created that table) 
 can enable/disable and delete the table but needs ADMIN rights to 
 add/remove/modify a column.





[jira] [Commented] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280681#comment-13280681
 ] 

Hudson commented on HBASE-6061:
---

Integrated in HBase-0.94 #207 (See 
[https://builds.apache.org/job/HBase-0.94/207/])
HBASE-6061 Fix ACL Admin Table inconsistent permission check (Matteo 
Bertozzi) (Revision 1341267)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/branches/0.94/security/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java


 Fix ACL Admin Table inconsistent permission check
 ---

 Key: HBASE-6061
 URL: https://issues.apache.org/jira/browse/HBASE-6061
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl, security
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6061-0.92.patch, HBASE-6061-v0.patch, 
 HBASE-6061-v1.patch


 the requirePermission() check for admin operation on a table is currently 
 inconsistent.
 Table Owner with CREATE rights (that means, the owner has created that table) 
 can enable/disable and delete the table but needs ADMIN rights to 
 add/remove/modify a column.





[jira] [Commented] (HBASE-6061) Fix ACL Admin Table inconsistent permission check

2012-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280694#comment-13280694
 ] 

Hudson commented on HBASE-6061:
---

Integrated in HBase-0.92 #416 (See 
[https://builds.apache.org/job/HBase-0.92/416/])
HBASE-6061 Fix ACL Admin Table inconsistent permission check (Matteo 
Bertozzi) (Revision 1341268)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/security/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java


 Fix ACL Admin Table inconsistent permission check
 ---

 Key: HBASE-6061
 URL: https://issues.apache.org/jira/browse/HBASE-6061
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl, security
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6061-0.92.patch, HBASE-6061-v0.patch, 
 HBASE-6061-v1.patch


 the requirePermission() check for admin operation on a table is currently 
 inconsistent.
 Table Owner with CREATE rights (that means, the owner has created that table) 
 can enable/disable and delete the table but needs ADMIN rights to 
 add/remove/modify a column.





[jira] [Updated] (HBASE-6065) Log for flush would append a non-sequential edit in the hlog, may cause data loss

2012-05-21 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-6065:


Component/s: wal
   Assignee: chunhui shen
Summary: Log for flush would append a non-sequential edit in the hlog, 
may cause data loss  (was: Log for flush would append a non-sequential edit in 
the hlog, may cause data los)

 Log for flush would append a non-sequential edit in the hlog, may cause data 
 loss
 -

 Key: HBASE-6065
 URL: https://issues.apache.org/jira/browse/HBASE-6065
 Project: HBase
  Issue Type: Bug
  Components: wal
Reporter: chunhui shen
Assignee: chunhui shen







[jira] [Created] (HBASE-6065) Log for flush would append a non-sequential edit in the hlog, may cause data los

2012-05-21 Thread chunhui shen (JIRA)
chunhui shen created HBASE-6065:
---

 Summary: Log for flush would append a non-sequential edit in the 
hlog, may cause data los
 Key: HBASE-6065
 URL: https://issues.apache.org/jira/browse/HBASE-6065
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen








[jira] [Updated] (HBASE-6065) Log for flush would append a non-sequential edit in the hlog, may cause data loss

2012-05-21 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-6065:


Attachment: HBASE-6065.patch

In the patch, I call obtainSeqNum() for the flush log edit rather than using the 
seqId from the parameter.
So we can ensure the log seq ids in the file are always sequential.
BTW, do we use the flush log edit anywhere?

There is another solution: name the split log file after the real max seq id 
rather than the last edit's seq id.
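The monotonicity point above can be sketched in a few lines. This is a hypothetical simplification, not the patch code: the WalSketch class and its method names are illustrative, and only obtainSeqNum mirrors a real HLog method.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the fix idea: the flush-complete marker obtains a fresh sequence
// number at append time instead of reusing the one captured when the flush
// started, so WAL entries stay monotonic. WalSketch is hypothetical.
class WalSketch {
    private final AtomicLong logSeqNum = new AtomicLong(0);

    long obtainSeqNum() {                // mirrors HLog#obtainSeqNum
        return logSeqNum.incrementAndGet();
    }

    // Before the patch: completeCacheFlush appended the seq id captured at
    // startCacheFlush, which can be lower than edits written during the flush.
    long flushMarkerSeqOld(long seqIdFromStartCacheFlush) {
        return seqIdFromStartCacheFlush;
    }

    // With the patch: obtain a new seq id when the marker is appended.
    long flushMarkerSeqNew() {
        return obtainSeqNum();
    }
}
```

With the sequence from the scenario in this issue (two puts, flush start, a put during the flush), the old marker lands below the last edit while the new one lands above it.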

 Log for flush would append a non-sequential edit in the hlog, may cause data 
 loss
 -

 Key: HBASE-6065
 URL: https://issues.apache.org/jira/browse/HBASE-6065
 Project: HBase
  Issue Type: Bug
  Components: wal
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6065.patch


 After completing flush region, we will append a log edit in the hlog file 
 through HLog#completeCacheFlush.
 {code}
 public void completeCacheFlush(final byte [] encodedRegionName,
   final byte [] tableName, final long logSeqId, final boolean 
 isMetaRegion)
 {
 ...
 HLogKey key = makeKey(encodedRegionName, tableName, logSeqId,
 System.currentTimeMillis(), HConstants.DEFAULT_CLUSTER_ID);
 ...
 }
 {code}
 When we make the hlog key, we use the seqId from the parameter, which is 
 generated by HLog#startCacheFlush.
 Here, we may append an edit with a lower seq id than the last edit in the 
 hlog file. If it is the last edit in the file, it may cause data loss, 
 because:
 {code}
 HRegion#replayRecoveredEditsIfAny {
 ...
 maxSeqId = Math.abs(Long.parseLong(fileName));
 if (maxSeqId <= minSeqId) {
   String msg = "Maximum sequenceid for this log is " + maxSeqId
       + " and minimum sequenceid for the region is " + minSeqId
       + ", skipped the whole file, path=" + edits;
   LOG.debug(msg);
   continue;
 }
 ...
 }
 {code}
 We may skip the split log file, because we use the last edit's seq id as 
 its file name and consider that seqId the max seq id in the log file.





[jira] [Updated] (HBASE-6033) Adding some fuction to check if a table/region is in compaction

2012-05-21 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6033:
---

Status: Patch Available  (was: Open)

 Adding some fuction to check if a table/region is in compaction
 ---

 Key: HBASE-6033
 URL: https://issues.apache.org/jira/browse/HBASE-6033
 Project: HBase
  Issue Type: New Feature
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: hbase-6033_v2.patch, table_ui.png


 This feature will be helpful to find out if a major compaction is going on.
 We can show if it is in any minor compaction too.





[jira] [Updated] (HBASE-6033) Adding some fuction to check if a table/region is in compaction

2012-05-21 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6033:
---

Attachment: hbase-6033_v2.patch

 Adding some fuction to check if a table/region is in compaction
 ---

 Key: HBASE-6033
 URL: https://issues.apache.org/jira/browse/HBASE-6033
 Project: HBase
  Issue Type: New Feature
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: hbase-6033_v2.patch, table_ui.png


 This feature will be helpful to find out if a major compaction is going on.
 We can show if it is in any minor compaction too.





[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

2012-05-21 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280704#comment-13280704
 ] 

chunhui shen commented on HBASE-6059:
-

bq.If majorCompaction is false, we still need to check !kvs.isEmpty(), right?
Yes, I think this concerns only major compaction; minor compaction retains the 
delete markers, so there is no problem there.
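The replay bug discussed here can be illustrated with a toy model. This is a hypothetical sketch, not the patch code: ReplaySketch and both method names are invented, and the map simply stands in for the per-store maximum sequence ids.

```java
import java.util.Map;

// Toy model of the replay decision. maxStoreSeqId maps family -> largest seq
// id already persisted in that family's store files. Using the minimum across
// families (the buggy behaviour) replays edits that a flushed and compacted
// family has already seen, resurrecting a put whose delete was compacted away.
class ReplaySketch {
    // Buggy variant: compare against the minimum store seq id of the region.
    static boolean replayedWithMin(Map<String, Long> maxStoreSeqId, long editSeq) {
        long min = maxStoreSeqId.values().stream()
                .mapToLong(Long::longValue).min().orElse(0L);
        return editSeq > min;
    }

    // Per-family variant: compare against the edit's own family.
    static boolean replayedPerFamily(Map<String, Long> maxStoreSeqId,
                                     String family, long editSeq) {
        return editSeq > maxStoreSeqId.getOrDefault(family, 0L);
    }
}
```

In the scenario from this issue, cf2 has no store files (seq id 0), so the region-wide minimum is 0 and the old put on cf1 is replayed even though cf1 has long since flushed past it.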


 Replaying recovered edits would make deleted data exist again
 -

 Key: HBASE-6059
 URL: https://issues.apache.org/jira/browse/HBASE-6059
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch


 When we replay recovered edits, we use the minSeqId of the Store, which may 
 cause deleted data to appear again.
 Let's see how it happens. Suppose the region with two families(cf1,cf2)
 1.put one data to the region (put r1,cf1:q1,v1)
 2.move the region from server A to server B.
 3.delete the data put by step 1(delete r1)
 4.flush this region.
 5.make major compaction for this region
 6.move the region from server B to server A.
 7.Abort server A
 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
 (When we replay recovered edits, we used the minSeqId of Store, because cf2 
 has no store files, so its seqId is 0, so the edit log of put data will be 
 replayed to the region)





[jira] [Commented] (HBASE-6055) Snapshots in HBase 0.96

2012-05-21 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280706#comment-13280706
 ] 

Zhihong Yu commented on HBASE-6055:
---

The design document is very good.
Will get back to reviewing HBASE-5547 first.

 Snapshots in HBase 0.96
 ---

 Key: HBASE-6055
 URL: https://issues.apache.org/jira/browse/HBASE-6055
 Project: HBase
  Issue Type: New Feature
  Components: client, master, regionserver, zookeeper
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.96.0

 Attachments: Snapshots in HBase.docx


 Continuation of HBASE-50 for the current trunk. Since the implementation has 
 drastically changed, opening as a new ticket.





[jira] [Commented] (HBASE-6065) Log for flush would append a non-sequential edit in the hlog, may cause data loss

2012-05-21 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280707#comment-13280707
 ] 

ramkrishna.s.vasudevan commented on HBASE-6065:
---

So this applies only to 0.94 and above, right?

 Log for flush would append a non-sequential edit in the hlog, may cause data 
 loss
 -

 Key: HBASE-6065
 URL: https://issues.apache.org/jira/browse/HBASE-6065
 Project: HBase
  Issue Type: Bug
  Components: wal
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6065.patch


 After completing flush region, we will append a log edit in the hlog file 
 through HLog#completeCacheFlush.
 {code}
 public void completeCacheFlush(final byte [] encodedRegionName,
   final byte [] tableName, final long logSeqId, final boolean 
 isMetaRegion)
 {
 ...
 HLogKey key = makeKey(encodedRegionName, tableName, logSeqId,
 System.currentTimeMillis(), HConstants.DEFAULT_CLUSTER_ID);
 ...
 }
 {code}
 When we make the hlog key, we use the seqId from the parameter, which is 
 generated by HLog#startCacheFlush.
 Here, we may append an edit with a lower seq id than the last edit in the 
 hlog file. If it is the last edit in the file, it may cause data loss, 
 because:
 {code}
 HRegion#replayRecoveredEditsIfAny {
 ...
 maxSeqId = Math.abs(Long.parseLong(fileName));
 if (maxSeqId <= minSeqId) {
   String msg = "Maximum sequenceid for this log is " + maxSeqId
       + " and minimum sequenceid for the region is " + minSeqId
       + ", skipped the whole file, path=" + edits;
   LOG.debug(msg);
   continue;
 }
 ...
 }
 {code}
 We may skip the split log file, because we use the last edit's seq id as 
 its file name and consider that seqId the max seq id in the log file.





[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect

2012-05-21 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280709#comment-13280709
 ] 

Anoop Sam John commented on HBASE-5974:
---

Thanks for the review Todd
{quote}
why do we need the new RegionScannerWithCookie class? why not add the cookie to 
RegionScanner itself? 
{quote}
I was also thinking along those lines initially. There are two reasons why I 
avoided doing the seqNo work within the RegionScanner:
1. In case of caching > 1 there will be more than one call to 
RegionScanner.next(). Passing the client-sent seqNo (I am avoiding "cookie" as 
I agree with you about renaming it) to the RegionScanner would change the 
interface, and this interface is exposed.
2. This is the main reason. Through the CP usage we have exposed the 
RegionScanner, and via the preScannerOpen() and postScannerOpen() impls a user 
can now return his own RegionScanner impl. If we put the seqNo maintenance and 
check logic in RegionScanner, every such user would have to worry about it. I 
feel this should be handled by HBase core code. What do you say?

{quote}
this isn't currently compatible with 0.94, since a new client wouldn't be able 
to scan an old server.
{quote}
Agree.. I can fix this
{quote}
let's rename cookie to callSequenceNumber 
{quote}
Already agreed.. :) 
{quote}
In the test, I think you should use HRegionInterface directly, so you don't 
have to actually generate an RPC timeout.
{quote}
I thought of it as an end-to-end fault-tolerance case. Yes, as you said, I can 
write the other one as well. What is your recommendation? Should I change it?
{quote}
 As is, I think it's also not guaranteed to trigger the issue unless you set 
scanner caching to 1, right? 
{quote}
Maybe in that case I can explicitly set caching=1 for this test case. I can do 
that.
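The guard being discussed (a per-scanner call sequence number that rejects retried next() calls instead of silently skipping a batch) can be sketched as follows. The class and field names are illustrative, not the actual patch code.

```java
import java.io.IOException;

// Sketch of the retry guard: each client next() call carries a call sequence
// number. A retried call whose batch the server already served arrives with a
// stale number and is rejected, so the client can reopen the scanner at the
// right position rather than losing the in-flight batch. Names are hypothetical.
class ScannerHolderSketch {
    private long nextCallSeq = 0;
    private int servedBatches = 0;

    int next(long callSeq) throws IOException {
        if (callSeq != nextCallSeq) {
            throw new IOException("Expected call seq " + nextCallSeq
                + " but got " + callSeq);
        }
        nextCallSeq++;
        return ++servedBatches; // stand-in for returning a batch of rows
    }
}
```

A retry of an already-served call now fails loudly instead of returning the next batch and skipping rows.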

 Scanner retry behavior with RPC timeout on next() seems incorrect
 -

 Key: HBASE-5974
 URL: https://issues.apache.org/jira/browse/HBASE-5974
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0
Reporter: Todd Lipcon
Priority: Critical
 Attachments: HBASE-5974_0.94.patch


 I'm seeing the following behavior:
 - set RPC timeout to a short value
 - call next() for some batch of rows, big enough so the client times out 
 before the result is returned
 - the HConnectionManager stuff will retry the next() call to the same server. 
 At this point, one of two things can happen: 1) the previous next() call will 
 still be processing, in which case you get a LeaseException, because it was 
 removed from the map during the processing, or 2) the next() call will 
 succeed but skip the prior batch of rows.





[jira] [Commented] (HBASE-6065) Log for flush would append a non-sequential edit in the hlog, may cause data loss

2012-05-21 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280717#comment-13280717
 ] 

ramkrishna.s.vasudevan commented on HBASE-6065:
---

@Chunhui
What type of data loss do you see here? Is it the edit with HBASE::CACHEFLUSH 
that gets missed?
Ideally, by design, that edit is needed to show up to what point the flush has 
been done, and it is added as an entry in the HLog.
Even while recovering we tend to skip this entry:
{code}
// Check this edit is for me. Also, guard against writing the special
// METACOLUMN info such as HBASE::CACHEFLUSH entries
if (kv.matchingFamily(HLog.METAFAMILY) ||
    !Bytes.equals(key.getEncodedRegionName(),
        this.regionInfo.getEncodedNameAsBytes())) {
  skippedEdits++;
  continue;
}
{code}
Did you find any other type of data loss that I am not able to foresee here? 
Correct me if I am wrong.

 Log for flush would append a non-sequential edit in the hlog, may cause data 
 loss
 -

 Key: HBASE-6065
 URL: https://issues.apache.org/jira/browse/HBASE-6065
 Project: HBase
  Issue Type: Bug
  Components: wal
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6065.patch


 After completing flush region, we will append a log edit in the hlog file 
 through HLog#completeCacheFlush.
 {code}
 public void completeCacheFlush(final byte [] encodedRegionName,
   final byte [] tableName, final long logSeqId, final boolean 
 isMetaRegion)
 {
 ...
 HLogKey key = makeKey(encodedRegionName, tableName, logSeqId,
 System.currentTimeMillis(), HConstants.DEFAULT_CLUSTER_ID);
 ...
 }
 {code}
 When we make the hlog key, we use the seqId from the parameter, which is 
 generated by HLog#startCacheFlush.
 Here, we may append an edit with a lower seq id than the last edit in the 
 hlog file. If it is the last edit in the file, it may cause data loss, 
 because:
 {code}
 HRegion#replayRecoveredEditsIfAny {
 ...
 maxSeqId = Math.abs(Long.parseLong(fileName));
 if (maxSeqId <= minSeqId) {
   String msg = "Maximum sequenceid for this log is " + maxSeqId
       + " and minimum sequenceid for the region is " + minSeqId
       + ", skipped the whole file, path=" + edits;
   LOG.debug(msg);
   continue;
 }
 ...
 }
 {code}
 We may skip the split log file, because we use the last edit's seq id as 
 its file name and consider that seqId the max seq id in the log file.





[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

2012-05-21 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280718#comment-13280718
 ] 

ramkrishna.s.vasudevan commented on HBASE-6059:
---

I think only major compaction could lead us to this problem, since it is what 
actually removes the delete markers.
In case of TTL expiry of all the entries in a store file, can we have this 
scenario of an empty StoreFile getting created on minor or major compaction? I 
think creating an empty store file should be fine. Let's take others' input on 
this too.

 Replaying recovered edits would make deleted data exist again
 -

 Key: HBASE-6059
 URL: https://issues.apache.org/jira/browse/HBASE-6059
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch


 When we replay recovered edits, we use the minSeqId of the Store, which may 
 cause deleted data to appear again.
 Let's see how it happens. Suppose the region with two families(cf1,cf2)
 1.put one data to the region (put r1,cf1:q1,v1)
 2.move the region from server A to server B.
 3.delete the data put by step 1(delete r1)
 4.flush this region.
 5.make major compaction for this region
 6.move the region from server B to server A.
 7.Abort server A
 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
 (When we replay recovered edits, we used the minSeqId of Store, because cf2 
 has no store files, so its seqId is 0, so the edit log of put data will be 
 replayed to the region)





[jira] [Commented] (HBASE-6033) Adding some fuction to check if a table/region is in compaction

2012-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280719#comment-13280719
 ] 

Hadoop QA commented on HBASE-6033:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528550/hbase-6033_v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified tests.

+1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 33 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestCompactionState
  org.apache.hadoop.hbase.replication.TestReplication
  
org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
  org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
  org.apache.hadoop.hbase.replication.TestMasterReplication

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1955//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1955//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1955//console

This message is automatically generated.

 Adding some fuction to check if a table/region is in compaction
 ---

 Key: HBASE-6033
 URL: https://issues.apache.org/jira/browse/HBASE-6033
 Project: HBase
  Issue Type: New Feature
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: hbase-6033_v2.patch, table_ui.png


 This feature will be helpful to find out if a major compaction is going on.
 We can show if it is in any minor compaction too.





[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect

2012-05-21 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280728#comment-13280728
 ] 

Anoop Sam John commented on HBASE-5974:
---

Thanks for the review Jieshan
{quote}
So what's your suggestion, Anoop? call CP hooks in the finally section?
{quote}
I mean that whenever we close the scanner we need to call the CP hooks. 
Currently, before this patch, we were not doing this when getting an NSRE:
{code}
catch (Throwable t) {
  if (t instanceof NotServingRegionException) {
this.scanners.remove(scannerName);
  }
  throw convertThrowableToIOE(cleanup(t));
}
{code}
Here we can see it is not calling the CP hooks. As of now, in the out-of-order 
cookie case I am also not calling the CP hooks.

{quote}
RegionScanner scanner = scanners.get(scannerIdString).s;
{quote}
Oh yes, thanks for pointing it out. I will fix it. This was not in the direct 
next() call flow; that is why I missed it. :(

 Scanner retry behavior with RPC timeout on next() seems incorrect
 -

 Key: HBASE-5974
 URL: https://issues.apache.org/jira/browse/HBASE-5974
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0
Reporter: Todd Lipcon
Priority: Critical
 Attachments: HBASE-5974_0.94.patch


 I'm seeing the following behavior:
 - set RPC timeout to a short value
 - call next() for some batch of rows, big enough so the client times out 
 before the result is returned
 - the HConnectionManager stuff will retry the next() call to the same server. 
 At this point, one of two things can happen: 1) the previous next() call will 
 still be processing, in which case you get a LeaseException, because it was 
 removed from the map during the processing, or 2) the next() call will 
 succeed but skip the prior batch of rows.





[jira] [Commented] (HBASE-6033) Adding some fuction to check if a table/region is in compaction

2012-05-21 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280729#comment-13280729
 ] 

Zhihong Yu commented on HBASE-6033:
---

@Jimmy:
Can you check why TestCompactionState failed ?

 Adding some fuction to check if a table/region is in compaction
 ---

 Key: HBASE-6033
 URL: https://issues.apache.org/jira/browse/HBASE-6033
 Project: HBase
  Issue Type: New Feature
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: hbase-6033_v2.patch, table_ui.png


 This feature will be helpful to find out if a major compaction is going on.
 We can show if it is in any minor compaction too.





[jira] [Commented] (HBASE-6065) Log for flush would append a non-sequential edit in the hlog, may cause data loss

2012-05-21 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280731#comment-13280731
 ] 

chunhui shen commented on HBASE-6065:
-

Suppose region A is on regionserver B.
The issue can be reproduced with the following steps:

1.put one data to region A (append seq 1 in the hlog)
2.put one data to region A (append seq 2 in the hlog)
3.region A starts a flush; it calls HLog#startCacheFlush (the current seq num 
is 3 in the hlog)
4.put one data to region A (append seq 4 in the hlog)
5.region A completes the flush; it calls HLog#completeCacheFlush (append seq 3 
in the hlog)
6.kill regionserver B.

So the hlog file has four edits:
seq 1
seq 2
seq 4
seq 3

When splitting this hlog file, we generate the recovered.edits file for region 
A, which is named 3. (About the name, see HLogSplitter#splitLogFileToTemp.)

Now, when replaying the recovered.edits file for region A, we will skip this 
file and cause data loss.
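The loss in these steps comes down to a single comparison at replay time. A minimal sketch, with the numbers mirroring the scenario (file named "3" but containing seq 4); SkipCheckSketch and its method are hypothetical names:

```java
// The recovered.edits file is named after the LAST edit's seq id (3), but it
// actually contains seq 4. If the region's stores were flushed through seq 3,
// the replay check treats the whole file as already applied and skips it.
class SkipCheckSketch {
    static boolean fileSkipped(String recoveredEditsFileName, long regionMinSeqId) {
        // mirrors the maxSeqId <= minSeqId check in replayRecoveredEditsIfAny
        long assumedMaxSeqId = Math.abs(Long.parseLong(recoveredEditsFileName));
        return assumedMaxSeqId <= regionMinSeqId;
    }
}
```

Naming the file after the true maximum (4) instead would make the check replay it.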





 Log for flush would append a non-sequential edit in the hlog, may cause data 
 loss
 -

 Key: HBASE-6065
 URL: https://issues.apache.org/jira/browse/HBASE-6065
 Project: HBase
  Issue Type: Bug
  Components: wal
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6065.patch


 After completing a region flush, we append a log edit to the hlog file 
 through HLog#completeCacheFlush:
 {code}
 public void completeCacheFlush(final byte[] encodedRegionName,
     final byte[] tableName, final long logSeqId, final boolean isMetaRegion)
 {
   ...
   HLogKey key = makeKey(encodedRegionName, tableName, logSeqId,
       System.currentTimeMillis(), HConstants.DEFAULT_CLUSTER_ID);
   ...
 }
 {code}
 When we make the hlog key, we use the seqId from the parameter, which was 
 generated by HLog#startCacheFlush.
 So we may append an edit with a lower seq id than the last edit already in 
 the hlog file. If it becomes the last edit in the file, it may cause data 
 loss, because:
 {code}
 HRegion#replayRecoveredEditsIfAny {
   ...
   maxSeqId = Math.abs(Long.parseLong(fileName));
   if (maxSeqId <= minSeqId) {
     String msg = "Maximum sequenceid for this log is " + maxSeqId
         + " and minimum sequenceid for the region is " + minSeqId
         + ", skipped the whole file, path=" + edits;
     LOG.debug(msg);
     continue;
   }
   ...
 }
 {code}
 We may skip the whole split log file, because we use the last edit's seq id 
 as its file name and treat that seqId as the max seq id in the log file.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6065) Log for flush would append a non-sequential edit in the hlog, may cause data loss

2012-05-21 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280734#comment-13280734
 ] 

chunhui shen commented on HBASE-6065:
-

I have tried to write a test, but it is a little hard.

We could also fix the issue with another solution (patch v2):
In the current logic, we treat the last edit's seq id as the maximal seq id in 
the recovered.edits file, but that is wrong because we cannot guarantee that 
edits in the hlog are sequential.
So we should change the logic that finds the maximal seq id for the 
recovered.edits file; only a small change to 
HLogSplitter#updateRegionMaximumEditLogSeqNum is needed.
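A sketch of what that change amounts to, in illustrative Java (not the actual patch): derive the file's maximum seq id by scanning every edit instead of trusting the last one.

```java
// Illustrative sketch of the patch-v2 idea; not the real
// HLogSplitter#updateRegionMaximumEditLogSeqNum implementation.
public class MaxEditSeqNum {

    // Track the maximum seq id seen across all edits written for a region,
    // rather than assuming the last edit carries the maximum.
    static long maxSeqId(long[] editSeqIds) {
        long max = Long.MIN_VALUE;
        for (long id : editSeqIds) {
            if (id > max) {
                max = id;
            }
        }
        return max;
    }

    public static void main(String[] args) {
        // The hlog from the reproduction steps: 1, 2, 4, then the flush edit 3.
        // Naming the recovered.edits file after this maximum (4, not 3) keeps
        // edit 4 from being skipped on replay.
        System.out.println(maxSeqId(new long[] {1, 2, 4, 3})); // prints "4"
    }
}
```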

 Log for flush would append a non-sequential edit in the hlog, may cause data 
 loss
 -

 Key: HBASE-6065
 URL: https://issues.apache.org/jira/browse/HBASE-6065
 Project: HBase
  Issue Type: Bug
  Components: wal
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6065.patch, HBASE-6065v2.patch







[jira] [Updated] (HBASE-6065) Log for flush would append a non-sequential edit in the hlog, may cause data loss

2012-05-21 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-6065:


Attachment: HBASE-6065v2.patch

 Log for flush would append a non-sequential edit in the hlog, may cause data 
 loss
 -

 Key: HBASE-6065
 URL: https://issues.apache.org/jira/browse/HBASE-6065
 Project: HBase
  Issue Type: Bug
  Components: wal
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6065.patch, HBASE-6065v2.patch






