[jira] [Resolved] (HBASE-10577) Remove unnecessary looping in FSHLog
[ https://issues.apache.org/jira/browse/HBASE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-10577. --- Resolution: Won't Fix (You are right [~busbey]) > Remove unnecessary looping in FSHLog > > > Key: HBASE-10577 > URL: https://issues.apache.org/jira/browse/HBASE-10577 > Project: HBase > Issue Type: Bug > Components: wal >Affects Versions: 0.99.0 >Reporter: Himanshu Vashishtha > > In the new disruptor-based FSHLog, the Syncer threads are handed a batch of > SyncFuture objects from the RingBufferHandler. The Syncer then invokes a sync > call on the current writer instance. > This handing off of batches is done serially in RingBufferHandler, that is, > every syncer receives a non-overlapping batch of SyncFutures. Once synced, > the Syncer thread updates highestSyncedSequence. > In the run method of Syncer, we have: > {code} > long currentHighestSyncedSequence = highestSyncedSequence.get(); > if (currentSequence < currentHighestSyncedSequence) { > syncCount += releaseSyncFuture(takeSyncFuture, > currentHighestSyncedSequence, null); > // Done with the 'take'. Go around again and do a new 'take'. > continue; > } > {code} > I find this logic of polling the BlockingQueue again in this condition > unnecessary. When currentHighestSyncedSequence is already greater than > currentSequence, doesn't that mean some other Syncer has already synced > the SyncFutures of these ops? In that case we should just go ahead and release all the > SyncFutures for this batch to unblock the handlers. That would avoid polling > the BlockingQueue for each SyncFuture object in this case. -- This message was sent by Atlassian JIRA (v6.2#6252)
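The optimization the reporter describes (ultimately resolved Won't Fix, per the message above) can be sketched with a small, self-contained model. Everything here (the class and method names, the Deque standing in for the queue of SyncFutures) is hypothetical, not the actual FSHLog code:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Simplified model of the idea above: sequences are handed out in
// increasing order, so once highestSyncedSequence covers one SyncFuture's
// sequence, every earlier future in the batch is covered too and the whole
// prefix can be released in a single pass instead of one 'take' at a time.
class SyncBatchModel {
    // Returns how many pending futures were released.
    static int releaseCovered(Deque<Long> pendingSequences, long highestSynced) {
        int released = 0;
        while (!pendingSequences.isEmpty()
                && pendingSequences.peekFirst() <= highestSynced) {
            pendingSequences.pollFirst(); // release this SyncFuture
            released++;
        }
        return released;
    }
}
```

In this simplified model, one call releases every future already covered by the synced sequence, instead of going around the take-loop once per future.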
[jira] [Updated] (HBASE-11384) [Visibility Controller]Check for users covering authorizations for every mutation
[ https://issues.apache.org/jira/browse/HBASE-11384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-11384: --- Status: Patch Available (was: Open) Will commit tomorrow unless there are objections and QA comes back clean. > [Visibility Controller]Check for users covering authorizations for every > mutation > - > > Key: HBASE-11384 > URL: https://issues.apache.org/jira/browse/HBASE-11384 > Project: HBase > Issue Type: Sub-task >Affects Versions: 0.98.3 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan > Fix For: 0.99.0, 0.98.5 > > Attachments: HBASE-11384.patch, HBASE-11384_1.patch, > HBASE-11384_2.patch, HBASE-11384_3.patch, HBASE-11384_4.patch, > HBASE-11384_6.patch, HBASE-11384_7.patch, HBASE-11384_8.patch > > > As part of the discussions, every mutation (Put or Delete) > with visibility expressions should validate that the user is authorized for every label in the > expression. If not, fail the mutation. > Suppose User A is associated with A, B and C, and the Put has a visibility > expression A&D. Then fail the mutation, as D is not associated with User A. -- This message was sent by Atlassian JIRA (v6.2#6252)
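The check being added here can be sketched as follows; the class name, the crude tokenizer, and the method are hypothetical stand-ins, not the actual VisibilityController patch:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hedged sketch of the check described above: every label referenced by a
// mutation's visibility expression must be among the user's authorized
// labels, otherwise the mutation is rejected.
class VisibilityAuthCheckSketch {
    static boolean userCoversExpression(String expression, Set<String> userAuths) {
        // Crude tokenization on the &, |, ! operators and parentheses; the
        // real expression parser also handles quoting and precedence.
        for (String label : expression.split("[&|!()]+")) {
            if (!label.isEmpty() && !userAuths.contains(label)) {
                return false; // e.g. A&D fails for a user holding only A, B, C
            }
        }
        return true;
    }
}
```

With userAuths = {A, B, C}, the expression A&D would fail this check while (A|B)&C would pass, matching the example in the issue description.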
[jira] [Commented] (HBASE-11625) Reading datablock throws "Invalid HFile block magic" and can not switch to hdfs checksum
[ https://issues.apache.org/jira/browse/HBASE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080563#comment-14080563 ] ramkrishna.s.vasudevan commented on HBASE-11625: Can you paste the stack trace here? Are you observing this in 0.94 or in 0.98? > Reading datablock throws "Invalid HFile block magic" and can not switch to > hdfs checksum > - > > Key: HBASE-11625 > URL: https://issues.apache.org/jira/browse/HBASE-11625 > Project: HBase > Issue Type: Bug > Components: HFile >Affects Versions: 0.94.21, 0.98.4 >Reporter: qian wang > > When using HBase checksums, readBlockDataInternal() in HFileBlock.java may > encounter file corruption, but it can only switch to the HDFS checksum > input stream once validateBlockChecksum() runs. If the data block's header is corrupted > when b = new HFileBlock() is constructed, it throws "Invalid HFile block magic" > before that fallback can happen, and the RPC call fails. -- This message was sent by Atlassian JIRA (v6.2#6252)
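The fallback the reporter is asking for can be modeled in a few lines. The shapes below are assumptions for illustration (the real read path lives in HFileBlock.readBlockDataInternal() and is considerably more involved):

```java
import java.util.function.Supplier;

// Self-contained model of the missing fallback: a corrupted block header
// makes the HBase-checksum read path fail before validateBlockChecksum()
// can run, so a fix would have to catch that failure and retry the read
// using the HDFS checksum path instead of failing the RPC outright.
class ChecksumFallbackModel {
    static String readBlock(Supplier<String> hbaseChecksumRead,
                            Supplier<String> hdfsChecksumRead) {
        try {
            return hbaseChecksumRead.get();
        } catch (IllegalStateException corruptHeader) {
            // e.g. "Invalid HFile block magic": retry via HDFS checksums.
            return hdfsChecksumRead.get();
        }
    }
}
```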
[jira] [Updated] (HBASE-11384) [Visibility Controller]Check for users covering authorizations for every mutation
[ https://issues.apache.org/jira/browse/HBASE-11384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-11384: --- Attachment: HBASE-11384_8.patch Latest patch with comments addressed. > [Visibility Controller]Check for users covering authorizations for every > mutation > - > > Key: HBASE-11384 > URL: https://issues.apache.org/jira/browse/HBASE-11384 > Project: HBase > Issue Type: Sub-task >Affects Versions: 0.98.3 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan > Fix For: 0.99.0, 0.98.5 > > Attachments: HBASE-11384.patch, HBASE-11384_1.patch, > HBASE-11384_2.patch, HBASE-11384_3.patch, HBASE-11384_4.patch, > HBASE-11384_6.patch, HBASE-11384_7.patch, HBASE-11384_8.patch > > > As part of the discussions, every mutation (Put or Delete) > with visibility expressions should validate that the user is authorized for every label in the > expression. If not, fail the mutation. > Suppose User A is associated with A, B and C, and the Put has a visibility > expression A&D. Then fail the mutation, as D is not associated with User A. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11438) [Visibility Controller] Support UTF8 character as Visibility Labels
[ https://issues.apache.org/jira/browse/HBASE-11438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-11438: --- Status: Open (was: Patch Available) > [Visibility Controller] Support UTF8 character as Visibility Labels > --- > > Key: HBASE-11438 > URL: https://issues.apache.org/jira/browse/HBASE-11438 > Project: HBase > Issue Type: Improvement > Components: security >Affects Versions: 0.98.4 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan > Fix For: 0.98.5 > > Attachments: HBASE-11438_v1.patch, HBASE-11438_v2.patch, > HBASE-11438_v3.patch, HBASE-11438_v4.patch, HBASE-11438_v5.patch, > HBASE-11438_v6.patch > > > This is an action item we will be addressing so that > visibility labels can contain UTF8 characters. Also allow the user to > use a client-supplied API that allows specifying the visibility labels inside > double quotes, so that UTF8 characters, and characters like &, |, ! and double > quotes themselves, can be specified with proper escape sequences. Accumulo > provides a similar API on the client side. -- This message was sent by Atlassian JIRA (v6.2#6252)
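The quoting behavior described above can be illustrated with a tiny helper; the method is a hypothetical sketch, not the API this issue actually adds:

```java
// Hypothetical client-side helper: wrap a label in double quotes and
// escape embedded backslashes and quotes, so that UTF8 characters and
// operator characters like &, |, ! can appear literally inside a
// visibility expression.
class LabelQuotingSketch {
    static String quote(String label) {
        return "\"" + label.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```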
[jira] [Updated] (HBASE-11438) [Visibility Controller] Support UTF8 character as Visibility Labels
[ https://issues.apache.org/jira/browse/HBASE-11438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-11438: --- Status: Patch Available (was: Open) > [Visibility Controller] Support UTF8 character as Visibility Labels > --- > > Key: HBASE-11438 > URL: https://issues.apache.org/jira/browse/HBASE-11438 > Project: HBase > Issue Type: Improvement > Components: security >Affects Versions: 0.98.4 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan > Fix For: 0.98.5 > > Attachments: HBASE-11438_v1.patch, HBASE-11438_v2.patch, > HBASE-11438_v3.patch, HBASE-11438_v4.patch, HBASE-11438_v5.patch, > HBASE-11438_v6.patch > > > This is an action item we will be addressing so that > visibility labels can contain UTF8 characters. Also allow the user to > use a client-supplied API that allows specifying the visibility labels inside > double quotes, so that UTF8 characters, and characters like &, |, ! and double > quotes themselves, can be specified with proper escape sequences. Accumulo > provides a similar API on the client side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11438) [Visibility Controller] Support UTF8 character as Visibility Labels
[ https://issues.apache.org/jira/browse/HBASE-11438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-11438: --- Attachment: HBASE-11438_v6.patch Trying for QA > [Visibility Controller] Support UTF8 character as Visibility Labels > --- > > Key: HBASE-11438 > URL: https://issues.apache.org/jira/browse/HBASE-11438 > Project: HBase > Issue Type: Improvement > Components: security >Affects Versions: 0.98.4 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan > Fix For: 0.98.5 > > Attachments: HBASE-11438_v1.patch, HBASE-11438_v2.patch, > HBASE-11438_v3.patch, HBASE-11438_v4.patch, HBASE-11438_v5.patch, > HBASE-11438_v6.patch > > > This is an action item we will be addressing so that > visibility labels can contain UTF8 characters. Also allow the user to > use a client-supplied API that allows specifying the visibility labels inside > double quotes, so that UTF8 characters, and characters like &, |, ! and double > quotes themselves, can be specified with proper escape sequences. Accumulo > provides a similar API on the client side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11384) [Visibility Controller]Check for users covering authorizations for every mutation
[ https://issues.apache.org/jira/browse/HBASE-11384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-11384: --- Status: Open (was: Patch Available) > [Visibility Controller]Check for users covering authorizations for every > mutation > - > > Key: HBASE-11384 > URL: https://issues.apache.org/jira/browse/HBASE-11384 > Project: HBase > Issue Type: Sub-task >Affects Versions: 0.98.3 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan > Fix For: 0.99.0, 0.98.5 > > Attachments: HBASE-11384.patch, HBASE-11384_1.patch, > HBASE-11384_2.patch, HBASE-11384_3.patch, HBASE-11384_4.patch, > HBASE-11384_6.patch, HBASE-11384_7.patch > > > As part of the discussions, every mutation (Put or Delete) > with visibility expressions should validate that the user is authorized for every label in the > expression. If not, fail the mutation. > Suppose User A is associated with A, B and C, and the Put has a visibility > expression A&D. Then fail the mutation, as D is not associated with User A. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HBASE-11625) Reading datablock throws "Invalid HFile block magic" and can not switch to hdfs checksum
qian wang created HBASE-11625: - Summary: Reading datablock throws "Invalid HFile block magic" and can not switch to hdfs checksum Key: HBASE-11625 URL: https://issues.apache.org/jira/browse/HBASE-11625 Project: HBase Issue Type: Bug Components: HFile Affects Versions: 0.98.4, 0.94.21 Reporter: qian wang When using HBase checksums, readBlockDataInternal() in HFileBlock.java may encounter file corruption, but it can only switch to the HDFS checksum input stream once validateBlockChecksum() runs. If the data block's header is corrupted when b = new HFileBlock() is constructed, it throws "Invalid HFile block magic" before that fallback can happen, and the RPC call fails. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11621) Make MiniDFSCluster run faster
[ https://issues.apache.org/jira/browse/HBASE-11621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080550#comment-14080550 ] Hadoop QA commented on HBASE-11621: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658836/11621-0.98.txt against trunk revision . ATTACHMENT ID: 12658836 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.TestIOFencing org.apache.hadoop.hbase.master.TestRestartCluster org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction org.apache.hadoop.hbase.regionserver.TestRegionReplicas org.apache.hadoop.hbase.client.TestReplicasClient Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10239//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10239//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10239//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10239//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10239//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10239//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10239//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10239//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10239//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10239//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10239//console This message is automatically generated. 
> Make MiniDFSCluster run faster > -- > > Key: HBASE-11621 > URL: https://issues.apache.org/jira/browse/HBASE-11621 > Project: HBase > Issue Type: Task >Reporter: Ted Yu >Assignee: Ted Yu > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: 11621-0.98.txt, 11621-v1.txt > > > Daryn proposed the following change in HDFS-6773: > {code} > EditLogFileOutputStream.setShouldSkipFsyncForTesting(true); > {code} > With this change in HBaseTestingUtility#startMiniDFSCluster(), runtime for > TestAdmin went from 8:35 min to 7 min -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file
[ https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080549#comment-14080549 ] ramkrishna.s.vasudevan commented on HBASE-11591: {code} Error: java.lang.NullPointerException at org.apache.hadoop.hbase.mapreduce.LabelExpander.getLabelOrdinals(LabelExpander.java:129) at org.apache.hadoop.hbase.mapreduce.LabelExpander.getLabelOrdinals(LabelExpander.java:145) at org.apache.hadoop.hbase.mapreduce.LabelExpander.createVisibilityTags(LabelExpander.java:105) at org.apache.hadoop.hbase.mapreduce.LabelExpander.createKVFromCellVisibilityExpr(LabelExpander.java:217) at org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.createPuts(TsvImporterMapper.java:195) at org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.map(TsvImporterMapper.java:153) at org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.map(TsvImporterMapper.java:39) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) {code} > Scanner fails to retrieve KV from bulk loaded file with highest sequence id > than the cell's mvcc in a non-bulk loaded file > --- > > Key: HBASE-11591 > URL: https://issues.apache.org/jira/browse/HBASE-11591 > Project: HBase > Issue Type: Bug >Affects Versions: 0.99.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 0.99.0 > > Attachments: HBASE-11591.patch, TestBulkload.java > > > See discussion in HBASE-11339. > Consider the case where the same KVs are present in two files, one produced by > flush/compaction and the other through bulk load. 
> Both files contain some identical KVs, matching even in timestamp. > Steps: > Add some rows with a specific timestamp and flush. > Bulk load a file with the same data. Ensure that the "assign seqnum" property is > set. > The bulk load should use HFileOutputFormat2 (or ensure that we write the > bulk_time_output key). > This ensures that the bulk loaded file has the highest seq num. > Assume the cell in the flushed/compacted store file is > row1,cf,cq,ts1,value1 and the cell in the bulk loaded file is > row1,cf,cq,ts1,value2 > (There are no parallel scans). > Issue a scan on the table in 0.96. The retrieved value is > row1,cf,cq,ts1,value2 > But the same scan in 0.98 will retrieve row1,cf,cq,ts1,value1. > This is a behaviour change, caused by this code > {code} > public int compare(KeyValueScanner left, KeyValueScanner right) { > int comparison = compare(left.peek(), right.peek()); > if (comparison != 0) { > return comparison; > } else { > // Since both the keys are exactly the same, we break the tie in favor > // of the key which came latest. > long leftSequenceID = left.getSequenceID(); > long rightSequenceID = right.getSequenceID(); > if (leftSequenceID > rightSequenceID) { > return -1; > } else if (leftSequenceID < rightSequenceID) { > return 1; > } else { > return 0; > } > } > } > {code} > In the 0.96 case the mvcc of the cell in both files is 0, so > the comparison falls through to the else condition, where the seq id of the > bulk loaded file is greater and sorts first, ensuring that the scan > reads from the bulk loaded file. > In 0.98+, as we retain the mvcc+seqid, we do not zero out the > mvcc (it remains a non-zero positive value). Hence compare() sorts > the cell in the flushed/compacted file first. Which means that though we know the > latest file is the bulk loaded file, we don't scan its data. > Seems to be a behaviour change. 
Will also check other corner cases, but we > are trying to understand the behaviour of bulk load because we are evaluating whether it > can be used for the MOB design. -- This message was sent by Atlassian JIRA (v6.2#6252)
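The two behaviours described above can be reproduced with a self-contained model of the comparison; the names and signature are hypothetical stand-ins, not the real KeyValueScanner code:

```java
// Model of the tie-break above: a cell with a higher mvcc sorts first, and
// only on an exact tie does the scanner's sequence id break the tie. With
// mvcc zeroed (0.96) the bulk-loaded file's higher seq id wins; with mvcc
// retained (0.98+) the flushed file's non-zero mvcc wins first.
class ScannerOrderModel {
    // Negative return value: the left scanner's cell is served first.
    static int compare(long leftMvcc, long leftSeqId,
                       long rightMvcc, long rightSeqId) {
        int comparison = Long.compare(rightMvcc, leftMvcc); // newer mvcc first
        if (comparison != 0) {
            return comparison;
        }
        return Long.compare(rightSeqId, leftSeqId); // higher seq id first
    }
}
```

With both mvccs zeroed the bulk-loaded scanner (higher seq id) sorts first; once the flushed file's cell keeps a non-zero mvcc, it sorts first regardless of seq ids, which is the behaviour change the issue describes.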
[jira] [Commented] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file
[ https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080547#comment-14080547 ] ramkrishna.s.vasudevan commented on HBASE-11591: Not sure about the other test failures, but the newly added test TestScannerWithBulkLoad fails here {code} protected void checkScanOrder(Cell prevKV, Cell kv, KeyValue.KVComparator comparator) throws IOException { // Check that the heap gives us KVs in an increasing order. assert prevKV == null || comparator == null || comparator.compare(prevKV, kv) <= 0 : "Key " + prevKV + " followed by a " + "smaller key " + kv + " in cf " + store; } {code} So can we remove that assertion? This change is becoming trickier. > Scanner fails to retrieve KV from bulk loaded file with highest sequence id > than the cell's mvcc in a non-bulk loaded file > --- > > Key: HBASE-11591 > URL: https://issues.apache.org/jira/browse/HBASE-11591 > Project: HBase > Issue Type: Bug >Affects Versions: 0.99.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 0.99.0 > > Attachments: HBASE-11591.patch, TestBulkload.java > > > See discussion in HBASE-11339. > Consider the case where the same KVs are present in two files, one produced by > flush/compaction and the other through bulk load. > Both files contain some identical KVs, matching even in timestamp. > Steps: > Add some rows with a specific timestamp and flush. > Bulk load a file with the same data. Ensure that the "assign seqnum" property is > set. > The bulk load should use HFileOutputFormat2 (or ensure that we write the > bulk_time_output key). > This ensures that the bulk loaded file has the highest seq num. > Assume the cell in the flushed/compacted store file is > row1,cf,cq,ts1,value1 and the cell in the bulk loaded file is > row1,cf,cq,ts1,value2 > (There are no parallel scans). > Issue a scan on the table in 0.96. 
The retrieved value is > row1,cf,cq,ts1,value2 > But the same scan in 0.98 will retrieve row1,cf,cq,ts1,value1. > This is a behaviour change, caused by this code > {code} > public int compare(KeyValueScanner left, KeyValueScanner right) { > int comparison = compare(left.peek(), right.peek()); > if (comparison != 0) { > return comparison; > } else { > // Since both the keys are exactly the same, we break the tie in favor > // of the key which came latest. > long leftSequenceID = left.getSequenceID(); > long rightSequenceID = right.getSequenceID(); > if (leftSequenceID > rightSequenceID) { > return -1; > } else if (leftSequenceID < rightSequenceID) { > return 1; > } else { > return 0; > } > } > } > {code} > In the 0.96 case the mvcc of the cell in both files is 0, so > the comparison falls through to the else condition, where the seq id of the > bulk loaded file is greater and sorts first, ensuring that the scan > reads from the bulk loaded file. > In 0.98+, as we retain the mvcc+seqid, we do not zero out the > mvcc (it remains a non-zero positive value). Hence compare() sorts > the cell in the flushed/compacted file first. Which means that though we know the > latest file is the bulk loaded file, we don't scan its data. > Seems to be a behaviour change. Will also check other corner cases, but we > are trying to understand the behaviour of bulk load because we are evaluating whether it > can be used for the MOB design. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file
[ https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080548#comment-14080548 ] ramkrishna.s.vasudevan commented on HBASE-11591: The other test failures seem to be due to env issues. > Scanner fails to retrieve KV from bulk loaded file with highest sequence id > than the cell's mvcc in a non-bulk loaded file > --- > > Key: HBASE-11591 > URL: https://issues.apache.org/jira/browse/HBASE-11591 > Project: HBase > Issue Type: Bug >Affects Versions: 0.99.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 0.99.0 > > Attachments: HBASE-11591.patch, TestBulkload.java > > > See discussion in HBASE-11339. > Consider the case where the same KVs are present in two files, one produced by > flush/compaction and the other through bulk load. > Both files contain some identical KVs, matching even in timestamp. > Steps: > Add some rows with a specific timestamp and flush. > Bulk load a file with the same data. Ensure that the "assign seqnum" property is > set. > The bulk load should use HFileOutputFormat2 (or ensure that we write the > bulk_time_output key). > This ensures that the bulk loaded file has the highest seq num. > Assume the cell in the flushed/compacted store file is > row1,cf,cq,ts1,value1 and the cell in the bulk loaded file is > row1,cf,cq,ts1,value2 > (There are no parallel scans). > Issue a scan on the table in 0.96. The retrieved value is > row1,cf,cq,ts1,value2 > But the same scan in 0.98 will retrieve row1,cf,cq,ts1,value1. > This is a behaviour change, caused by this code > {code} > public int compare(KeyValueScanner left, KeyValueScanner right) { > int comparison = compare(left.peek(), right.peek()); > if (comparison != 0) { > return comparison; > } else { > // Since both the keys are exactly the same, we break the tie in favor > // of the key which came latest. 
> long leftSequenceID = left.getSequenceID(); > long rightSequenceID = right.getSequenceID(); > if (leftSequenceID > rightSequenceID) { > return -1; > } else if (leftSequenceID < rightSequenceID) { > return 1; > } else { > return 0; > } > } > } > {code} > In the 0.96 case the mvcc of the cell in both files is 0, so > the comparison falls through to the else condition, where the seq id of the > bulk loaded file is greater and sorts first, ensuring that the scan > reads from the bulk loaded file. > In 0.98+, as we retain the mvcc+seqid, we do not zero out the > mvcc (it remains a non-zero positive value). Hence compare() sorts > the cell in the flushed/compacted file first. Which means that though we know the > latest file is the bulk loaded file, we don't scan its data. > Seems to be a behaviour change. Will also check other corner cases, but we > are trying to understand the behaviour of bulk load because we are evaluating whether it > can be used for the MOB design. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HBASE-11624) TestScannerResource hangs in trunk
Ted Yu created HBASE-11624: -- Summary: TestScannerResource hangs in trunk Key: HBASE-11624 URL: https://issues.apache.org/jira/browse/HBASE-11624 Project: HBase Issue Type: Test Reporter: Ted Yu Priority: Minor I checked console log for the recent trunk builds - I couldn't find TestScannerResource. I got the following stack trace when running the test locally: {code} "pool-1-thread-1" prio=10 tid=0x7f7d8c787000 nid=0x3803 runnable [0x7f7d63783000] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:150) at java.net.SocketInputStream.read(SocketInputStream.java:121) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read(BufferedInputStream.java:254) - locked <0x0007fb156098> (a java.io.BufferedInputStream) at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78) at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106) at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1116) at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1413) at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1973) at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1735) at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098) at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398) at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323) at org.apache.hadoop.hbase.rest.client.Client.executeURI(Client.java:191) at org.apache.hadoop.hbase.rest.client.Client.executePathOnly(Client.java:163) 
at org.apache.hadoop.hbase.rest.client.Client.execute(Client.java:214) at org.apache.hadoop.hbase.rest.client.Client.put(Client.java:402) at org.apache.hadoop.hbase.rest.client.Client.put(Client.java:370) at org.apache.hadoop.hbase.rest.client.Client.put(Client.java:354) at org.apache.hadoop.hbase.rest.TestScannerResource.testSimpleScannerXML(TestScannerResource.java:197) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HBASE-11623) mutateRowsWithLocks might require updatesLock.readLock with waitTime=0
[ https://issues.apache.org/jira/browse/HBASE-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080522#comment-14080522 ] Lars Hofhansl edited comment on HBASE-11623 at 7/31/14 5:34 AM: Good find. Or just {{acquiredRowLocks.size() + 1}}? +1 for 0.94 as well. was (Author: lhofhansl): Good fine. Or just {{acquiredRowLocks.size() + 1}}? +1 for 0.94 as well. > mutateRowsWithLocks might require updatesLock.readLock with waitTime=0 > -- > > Key: HBASE-11623 > URL: https://issues.apache.org/jira/browse/HBASE-11623 > Project: HBase > Issue Type: Improvement > Components: regionserver >Affects Versions: 0.96.1.1, 0.94.21, 0.98.4 >Reporter: cuijianwei >Priority: Minor > Attachments: HBASE-11623-trunk-v1.patch > > > mutateRowsWithLocks acquires updatesLock.readLock via the following code: > {code} > ... > lock(this.updatesLock.readLock(), acquiredRowLocks.size()); > ... > {code} > However, acquiredRowLocks might be empty, in which case the waitTime of > HRegion.lock(...) will be set to 0, making mutateRowsWithLocks fail > if it cannot acquire updatesLock.readLock immediately. > In our environment, we implemented a region coprocessor which needs to hold row > locks before invoking mutateRowsWithLocks. The rowsToLock (passed to > mutateRowsWithLocks) is then an empty set, and we get the following exception > occasionally: > {code} > org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 0ms > > > 582 at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:6191) > 583 at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5126) > 584 at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5034) > ... > {code} > Is it reasonable to use the default waitTime when rowsToLock is empty? (as in > the following code) > {code} > lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : > acquiredRowLocks.size()); > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
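The proposed fix can be isolated into a one-line helper for illustration (a sketch of the patch's shape; Lars's {{acquiredRowLocks.size() + 1}} would be an equally valid alternative):

```java
// Sketch of the proposed fix, simplified: when no row locks were acquired,
// fall back to a count of 1 so the updates read lock is never requested
// with waitTime=0.
class LockWaitTimeSketch {
    static int effectiveWaitCount(int acquiredRowLockCount) {
        return acquiredRowLockCount == 0 ? 1 : acquiredRowLockCount;
    }
}
```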
[jira] [Commented] (HBASE-11623) mutateRowsWithLocks might require updatesLock.readLock with waitTime=0
[ https://issues.apache.org/jira/browse/HBASE-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080522#comment-14080522 ] Lars Hofhansl commented on HBASE-11623: --- Good find. Or just {{acquiredRowLocks.size() + 1}}? +1 for 0.94 as well. > mutateRowsWithLocks might require updatesLock.readLock with waitTime=0 > -- > > Key: HBASE-11623 > URL: https://issues.apache.org/jira/browse/HBASE-11623 > Project: HBase > Issue Type: Improvement > Components: regionserver >Affects Versions: 0.96.1.1, 0.94.21, 0.98.4 >Reporter: cuijianwei >Priority: Minor > Attachments: HBASE-11623-trunk-v1.patch > > > mutateRowsWithLocks acquires updatesLock.readLock via the following code: > {code} > ... > lock(this.updatesLock.readLock(), acquiredRowLocks.size()); > ... > {code} > However, acquiredRowLocks might be empty, in which case the waitTime of > HRegion.lock(...) will be set to 0, making mutateRowsWithLocks fail > if it cannot acquire updatesLock.readLock immediately. > In our environment, we implemented a region coprocessor which needs to hold row > locks before invoking mutateRowsWithLocks. The rowsToLock (passed to > mutateRowsWithLocks) is then an empty set, and we get the following exception > occasionally: > {code} > org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 0ms > > > 582 at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:6191) > 583 at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5126) > 584 at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5034) > ... > {code} > Is it reasonable to use the default waitTime when rowsToLock is empty? (as in > the following code) > {code} > lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : > acquiredRowLocks.size()); > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080520#comment-14080520 ] Hudson commented on HBASE-11558: FAILURE: Integrated in HBase-1.0 #76 (See [https://builds.apache.org/job/HBase-1.0/76/]) HBASE-11558 Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ (Ishan Chhabra) (ndimiduk: rev 2af67c298645361b86a424362e705c2501e0c1eb) * hbase-server/src/test/java/org/apache/hadoop/hbase/protobuf/TestProtobufUtil.java * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java * hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * hbase-protocol/src/main/protobuf/Client.proto > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch, HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. 
This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
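Option 2 above amounts to carrying the caching value in the job configuration rather than in the serialized Scan. The sketch below shows the shape of that workaround; `java.util.Properties` stands in for Hadoop's `Configuration` so the example is self-contained, and the default of 100 is illustrative. The key name `hbase.client.scanner.caching` is the HBase client configuration key for scanner caching.

```java
import java.util.Properties;

// Illustrative sketch of option 2: set scanner caching on the job
// configuration explicitly, since ProtobufUtil.toScan no longer carries it.
// Properties is a stand-in for org.apache.hadoop.conf.Configuration here.
public class ScannerCachingSketch {
    static final String KEY = "hbase.client.scanner.caching";
    static final int DEFAULT_CACHING = 100; // illustrative default

    public static void setScannerCaching(Properties conf, int rows) {
        conf.setProperty(KEY, Integer.toString(rows));
    }

    public static int getScannerCaching(Properties conf) {
        return Integer.parseInt(conf.getProperty(KEY, Integer.toString(DEFAULT_CACHING)));
    }

    public static void main(String[] args) {
        Properties jobConf = new Properties();
        setScannerCaching(jobConf, 500); // what the client must now call explicitly
        System.out.println(getScannerCaching(jobConf)); // 500
    }
}
```

The committed fix took option 1 instead (the patch touches Client.proto and ProtobufUtil), restoring the old behavior for application code.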
[jira] [Commented] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
[ https://issues.apache.org/jira/browse/HBASE-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080521#comment-14080521 ] Hudson commented on HBASE-11615: FAILURE: Integrated in HBase-1.0 #76 (See [https://builds.apache.org/job/HBase-1.0/76/]) HBASE-11615 TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins (jxiang: rev be816b18a44807bfc5f11a3b3f5792da1f8544eb) * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerOnCluster.java > TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins > --- > > Key: HBASE-11615 > URL: https://issues.apache.org/jira/browse/HBASE-11615 > Project: HBase > Issue Type: Test > Components: master >Reporter: Mike Drob >Assignee: Jimmy Xiang > Fix For: 1.0.0, 2.0.0 > > Attachments: hbase-11615.patch > > > Failed on branch-1. > Example Failure: > https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11623) mutateRowsWithLocks might require updatesLock.readLock with waitTime=0
[ https://issues.apache.org/jira/browse/HBASE-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080515#comment-14080515 ] Ted Yu commented on HBASE-11623: lgtm > mutateRowsWithLocks might require updatesLock.readLock with waitTime=0 > -- > > Key: HBASE-11623 > URL: https://issues.apache.org/jira/browse/HBASE-11623 > Project: HBase > Issue Type: Improvement > Components: regionserver >Affects Versions: 0.96.1.1, 0.94.21, 0.98.4 >Reporter: cuijianwei >Priority: Minor > Attachments: HBASE-11623-trunk-v1.patch > > > mutateRowsWithLocks will acquire updatesLock.readLock by the following code: > {code} > ... > lock(this.updatesLock.readLock(), acquiredRowLocks.size()); > ... > {code} > However, acquiredRowLocks might be empty, and then the waitTime of > HRegion.lock(...) will be set to 0, which will make mutateRowsWithLocks fail > if can not acquire updatesLock.readLock immediately. > In our environment, we implement a region coprocessor which need to hold row > locks before invoke mutateRowsWithLocks. Then, the rowsToLock(passed to > mutateRowsWithLocks) will be an empty set, and we get the following exception > occasionally: > {code} > org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 0ms > > > 582 at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:6191) > 583 at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5126) > 584 at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5034) > ... > {code} > Is it reasonable that we use default waitTime when rowsToLock is empty? (as > the following code) > {code} > lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : > acquiredRowLocks.size()); > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11623) mutateRowsWithLocks might require updatesLock.readLock with waitTime=0
[ https://issues.apache.org/jira/browse/HBASE-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080510#comment-14080510 ] cuijianwei commented on HBASE-11623: [~tedyu], thanks for your comment, I added patches for trunk. > mutateRowsWithLocks might require updatesLock.readLock with waitTime=0 > -- > > Key: HBASE-11623 > URL: https://issues.apache.org/jira/browse/HBASE-11623 > Project: HBase > Issue Type: Improvement > Components: regionserver >Affects Versions: 0.96.1.1, 0.94.21, 0.98.4 >Reporter: cuijianwei >Priority: Minor > Attachments: HBASE-11623-trunk-v1.patch > > > mutateRowsWithLocks will acquire updatesLock.readLock by the following code: > {code} > ... > lock(this.updatesLock.readLock(), acquiredRowLocks.size()); > ... > {code} > However, acquiredRowLocks might be empty, and then the waitTime of > HRegion.lock(...) will be set to 0, which will make mutateRowsWithLocks fail > if can not acquire updatesLock.readLock immediately. > In our environment, we implement a region coprocessor which need to hold row > locks before invoke mutateRowsWithLocks. Then, the rowsToLock(passed to > mutateRowsWithLocks) will be an empty set, and we get the following exception > occasionally: > {code} > org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 0ms > > > 582 at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:6191) > 583 at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5126) > 584 at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5034) > ... > {code} > Is it reasonable that we use default waitTime when rowsToLock is empty? (as > the following code) > {code} > lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : > acquiredRowLocks.size()); > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11623) mutateRowsWithLocks might require updatesLock.readLock with waitTime=0
[ https://issues.apache.org/jira/browse/HBASE-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cuijianwei updated HBASE-11623: --- Attachment: HBASE-11623-trunk-v1.patch > mutateRowsWithLocks might require updatesLock.readLock with waitTime=0 > -- > > Key: HBASE-11623 > URL: https://issues.apache.org/jira/browse/HBASE-11623 > Project: HBase > Issue Type: Improvement > Components: regionserver >Affects Versions: 0.96.1.1, 0.94.21, 0.98.4 >Reporter: cuijianwei >Priority: Minor > Attachments: HBASE-11623-trunk-v1.patch > > > mutateRowsWithLocks will acquire updatesLock.readLock by the following code: > {code} > ... > lock(this.updatesLock.readLock(), acquiredRowLocks.size()); > ... > {code} > However, acquiredRowLocks might be empty, and then the waitTime of > HRegion.lock(...) will be set to 0, which will make mutateRowsWithLocks fail > if can not acquire updatesLock.readLock immediately. > In our environment, we implement a region coprocessor which need to hold row > locks before invoke mutateRowsWithLocks. Then, the rowsToLock(passed to > mutateRowsWithLocks) will be an empty set, and we get the following exception > occasionally: > {code} > org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 0ms > > > 582 at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:6191) > 583 at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5126) > 584 at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5034) > ... > {code} > Is it reasonable that we use default waitTime when rowsToLock is empty? (as > the following code) > {code} > lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : > acquiredRowLocks.size()); > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11623) mutateRowsWithLocks might require updatesLock.readLock with waitTime=0
[ https://issues.apache.org/jira/browse/HBASE-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080492#comment-14080492 ] Ted Yu commented on HBASE-11623: Sounds good. Do you want to attach a patch? > mutateRowsWithLocks might require updatesLock.readLock with waitTime=0 > -- > > Key: HBASE-11623 > URL: https://issues.apache.org/jira/browse/HBASE-11623 > Project: HBase > Issue Type: Improvement > Components: regionserver >Affects Versions: 0.96.1.1, 0.94.21, 0.98.4 >Reporter: cuijianwei >Priority: Minor > > mutateRowsWithLocks will acquire updatesLock.readLock by the following code: > {code} > ... > lock(this.updatesLock.readLock(), acquiredRowLocks.size()); > ... > {code} > However, acquiredRowLocks might be empty, and then the waitTime of > HRegion.lock(...) will be set to 0, which will make mutateRowsWithLocks fail > if can not acquire updatesLock.readLock immediately. > In our environment, we implement a region coprocessor which need to hold row > locks before invoke mutateRowsWithLocks. Then, the rowsToLock(passed to > mutateRowsWithLocks) will be an empty set, and we get the following exception > occasionally: > {code} > org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 0ms > > > 582 at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:6191) > 583 at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5126) > 584 at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5034) > ... > {code} > Is it reasonable that we use default waitTime when rowsToLock is empty? (as > the following code) > {code} > lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : > acquiredRowLocks.size()); > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11623) mutateRowsWithLocks might require updatesLock.readLock with waitTime=0
[ https://issues.apache.org/jira/browse/HBASE-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cuijianwei updated HBASE-11623: --- Description: mutateRowsWithLocks will acquire updatesLock.readLock by the following code: {code} ... lock(this.updatesLock.readLock(), acquiredRowLocks.size()); ... {code} However, acquiredRowLocks might be empty, and then the waitTime of HRegion.lock(...) will be set to 0, which will make mutateRowsWithLocks fail if can not acquire updatesLock.readLock immediately. In our environment, we implement a region coprocessor which need to hold row locks before invoke mutateRowsWithLocks. Then, the rowsToLock(passed to mutateRowsWithLocks) will be an empty set, and we get the following exception occasionally: {code} org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 0ms 582 at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:6191) 583 at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5126) 584 at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5034) ... {code} Is it reasonable that we use default waitTime when rowsToLock is empty? (as the following code) {code} lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : acquiredRowLocks.size()); {code} was: mutateRowsWithLocks will acquire updatesLock.readLock by the following code: {code} ... lock(this.updatesLock.readLock(), acquiredRowLocks.size()); ... {code} However, acquiredRowLocks might be empty, and then the waitTime of HRegion.lock(...) will be set to 0, which will make mutateRowsWithLocks fail if can not acquire updatesLock.readLock immediately. In our environment, we implement a region coprocessor which need to hold row locks before invoke mutateRowsWithLocks. 
Then, the rowsToLock(passed to mutateRowsWithLocks) will be an empty set, and we get the following exception occasionally: {code} org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 0ms 582 at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:6191) 583 at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5126) 584 at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5034) ... {code} Is it reasonable that we use default waitTime when rowsToLock is empty? (as the following code) {code} lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : acquiredRowLocks.size() ); {code} > mutateRowsWithLocks might require updatesLock.readLock with waitTime=0 > -- > > Key: HBASE-11623 > URL: https://issues.apache.org/jira/browse/HBASE-11623 > Project: HBase > Issue Type: Improvement > Components: regionserver >Affects Versions: 0.96.1.1, 0.94.21, 0.98.4 >Reporter: cuijianwei >Priority: Minor > > mutateRowsWithLocks will acquire updatesLock.readLock by the following code: > {code} > ... > lock(this.updatesLock.readLock(), acquiredRowLocks.size()); > ... > {code} > However, acquiredRowLocks might be empty, and then the waitTime of > HRegion.lock(...) will be set to 0, which will make mutateRowsWithLocks fail > if can not acquire updatesLock.readLock immediately. > In our environment, we implement a region coprocessor which need to hold row > locks before invoke mutateRowsWithLocks. Then, the rowsToLock(passed to > mutateRowsWithLocks) will be an empty set, and we get the following exception > occasionally: > {code} > org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 0ms > > > 582 at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:6191) > 583 at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5126) > 584 at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5034) > ... 
> {code} > Is it reasonable that we use default waitTime when rowsToLock is empty? (as > the following code) > {code} > lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : > acquiredRowLocks.size()); > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11623) mutateRowsWithLocks might require updatesLock.readLock with waitTime=0
[ https://issues.apache.org/jira/browse/HBASE-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cuijianwei updated HBASE-11623: --- Description: mutateRowsWithLocks will acquire updatesLock.readLock by the following code: {code} ... lock(this.updatesLock.readLock(), acquiredRowLocks.size()); ... {code} However, acquiredRowLocks might be empty, and then the waitTime of HRegion.lock(...) will be set to 0, which will make mutateRowsWithLocks fail if can not acquire updatesLock.readLock immediately. In our environment, we implement a region coprocessor which need to hold row locks before invoke mutateRowsWithLocks. Then, the rowsToLock(passed to mutateRowsWithLocks) will be an empty set, and we get the following exception occasionally: {code} org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 0ms 582 at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:6191) 583 at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5126) 584 at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5034) ... {code} Is it reasonable that we use default waitTime when rowsToLock is empty? (as the following code) {code} lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : acquiredRowLocks.size() ); {code} was: mutateRowsWithLocks will acquire updatesLock.readLock by the following code: {code} ... lock(this.updatesLock.readLock(), acquiredRowLocks.size()); ... {code} However, acquiredRowLocks might be empty, and then the waitTime of HRegion.lock(...) will be set to 0, which will make mutateRowsWithLocks fail if can not acquire updatesLock.readLock immediately. In our environment, we implement a region coprocessor which need to hold row locks before invoke mutateRowsWithLocks. 
Then, the rowsToLock(passed to mutateRowsWithLocks) will be an empty set, and we get the following exception occasionally: {code} org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 0ms 582 at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:6191) 583 at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5126) 584 at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5034) ... {code} Is it reasonable that we use default waitTime when rowsToLock is empty? (as the following code) {code} lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : acquiredRowLocks.size() ); {code} > mutateRowsWithLocks might require updatesLock.readLock with waitTime=0 > -- > > Key: HBASE-11623 > URL: https://issues.apache.org/jira/browse/HBASE-11623 > Project: HBase > Issue Type: Improvement > Components: regionserver >Affects Versions: 0.96.1.1, 0.94.21, 0.98.4 >Reporter: cuijianwei >Priority: Minor > > mutateRowsWithLocks will acquire updatesLock.readLock by the following code: > {code} > ... > lock(this.updatesLock.readLock(), acquiredRowLocks.size()); > ... > {code} > However, acquiredRowLocks might be empty, and then the waitTime of > HRegion.lock(...) will be set to 0, which will make mutateRowsWithLocks fail > if can not acquire updatesLock.readLock immediately. > In our environment, we implement a region coprocessor which need to hold row > locks before invoke mutateRowsWithLocks. Then, the rowsToLock(passed to > mutateRowsWithLocks) will be an empty set, and we get the following exception > occasionally: > {code} > org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 0ms > > > 582 at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:6191) > 583 at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5126) > 584 at > org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5034) > ... 
> {code} > Is it reasonable that we use default waitTime when rowsToLock is empty? (as > the following code) > {code} > lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : > acquiredRowLocks.size() ); > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HBASE-11623) mutateRowsWithLocks might require updatesLock.readLock with waitTime=0
cuijianwei created HBASE-11623: -- Summary: mutateRowsWithLocks might require updatesLock.readLock with waitTime=0 Key: HBASE-11623 URL: https://issues.apache.org/jira/browse/HBASE-11623 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.98.4, 0.94.21, 0.96.1.1 Reporter: cuijianwei Priority: Minor mutateRowsWithLocks will acquire updatesLock.readLock by the following code: {code} ... lock(this.updatesLock.readLock(), acquiredRowLocks.size()); ... {code} However, acquiredRowLocks might be empty, and then the waitTime of HRegion.lock(...) will be set to 0, which will make mutateRowsWithLocks fail if can not acquire updatesLock.readLock immediately. In our environment, we implement a region coprocessor which need to hold row locks before invoke mutateRowsWithLocks. Then, the rowsToLock(passed to mutateRowsWithLocks) will be an empty set, and we get the following exception occasionally: {code} org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 0ms 582 at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:6191) 583 at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5126) 584 at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:5034) ... {code} Is it reasonable that we use default waitTime when rowsToLock is empty? (as the following code) {code} lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : acquiredRowLocks.size() ); {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11621) Make MiniDFSCluster run faster
[ https://issues.apache.org/jira/browse/HBASE-11621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-11621: --- Attachment: 11621-0.98.txt Patch for 0.98. If EditLogFileOutputStream is not accessible, log the fact and continue. > Make MiniDFSCluster run faster > -- > > Key: HBASE-11621 > URL: https://issues.apache.org/jira/browse/HBASE-11621 > Project: HBase > Issue Type: Task >Reporter: Ted Yu >Assignee: Ted Yu > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: 11621-0.98.txt, 11621-v1.txt > > > Daryn proposed the following change in HDFS-6773: > {code} > EditLogFileOutputStream.setShouldSkipFsyncForTesting(true); > {code} > With this change in HBaseTestingUtility#startMiniDFSCluster(), runtime for > TestAdmin went from 8:35 min to 7 min -- This message was sent by Atlassian JIRA (v6.2#6252)
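The "log the fact and continue" hedge described for the 0.98 patch can be sketched with reflection: look up the HDFS test hook at runtime and degrade gracefully when the class or method is absent. This is a rough illustration of the technique, not the attached patch itself; the class and method names are the real HDFS ones quoted in the issue.

```java
import java.lang.reflect.Method;

// Sketch: call EditLogFileOutputStream.setShouldSkipFsyncForTesting(true)
// reflectively so that on classpaths where the HDFS class is missing or
// incompatible we log the problem and keep going instead of failing.
public class SkipFsyncSketch {
    public static boolean trySkipFsync() {
        try {
            Class<?> clazz = Class.forName(
                "org.apache.hadoop.hdfs.server.namenode.EditLogFileOutputStream");
            Method m = clazz.getMethod("setShouldSkipFsyncForTesting", boolean.class);
            m.invoke(null, true); // static method, so no receiver instance
            return true;
        } catch (ReflectiveOperationException | LinkageError e) {
            // Not accessible on this classpath: note it and carry on.
            System.out.println("Could not skip fsync for tests: " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        // true when hadoop-hdfs (with the test hook) is on the classpath,
        // false otherwise; either way the caller proceeds.
        System.out.println(trySkipFsync());
    }
}
```

Skipping edit-log fsync is safe only for tests, which is why the hook lives behind a `ForTesting` method.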
[jira] [Commented] (HBASE-11516) Track time spent in executing coprocessors in each region.
[ https://issues.apache.org/jira/browse/HBASE-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080483#comment-14080483 ] Andrew Purtell commented on HBASE-11516: Going to commit tomorrow unless objection > Track time spent in executing coprocessors in each region. > -- > > Key: HBASE-11516 > URL: https://issues.apache.org/jira/browse/HBASE-11516 > Project: HBase > Issue Type: Improvement > Components: Coprocessors >Affects Versions: 0.98.4 >Reporter: Srikanth Srungarapu >Assignee: Srikanth Srungarapu >Priority: Minor > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: HBASE-11516.patch, HBASE-11516_v2.patch, > HBASE-11516_v3.patch, HBASE-11516_v4.patch, HBASE-11516_v4_master.patch, > region_server_webui.png, rs_web_ui_v2.png > > > Currently, the time spent in executing coprocessors is not yet being tracked. > This feature can be handy for debugging coprocessors in case of any trouble. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11516) Track time spent in executing coprocessors in each region.
[ https://issues.apache.org/jira/browse/HBASE-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-11516: --- Attachment: HBASE-11516_v4_master.patch v4 patch is against 0.98, attaching patch for master and branch-1 > Track time spent in executing coprocessors in each region. > -- > > Key: HBASE-11516 > URL: https://issues.apache.org/jira/browse/HBASE-11516 > Project: HBase > Issue Type: Improvement > Components: Coprocessors >Affects Versions: 0.98.4 >Reporter: Srikanth Srungarapu >Assignee: Srikanth Srungarapu >Priority: Minor > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: HBASE-11516.patch, HBASE-11516_v2.patch, > HBASE-11516_v3.patch, HBASE-11516_v4.patch, HBASE-11516_v4_master.patch, > region_server_webui.png, rs_web_ui_v2.png > > > Currently, the time spent in executing coprocessors is not yet being tracked. > This feature can be handy for debugging coprocessors in case of any trouble. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080472#comment-14080472 ] Hudson commented on HBASE-11558: FAILURE: Integrated in hbase-0.96-hadoop2 #287 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/287/]) HBASE-11558 Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ (Ishan Chhabra) (ndimiduk: rev efdbe072ef7e910259360bfb01bc4200eab86a4f) * hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * hbase-protocol/src/main/protobuf/Client.proto * hbase-server/src/test/java/org/apache/hadoop/hbase/protobuf/TestProtobufUtil.java * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch, HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. 
This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-9531) a command line (hbase shell) interface to retreive the replication metrics and show replication lag
[ https://issues.apache.org/jira/browse/HBASE-9531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080450#comment-14080450 ] Hadoop QA commented on HBASE-9531: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658820/HBASE-9531-master-v2.patch against trunk revision . ATTACHMENT ID: 12658820 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. 
The patch introduces the following lines longer than 100: + new java.lang.String[] { "PeerID", "AgeOfLastShippedOp", "SizeOfLogQueue", "TimeStampOfLastShippedOp", "ReplicationLag", }); + new java.lang.String[] { "NumberOfRequests", "TotalNumberOfRequests", "UsedHeapMB", "MaxHeapMB", "RegionLoads", "Coprocessors", "ReportStartTime", "ReportEndTime", "InfoServerPort", "ReplLoadSource", "ReplLoadSink", }); +if (!@admin.getConfiguration().getBoolean(org.apache.hadoop.hbase.HConstants::REPLICATION_ENABLE_KEY, org.apache.hadoop.hbase.HConstants::REPLICATION_ENABLE_DEFAULT)) +rSinkString << ", TimeStampsOfLastAppliedOp=" + (java.util.Date.new(rLoadSink.getTimeStampsOfLastAppliedOp())).toString() + rSourceString << ", TimeStampsOfLastShippedOp=" + (java.util.Date.new(rLoadSource.getTimeStampOfLastShippedOp())).toString() {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction org.apache.hadoop.hbase.master.TestMasterOperationsForRegionReplicas org.apache.hadoop.hbase.regionserver.TestRegionReplicas org.apache.hadoop.hbase.client.TestReplicasClient org.apache.hadoop.hbase.master.TestRestartCluster org.apache.hadoop.hbase.TestIOFencing {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10238//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10238//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10238//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10238//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10238//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10238//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10238//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10238//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10238//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10238//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10238//console This message is automatically generated. > a command line (hbase shell) interface to retreive the replication metrics > and show replication lag > --- > > Key: HBASE-9531 > URL: https://issues.apache.org/jira/browse/HBASE-9531 > Project: HBase > Issue Type: New Feature > Components: Replication >Affects Versions: 0.99.0 >Reporter: Demai Ni >Assignee: Demai Ni > Fix For: 0.99.0, 0.98.5 > > Attachments: HBASE-9531-master-v1.patch, HBASE-9531-master-v1.patch, > HBASE-9531
[jira] [Commented] (HBASE-11567) Write bulk load events to WAL
[ https://issues.apache.org/jira/browse/HBASE-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080422#comment-14080422 ] ryan rawson commented on HBASE-11567: - also, what's up with the failing tests? How could this patch conceivably affect org.apache.hadoop.hbase.migration.TestNamespaceUpgrade? Just a general expression of annoyance at test brittleness. > Write bulk load events to WAL > - > > Key: HBASE-11567 > URL: https://issues.apache.org/jira/browse/HBASE-11567 > Project: HBase > Issue Type: Sub-task >Reporter: Enis Soztutar >Assignee: Alex Newman > Attachments: HBASE-11567-v1.patch > > > Similar to writing flush (HBASE-11511), compaction(HBASE-2231) to WAL and > region open/close (HBASE-11512) , we should persist bulk load events to WAL. > This is especially important for secondary region replicas, since we can use > this information to pick up primary regions' files from secondary replicas. > A design doc for secondary replica replication can be found at HBASE-11183. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11567) Write bulk load events to WAL
[ https://issues.apache.org/jira/browse/HBASE-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080421#comment-14080421 ] ryan rawson commented on HBASE-11567: - Good generally speaking. The bulk load tests should really verify more behavior. In the 'successful' cases, what kind of behavior has changed about the HRegion that we could check? (if anything - it might not be feasible since HRegion isn't SOLID) I also could imagine a series of integration tests that verify that the data in the bulk loaded file is readable. There are also some stress tests that involve loading during concurrent operations and what should happen in those cases. That is a separate integration test. commented on 154f10b hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java:L2748: I want to see a space between ){ commented on 154f10b hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java:L2750: what's the line length max? Are we on Java7, can we use diamonds now? commented on 154f10b hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java:L2751: spacing here, between the : and the ){ commented on 154f10b hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java:L322: I'm a little vaguely unhappy with how creating all these structures is spread about, but that is a greater issue, and we can't fix it now. commented on 154f10b hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:L3667: spacing here: if (log != null) { commented on 154f10b hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:L3672: so the log doesn't really know how to write business-logicy entries, so we have a utility class that does it. So who is really responsible for doing these things? The obvious design solution is a wrapper around HLog which allows a pluggable log under it; we should file a new HBase JIRA. 
One thing that bothers me about this line is that we are hiding a pretty heavy-duty IO event, and it's not really apparent; maybe change the name to HLogUtil.writeBulkLoadMarkerAndSync() so it's apparent commented on 154f10b hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestBulkLoad.java:L46: writes to files = not a unit test commented on 154f10b hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestBulkLoad.java:L64: are these constants? If so, final them, upcaps them commented on 154f10b hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestBulkLoad.java:L76: Isn't there already a constant that has this in it somewhere else? The answer is yes, a package-protected one in WALEdit - let's not copy magic constants about, DRY that up commented on 154f10b hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestBulkLoad.java:L121: hide this (and others like this) behind a static function commented on 154f10b hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestBulkLoad.java:L144: space commented on 154f10b hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestBulkLoad.java:L168: maybe call this bulkLogWalEditType -- it might read better when it's being used. > Write bulk load events to WAL > - > > Key: HBASE-11567 > URL: https://issues.apache.org/jira/browse/HBASE-11567 > Project: HBase > Issue Type: Sub-task >Reporter: Enis Soztutar >Assignee: Alex Newman > Attachments: HBASE-11567-v1.patch > > > Similar to writing flush (HBASE-11511), compaction (HBASE-2231) to WAL and > region open/close (HBASE-11512), we should persist bulk load events to WAL. > This is especially important for secondary region replicas, since we can use > this information to pick up primary regions' files from secondary replicas. > A design doc for secondary replica replication can be found at HBASE-11183. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11567) Write bulk load events to WAL
[ https://issues.apache.org/jira/browse/HBASE-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080408#comment-14080408 ] Hadoop QA commented on HBASE-11567: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658800/HBASE-11567-v1.patch against trunk revision . ATTACHMENT ID: 12658800 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings). {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: + * Generates a marker for the WAL so that we propagate the notion of a bulk region load throughout the WAL. + * @param familyPaths A list of pairs which maps the name of the column family to the location on disk where it is + * @param assignSeqId Whether or not to assign a new sequence ID (by way of calling flush) or to use the one provided + List> familyPaths, + new java.lang.String[] { "TableName", "EncodedRegionName", "FamilyPath", "AssignSeqNum", }); + WALProtos.BulkLoadDescriptor loadDescriptor = ProtobufUtil.toBulkLoadDescriptor(this.getRegionInfo().getTable(), + * @param sequenceId The current sequenceId in the log at the time when we were to write the bulk load marker. 
+ public static WALEdit createBulkLoadEvent(HRegionInfo hri, WALProtos.BulkLoadDescriptor bulkLoadDescriptor) { +testRegionWithFamilies(family1, family2).bulkLoadHFiles(withFamilyPathsFor(family1, family2), false); + public void bulkHLogShouldThrowErrorWhenFamilySpecifiedAndHFileExistsButNotInTableDescriptor() throws IOException { {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.TestIOFencing org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction org.apache.hadoop.hbase.regionserver.TestRegionReplicas org.apache.hadoop.hbase.migration.TestNamespaceUpgrade org.apache.hadoop.hbase.client.TestReplicasClient org.apache.hadoop.hbase.master.TestRestartCluster Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10237//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10237//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10237//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10237//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10237//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10237//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10237//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10237//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10237//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10237//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10237//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10237//console This message is automatically generated. > Write bulk load events to WAL > - > > Key: HBASE-11567 > URL: https://issues.apache.org/jira/browse/HBASE-11567 > Project: HBase > Issue Type: Sub-task >Reporter: Enis Soztutar >
[jira] [Resolved] (HBASE-11622) completebulkload/loadIncrementalHFiles cannot specify table with namespace
[ https://issues.apache.org/jira/browse/HBASE-11622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianshi Huang resolved HBASE-11622. --- Resolution: Fixed Release Note: Already fixed by HBASE-11609 > completebulkload/loadIncrementalHFiles cannot specify table with namespace > -- > > Key: HBASE-11622 > URL: https://issues.apache.org/jira/browse/HBASE-11622 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: Jianshi Huang > > I'm using completebulkload to load 500GB of data to a pre-split table. > However, it reports the following errors: > Looks like completebulkload didn't recognize the namespace part > (namespace:table). > Caused by: java.net.URISyntaxException: Relative path in absolute URI: > grapple:vertices,37.bottom > at java.net.URI.checkPath(URI.java:1804) > at java.net.URI.(URI.java:752) > at org.apache.hadoop.fs.Path.initialize(Path.java:203) > By looking at the source code of LoadIncrementalHFiles.java, it seems the > temporary path created for splitting will contain ':'. > The failing part is this: > String uniqueName = getUniqueName(table.getName()); > HColumnDescriptor familyDesc = > table.getTableDescriptor().getFamily(item.family); > Path botOut = new Path(tmpDir, uniqueName + ".bottom"); > Path topOut = new Path(tmpDir, uniqueName + ".top"); > splitStoreFile(getConf(), hfilePath, familyDesc, splitKey, > botOut, topOut); > uniqueName will be "namespace:table" so "new Path(...)" will fail. > A bug? > P.S. > Comment from Matteo Bertozzi: > we don't need the name to be related to the table name. > We can replace getUniqueName() with something like this: > String getUniqueName(TableName tableName) { > String name = UUID.randomUUID().toString().replaceAll("-", "") + > "," + regionCount.incrementAndGet(); > return name; > } -- This message was sent by Atlassian JIRA (v6.2#6252)
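The failure mode and the proposed fix above can be sketched with plain JDK classes. This is an illustrative model only: `failsAsPath` mimics what org.apache.hadoop.fs.Path.initialize() does internally (it treats everything before the first ':' as a URI scheme and builds a multi-argument java.net.URI, whose path check rejects a relative path when a scheme is present), and the unique-name generator follows Matteo Bertozzi's snippet. The class and method names are made up for the sketch.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;

public class UniqueNameSketch {
    private static final AtomicInteger REGION_COUNT = new AtomicInteger();

    // Proposed replacement: unrelated to the table name, so it can
    // never contain the ':' namespace separator.
    static String getUniqueName() {
        return UUID.randomUUID().toString().replaceAll("-", "")
            + "," + REGION_COUNT.incrementAndGet();
    }

    // Illustrative stand-in for Path.initialize(): a leading "scheme:"
    // plus a relative path triggers "Relative path in absolute URI".
    static boolean failsAsPath(String name) {
        int colon = name.indexOf(':');
        String scheme = colon < 0 ? null : name.substring(0, colon);
        String rest = colon < 0 ? name : name.substring(colon + 1);
        try {
            new URI(scheme, null, rest, null, null);
            return false;
        } catch (URISyntaxException e) {
            return true;  // URI.checkPath rejected scheme + relative path
        }
    }

    public static void main(String[] args) {
        System.out.println(failsAsPath("grapple:vertices,37.bottom")); // true
        System.out.println(failsAsPath(getUniqueName() + ".bottom"));  // false
    }
}
```

This shows why decoupling the temporary file name from the (possibly namespaced) table name is enough to avoid the URISyntaxException.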
[jira] [Commented] (HBASE-11616) TestNamespaceUpgrade fails in trunk
[ https://issues.apache.org/jira/browse/HBASE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080399#comment-14080399 ] Hudson commented on HBASE-11616: FAILURE: Integrated in HBase-TRUNK #5357 (See [https://builds.apache.org/job/HBase-TRUNK/5357/]) HBASE-11616 TestNamespaceUpgrade fails in trunk (jxiang: rev f1c5741f9b3b2c0ab8ed8cc85757098aa51a8728) * hbase-server/src/test/java/org/apache/hadoop/hbase/migration/TestNamespaceUpgrade.java > TestNamespaceUpgrade fails in trunk > --- > > Key: HBASE-11616 > URL: https://issues.apache.org/jira/browse/HBASE-11616 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Jimmy Xiang > Fix For: 2.0.0 > > Attachments: hbase-11616.patch > > > I see the following in test output: > {code} > type="org.apache.hadoop.hbase.client.RetriesExhaustedException">org.apache.hadoop.hbase.client.RetriesExhaustedException: > Can't get the location > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:287) > at > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:132) > at > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:1) > at > org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:179) > at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:287) > at > org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267) > at > org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139) > at > org.apache.hadoop.hbase.client.ClientScanner.(ClientScanner.java:134) > at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:814) > at > org.apache.hadoop.hbase.migration.TestNamespaceUpgrade.setUpBeforeClass(TestNamespaceUpgrade.java:147) > ... 
> Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: No > server address listed in hbase:meta for region hbase:acl,,1376029204842. > 06dfcfc239196403c5f1135b91dedc64. containing row > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1233) > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1099) > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:279) > ... 31 more > {code} > The cause for the above error is that the _acl_ table contained in the image > (w.r.t. hbase:meta table) doesn't have server address. > [~jxiang]: What do you think would be proper fix ? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-9531) a command line (hbase shell) interface to retreive the replication metrics and show replication lag
[ https://issues.apache.org/jira/browse/HBASE-9531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Demai Ni updated HBASE-9531: Attachment: HBASE-9531-master-v2.patch Uploaded v2 patch for master, changed according to [~enis]'s suggestion > a command line (hbase shell) interface to retreive the replication metrics > and show replication lag > --- > > Key: HBASE-9531 > URL: https://issues.apache.org/jira/browse/HBASE-9531 > Project: HBase > Issue Type: New Feature > Components: Replication >Affects Versions: 0.99.0 >Reporter: Demai Ni >Assignee: Demai Ni > Fix For: 0.99.0, 0.98.5 > > Attachments: HBASE-9531-master-v1.patch, HBASE-9531-master-v1.patch, > HBASE-9531-master-v1.patch, HBASE-9531-master-v2.patch, > HBASE-9531-trunk-v0.patch, HBASE-9531-trunk-v0.patch > > > This jira is to provide a command line (hbase shell) interface to retrieve > the replication metrics info such as: ageOfLastShippedOp, > timeStampsOfLastShippedOp, sizeOfLogQueue, ageOfLastAppliedOp, and > timeStampsOfLastAppliedOp, and also to provide point-in-time info on the > replication lag (source only). > Understand that hbase is using Hadoop > metrics (http://hbase.apache.org/metrics.html), which is a common way to > monitor metric info. This jira is to serve as a light-weight client > interface, compared to a complete (certainly better, but heavier) GUI > monitoring package. I made the code work on 0.94.9 for now, and would like to > use this jira to get opinions about whether the feature is valuable to other > users/workshops. If so, I will build a trunk patch. > All inputs are greatly appreciated. Thank you! > The overall design is to reuse the existing logic which supports the hbase > shell command 'status', and introduce a new module, called ReplicationLoad. > In HRegionServer.buildServerLoad(), use the local replication service objects > to get their loads, which could be wrapped in a ReplicationLoad object and > then simply passed to the ServerLoad. 
In ReplicationSourceMetrics and > ReplicationSinkMetrics, a few getters and setters will be created, and ask > Replication to build a "ReplicationLoad". (many thanks to Jean-Daniel for > his kind suggestions through the dev email list) > The replication lag will be calculated for the source only, using this formula: > {code:title=Replication lag|borderStyle=solid} > if sizeOfLogQueue != 0 then lag = max(ageOfLastShippedOp, (current time - > timeStampsOfLastShippedOp)) // err on the large side > else if (current time - timeStampsOfLastShippedOp) < 2 * > ageOfLastShippedOp then lag = ageOfLastShippedOp // last shipped happened > recently > else lag = 0 // last shipped may have happened last night, so NO real lag > although ageOfLastShippedOp is non-zero > {code} > The output will look something like: > {code:title=status 'replication'|borderStyle=solid} > hbase(main):001:0> status 'replication' > version 0.94.9 > 3 live servers > hdtest017.svl.ibm.com: > SOURCE: PeerID=1, ageOfLastShippedOp=14, sizeOfLogQueue=0, > timeStampsOfLastShippedOp=Wed Sep 04 14:49:48 PDT 2013 > SINK: AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Wed Sep 04 > 14:48:48 PDT 2013 > hdtest018.svl.ibm.com: > SOURCE: PeerID=1, ageOfLastShippedOp=0, sizeOfLogQueue=0, > timeStampsOfLastShippedOp=Wed Sep 04 14:48:48 PDT 2013 > SINK: AgeOfLastAppliedOp=14, TimeStampsOfLastAppliedOp=Wed Sep 04 > 14:50:59 PDT 2013 > hdtest015.svl.ibm.com: > SOURCE: PeerID=1, ageOfLastShippedOp=0, sizeOfLogQueue=0, > timeStampsOfLastShippedOp=Wed Sep 04 14:48:48 PDT 2013 > SINK: AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Wed Sep 04 > 14:48:48 PDT 2013 > hbase(main):002:0> status 'replication','source' > version 0.94.9 > 3 live servers > hdtest017.svl.ibm.com: > SOURCE: PeerID=1, ageOfLastShippedOp=14, sizeOfLogQueue=0, > timeStampsOfLastShippedOp=Wed Sep 04 14:49:48 PDT 2013 > hdtest018.svl.ibm.com: > SOURCE: PeerID=1, ageOfLastShippedOp=0, sizeOfLogQueue=0, > 
timeStampsOfLastShippedOp=Wed Sep 04 14:48:48 PDT 2013 > hdtest015.svl.ibm.com: > SOURCE: PeerID=1, ageOfLastShippedOp=0, sizeOfLogQueue=0, > timeStampsOfLastShippedOp=Wed Sep 04 14:48:48 PDT 2013 > hbase(main):003:0> status 'replication','sink' > version 0.94.9 > 3 live servers > hdtest017.svl.ibm.com: > SINK: AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Wed Sep 04 > 14:48:48 PDT 2013 > hdtest018.svl.ibm.com: > SINK: AgeOfLastAppliedOp=14, TimeStampsOfLastAppliedOp=Wed Sep 04 > 14:50:59 PDT 2013 > hdtest015.svl.ibm.com: > SINK: AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Wed Sep 04 > 14:48:48 PDT 2013 > hbase(main):003:0> status 'replication','l
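The lag formula quoted in the HBASE-9531 description can be written out as a small self-contained sketch. Method and parameter names here are illustrative, not the names used in the eventual patch:

```java
public class ReplicationLagSketch {
    // Source-side lag, per the formula in the issue description.
    static long lag(long ageOfLastShippedOp, int sizeOfLogQueue,
                    long timestampOfLastShippedOp, long now) {
        long sinceLastShipped = now - timestampOfLastShippedOp;
        if (sizeOfLogQueue != 0) {
            // Queue not drained: err on the large side.
            return Math.max(ageOfLastShippedOp, sinceLastShipped);
        } else if (sinceLastShipped < 2 * ageOfLastShippedOp) {
            // Last shipment happened recently.
            return ageOfLastShippedOp;
        } else {
            // Last shipment may be hours old; a non-zero age is stale,
            // so report no real lag.
            return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(lag(14, 0, 990, 1000)); // recent ship -> 14
        System.out.println(lag(14, 3, 900, 1000)); // queued -> max(14, 100) = 100
        System.out.println(lag(14, 0, 0, 1000));   // stale age -> 0
    }
}
```

The three branches correspond one-to-one with the `{code}` block above; the middle case exists so a recently reported age is trusted even though the queue is empty.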
[jira] [Created] (HBASE-11622) completebulkload/loadIncrementalHFiles cannot specify table with namespace
Jianshi Huang created HBASE-11622: - Summary: completebulkload/loadIncrementalHFiles cannot specify table with namespace Key: HBASE-11622 URL: https://issues.apache.org/jira/browse/HBASE-11622 Project: HBase Issue Type: Bug Affects Versions: 0.98.0 Reporter: Jianshi Huang I'm using completebulkload to load 500GB of data to a table (presplitted). However, it reports the following errors: Looks like completebulkload didn't recognize the namespace part (namespace:table). Caused by: java.net.URISyntaxException: Relative path in absolute URI: grapple:vertices,37.bottom at java.net.URI.checkPath(URI.java:1804) at java.net.URI.(URI.java:752) at org.apache.hadoop.fs.Path.initialize(Path.java:203) By looking at the source code of LoadIncrementalHFiles.java, it seems the temporary path created for splitting will contain ':', The error part should be this: String uniqueName = getUniqueName(table.getName()); HColumnDescriptor familyDesc = table.getTableDescriptor().getFamily(item.family); Path botOut = new Path(tmpDir, uniqueName + ".bottom"); Path topOut = new Path(tmpDir, uniqueName + ".top"); splitStoreFile(getConf(), hfilePath, familyDesc, splitKey, botOut, topOut); uniqueName will be "namespce:table" so "new Path(...)" will fail. A bug? P.S. Comment from Matteo Bertozzi: we don't need the name to be related to the table name. We can replace the getUniqueName() using something like this String getUniqueName(TableName tableName) { String name = UUID.randomUUID().toString().replaceAll("-", "") + "," + regionCount.incrementAndGet(); return name; } -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11621) Make MiniDFSCluster run faster
[ https://issues.apache.org/jira/browse/HBASE-11621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080390#comment-14080390 ] Ted Yu commented on HBASE-11621: Looks like some reflection is needed if this change goes to 0.98 For hadoop-1, I got: {code} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hbase-server: Compilation failure: Compilation failure: [ERROR] /Users/tyu/98/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[102,45] error: cannot find symbol [ERROR] symbol: class EditLogFileOutputStream [ERROR] location: package org.apache.hadoop.hdfs.server.namenode [ERROR] /Users/tyu/98/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[510,4] error: cannot find symbol {code} > Make MiniDFSCluster run faster > -- > > Key: HBASE-11621 > URL: https://issues.apache.org/jira/browse/HBASE-11621 > Project: HBase > Issue Type: Task >Reporter: Ted Yu >Assignee: Ted Yu > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: 11621-v1.txt > > > Daryn proposed the following change in HDFS-6773: > {code} > EditLogFileOutputStream.setShouldSkipFsyncForTesting(true); > {code} > With this change in HBaseTestingUtility#startMiniDFSCluster(), runtime for > TestAdmin went from 8:35 min to 7 min -- This message was sent by Atlassian JIRA (v6.2#6252)
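One hedged sketch of how the fsync-skip call could be guarded with reflection, so the same code compiles and runs against hadoop-1, where EditLogFileOutputStream does not exist. The class and method names come from the HDFS-6773 snippet quoted in the issue; everything else is illustrative:

```java
public class SkipFsyncSketch {
    // Returns true when the hadoop-2 class was found and the flag was set,
    // false when the class is absent (e.g. hadoop-1, or no hadoop at all).
    static boolean trySkipFsyncForTesting() {
        try {
            Class<?> c = Class.forName(
                "org.apache.hadoop.hdfs.server.namenode.EditLogFileOutputStream");
            c.getMethod("setShouldSkipFsyncForTesting", boolean.class)
             .invoke(null, true);  // static method, so null receiver
            return true;
        } catch (ReflectiveOperationException e) {
            return false;  // class/method missing: keep default behavior
        }
    }

    public static void main(String[] args) {
        // Without hadoop on the classpath this prints "false".
        System.out.println(trySkipFsyncForTesting());
    }
}
```

The reflective lookup happens once at test setup, so the cost is negligible next to the MiniDFSCluster startup it speeds up.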
[jira] [Commented] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
[ https://issues.apache.org/jira/browse/HBASE-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080379#comment-14080379 ] Hudson commented on HBASE-11615: FAILURE: Integrated in HBase-TRUNK #5356 (See [https://builds.apache.org/job/HBase-TRUNK/5356/]) HBASE-11615 TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins (jxiang: rev 22445d0ebb6d8f25b25f522a8e9b43b0433930f3) * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerOnCluster.java > TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins > --- > > Key: HBASE-11615 > URL: https://issues.apache.org/jira/browse/HBASE-11615 > Project: HBase > Issue Type: Test > Components: master >Reporter: Mike Drob >Assignee: Jimmy Xiang > Fix For: 1.0.0, 2.0.0 > > Attachments: hbase-11615.patch > > > Failed on branch-1. > Example Failure: > https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-4192) Optimize HLog for Throughput Using Delayed RPCs
[ https://issues.apache.org/jira/browse/HBASE-4192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080375#comment-14080375 ] Sean Busbey commented on HBASE-4192: Part 1 was handled via HBASE-4487. Is Part 2 OBE from the changes in HBASE-8755? > Optimize HLog for Throughput Using Delayed RPCs > --- > > Key: HBASE-4192 > URL: https://issues.apache.org/jira/browse/HBASE-4192 > Project: HBase > Issue Type: New Feature > Components: wal >Affects Versions: 0.92.0 >Reporter: Vlad Dogaru >Priority: Minor > > Introduce a new HLog configuration parameter (batchEntries) for more > aggressive batching of appends. If this is enabled, HLog appends are not > written to the HLog writer immediately, but batched and written either > periodically or when a sync is requested. Because sync times become larger, > they use delayed RPCs to free up RPC handler threads. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10577) Remove unnecessary looping in FSHLog
[ https://issues.apache.org/jira/browse/HBASE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080372#comment-14080372 ] Sean Busbey commented on HBASE-10577: - I think this ticket is meant to be closed? > Remove unnecessary looping in FSHLog > > > Key: HBASE-10577 > URL: https://issues.apache.org/jira/browse/HBASE-10577 > Project: HBase > Issue Type: Bug > Components: wal >Affects Versions: 0.99.0 >Reporter: Himanshu Vashishtha > > In the new disruptor-based FSHLog, the Syncer threads are handed a batch of > SyncFuture objects from the RingBufferHandler. The Syncer then invokes a sync > call on the current writer instance. > This handing off of batches is done serially in RingBufferHandler; that is, > every Syncer receives a non-overlapping batch of SyncFutures. Once synced, > the Syncer thread updates highestSyncedSequence. > In the run method of Syncer, we have: > {code} > long currentHighestSyncedSequence = highestSyncedSequence.get(); > if (currentSequence < currentHighestSyncedSequence) { > syncCount += releaseSyncFuture(takeSyncFuture, > currentHighestSyncedSequence, null); > // Done with the 'take'. Go around again and do a new 'take'. > continue; > } > {code} > I find this logic of polling the BlockingQueue again in this condition > unnecessary. When currentHighestSyncedSequence is already greater than > currentSequence, doesn't it mean some other Syncer has already synced the > SyncFutures of these ops? We should just go ahead and release all the > SyncFutures for this batch to unblock the handlers. That would avoid polling > the BlockingQueue for all SyncFuture objects in this case. -- This message was sent by Atlassian JIRA (v6.2#6252)
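The release-without-repolling idea in the HBASE-10577 description can be modeled with a toy queue of sequence numbers. This is a simplified stand-in, not the actual FSHLog code: futures whose sequence is strictly below the highest already-synced sequence can be released in one pass, without an extra sync or per-future take from the queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicLong;

public class SyncerSketch {
    // Stand-in for FSHLog's highestSyncedSequence.
    static final AtomicLong highestSyncedSequence = new AtomicLong();

    // Releases every pending future already covered by a prior sync
    // (strict '<', mirroring the quoted condition). Returns the count.
    static int releaseCovered(Queue<Long> pendingSequences) {
        int released = 0;
        long synced = highestSyncedSequence.get();
        while (!pendingSequences.isEmpty()
               && pendingSequences.peek() < synced) {
            pendingSequences.poll();  // another Syncer already covered it
            released++;
        }
        return released;
    }

    public static void main(String[] args) {
        highestSyncedSequence.set(5);
        Queue<Long> batch = new ArrayDeque<>();
        for (long seq = 3; seq <= 8; seq++) batch.add(seq);
        System.out.println(releaseCovered(batch)); // 3 and 4 are covered -> 2
        System.out.println(batch.size());          // 5..8 remain -> 4
    }
}
```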
[jira] [Commented] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file
[ https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080367#comment-14080367 ] Hadoop QA commented on HBASE-11591: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658788/HBASE-11591.patch against trunk revision . ATTACHMENT ID: 12658788 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction org.apache.hadoop.hbase.migration.TestNamespaceUpgrade org.apache.hadoop.hbase.regionserver.TestScannerWithBulkload org.apache.hadoop.hbase.master.TestMasterOperationsForRegionReplicas org.apache.hadoop.hbase.regionserver.TestRegionReplicas org.apache.hadoop.hbase.mapreduce.TestImportTSVWithVisibilityLabels org.apache.hadoop.hbase.client.TestReplicasClient org.apache.hadoop.hbase.master.TestRestartCluster org.apache.hadoop.hbase.TestIOFencing {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10236//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10236//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10236//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10236//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10236//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10236//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10236//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10236//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10236//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10236//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10236//console This message is automatically generated. 
> Scanner fails to retrieve KV from bulk loaded file with highest sequence id > than the cell's mvcc in a non-bulk loaded file > --- > > Key: HBASE-11591 > URL: https://issues.apache.org/jira/browse/HBASE-11591 > Project: HBase > Issue Type: Bug >Affects Versions: 0.99.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 0.99.0 > > Attachments: HBASE-11591.patch, TestBulkload.java > > > See discussion in HBASE-11339. > This covers the case where the same KVs are in two files, one produced by > flush/compaction and the other through bulk load. > Both files have some identical KVs that match even in timestamp. > Steps: > Add some rows with a specific timestamp and flush the same. > Bulk load a file with the same data. Ensure that the "assign seqnum" property > is set. > The bulk load should use HFileOutputFormat2 (or ensure that we write the > bulk_time_output key). > This would ensure that the bulk loaded file has the highest seq num. > Assume the ce
[jira] [Commented] (HBASE-11551) BucketCache$WriterThread.run() doesn't handle exceptions correctly
[ https://issues.apache.org/jira/browse/HBASE-11551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080363#comment-14080363 ] Hudson commented on HBASE-11551: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #403 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/403/]) HBASE-11551 BucketCache.run() doesn't handle exceptions correctly (Ted Yu) (tedyu: rev 76e89cb7fface4d91b8c62192832be581bf67a3b) * hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java > BucketCache$WriterThread.run() doesn't handle exceptions correctly > -- > > Key: HBASE-11551 > URL: https://issues.apache.org/jira/browse/HBASE-11551 > Project: HBase > Issue Type: Bug >Reporter: Ted Yu >Assignee: Ted Yu > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: 11551-v1.txt > > > Currently the catch is outside the while loop: > {code} > try { > while (cacheEnabled && writerEnabled) { > ... > } catch (Throwable t) { > LOG.warn("Failed doing drain", t); > } > {code} > When exception (e.g. BucketAllocatorException) is thrown, run() method would > terminate, silently. -- This message was sent by Atlassian JIRA (v6.2#6252)
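The shape of the HBASE-11551 fix, moving the catch inside the loop so one failed drain does not silently kill the writer thread, can be shown with a minimal stand-in. Names are illustrative, not the actual BucketCache$WriterThread code; the loop counter stands in for `while (cacheEnabled && writerEnabled)` and the injected exception stands in for BucketAllocatorException:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class WriterLoopSketch {
    // Runs 'iterations' simulated drain passes; one pass is made to fail.
    // Returns how many failures were caught. Before the fix, the catch sat
    // outside the loop, so the first failure ended the run() method.
    static int runDrains(int iterations, AtomicInteger drained) {
        int failures = 0;
        for (int i = 0; i < iterations; i++) {
            try {
                if (i == 1) {
                    throw new IllegalStateException("simulated allocator failure");
                }
                drained.incrementAndGet();  // successful drain
            } catch (Throwable t) {
                failures++;  // log and keep going; the thread survives
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        AtomicInteger drained = new AtomicInteger();
        System.out.println(runDrains(5, drained)); // 1 failure survived
        System.out.println(drained.get());         // 4 successful drains
    }
}
```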
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080362#comment-14080362 ] Hudson commented on HBASE-11558: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #403 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/403/]) HBASE-11558 Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ (Ishan Chhabra) (ndimiduk: rev 61de4e47835f98dd7d2cec92bf33641c9de072a8) * hbase-protocol/src/main/protobuf/Client.proto * hbase-server/src/test/java/org/apache/hadoop/hbase/protobuf/TestProtobufUtil.java * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java * hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch, HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. 
However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11516) Track time spent in executing coprocessors in each region.
[ https://issues.apache.org/jira/browse/HBASE-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080354#comment-14080354 ] Andrew Purtell commented on HBASE-11516: Thank you very much for the patch and for being receptive to feedback [~srikanth235] > Track time spent in executing coprocessors in each region. > -- > > Key: HBASE-11516 > URL: https://issues.apache.org/jira/browse/HBASE-11516 > Project: HBase > Issue Type: Improvement > Components: Coprocessors >Affects Versions: 0.98.4 >Reporter: Srikanth Srungarapu >Assignee: Srikanth Srungarapu >Priority: Minor > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: HBASE-11516.patch, HBASE-11516_v2.patch, > HBASE-11516_v3.patch, HBASE-11516_v4.patch, region_server_webui.png, > rs_web_ui_v2.png > > > Currently, the time spent in executing coprocessors is not yet being tracked. > This feature can be handy for debugging coprocessors in case of any trouble. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080352#comment-14080352 ] Nick Dimiduk commented on HBASE-11558: -- Thanks, looks good. Go ahead and open the ticket. Should be deprecated on 0.98 and branch-1, removed from master. In addition to the deprecation annotations on Java methods, we should WARN when the config is used. > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch, HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. 
Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
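The deprecation approach Nick describes (keep the old config working, annotate the Java method `@Deprecated`, and WARN when the config key is used) can be sketched in plain Java. This is a hypothetical, self-contained illustration of the pattern, not the actual TableMapReduceUtil code; the method name `readDeprecatedCaching` and the `-1` "unset" sentinel are assumptions made for the example.

```java
import java.util.Map;
import java.util.logging.Logger;

public class DeprecatedConfigWarning {
    private static final Logger LOG = Logger.getLogger("TableMapReduceUtil");
    static final String DEPRECATED_KEY = "hbase.client.scanner.caching";

    /**
     * Reads the legacy config key, logging a WARN so users migrate to
     * Scan.setCaching(int). Returns -1 when the key is unset.
     *
     * @deprecated set caching on the Scan object instead
     */
    @Deprecated
    static int readDeprecatedCaching(Map<String, String> conf) {
        String v = conf.get(DEPRECATED_KEY);
        if (v == null) {
            return -1; // unset: caller falls through to other sources
        }
        LOG.warning(DEPRECATED_KEY
            + " is deprecated; set caching on the Scan object instead");
        return Integer.parseInt(v);
    }

    public static void main(String[] args) {
        // Old-style job conf still works, but emits a WARN on read.
        System.out.println(readDeprecatedCaching(
            Map.of("hbase.client.scanner.caching", "200"))); // prints 200
        System.out.println(readDeprecatedCaching(Map.of())); // prints -1
    }
}
```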
[jira] [Commented] (HBASE-11516) Track time spent in executing coprocessors in each region.
[ https://issues.apache.org/jira/browse/HBASE-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080353#comment-14080353 ] Srikanth Srungarapu commented on HBASE-11516: - [~andrew.purt...@gmail.com] It would be great if you can fix it while committing :). Thanks so much for taking keen interest in this jira. > Track time spent in executing coprocessors in each region. > -- > > Key: HBASE-11516 > URL: https://issues.apache.org/jira/browse/HBASE-11516 > Project: HBase > Issue Type: Improvement > Components: Coprocessors >Affects Versions: 0.98.4 >Reporter: Srikanth Srungarapu >Assignee: Srikanth Srungarapu >Priority: Minor > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: HBASE-11516.patch, HBASE-11516_v2.patch, > HBASE-11516_v3.patch, HBASE-11516_v4.patch, region_server_webui.png, > rs_web_ui_v2.png > > > Currently, the time spent in executing coprocessors is not yet being tracked. > This feature can be handy for debugging coprocessors in case of any trouble. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11516) Track time spent in executing coprocessors in each region.
[ https://issues.apache.org/jira/browse/HBASE-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080347#comment-14080347 ] Andrew Purtell commented on HBASE-11516: +1 v4 patch looks good. There is a minor spelling error "Stastics" that would surface in metrics reports but I will fix that on commit. Or please feel free to upload a v5 patch with the fix. Ping [~enis] for branch-1. > Track time spent in executing coprocessors in each region. > -- > > Key: HBASE-11516 > URL: https://issues.apache.org/jira/browse/HBASE-11516 > Project: HBase > Issue Type: Improvement > Components: Coprocessors >Affects Versions: 0.98.4 >Reporter: Srikanth Srungarapu >Assignee: Srikanth Srungarapu >Priority: Minor > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: HBASE-11516.patch, HBASE-11516_v2.patch, > HBASE-11516_v3.patch, HBASE-11516_v4.patch, region_server_webui.png, > rs_web_ui_v2.png > > > Currently, the time spent in executing coprocessors is not yet being tracked. > This feature can be handy for debugging coprocessors in case of any trouble. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11616) TestNamespaceUpgrade fails in trunk
[ https://issues.apache.org/jira/browse/HBASE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-11616: Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) > TestNamespaceUpgrade fails in trunk > --- > > Key: HBASE-11616 > URL: https://issues.apache.org/jira/browse/HBASE-11616 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Jimmy Xiang > Fix For: 2.0.0 > > Attachments: hbase-11616.patch > > > I see the following in test output: > {code} > <failure type="org.apache.hadoop.hbase.client.RetriesExhaustedException">org.apache.hadoop.hbase.client.RetriesExhaustedException: > Can't get the location > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:287) > at > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:132) > at > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:1) > at > org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:179) > at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:287) > at > org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267) > at > org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139) > at > org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134) > at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:814) > at > org.apache.hadoop.hbase.migration.TestNamespaceUpgrade.setUpBeforeClass(TestNamespaceUpgrade.java:147) > ... > Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: No > server address listed in hbase:meta for region hbase:acl,,1376029204842. > 06dfcfc239196403c5f1135b91dedc64. 
containing row > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1233) > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1099) > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:279) > ... 31 more > {code} > The cause for the above error is that the _acl_ table contained in the image > (w.r.t. hbase:meta table) doesn't have server address. > [~jxiang]: What do you think would be proper fix ? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11621) Make MiniDFSCluster run faster
[ https://issues.apache.org/jira/browse/HBASE-11621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080332#comment-14080332 ] Hadoop QA commented on HBASE-11621: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658757/11621-v1.txt against trunk revision . ATTACHMENT ID: 12658757 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.TestIOFencing org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction org.apache.hadoop.hbase.regionserver.TestRegionReplicas org.apache.hadoop.hbase.master.TestMasterOperationsForRegionReplicas org.apache.hadoop.hbase.migration.TestNamespaceUpgrade org.apache.hadoop.hbase.client.TestReplicasClient org.apache.hadoop.hbase.master.TestRestartCluster {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10235//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10235//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10235//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10235//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10235//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10235//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10235//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10235//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10235//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10235//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10235//console This message is automatically generated. 
> Make MiniDFSCluster run faster > -- > > Key: HBASE-11621 > URL: https://issues.apache.org/jira/browse/HBASE-11621 > Project: HBase > Issue Type: Task >Reporter: Ted Yu > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: 11621-v1.txt > > > Daryn proposed the following change in HDFS-6773: > {code} > EditLogFileOutputStream.setShouldSkipFsyncForTesting(true); > {code} > With this change in HBaseTestingUtility#startMiniDFSCluster(), runtime for > TestAdmin went from 8:35 min to 7 min -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11621) Make MiniDFSCluster run faster
[ https://issues.apache.org/jira/browse/HBASE-11621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080334#comment-14080334 ] Andrew Purtell commented on HBASE-11621: +1 for 0.98. Ping [~enis] for branch-1 > Make MiniDFSCluster run faster > -- > > Key: HBASE-11621 > URL: https://issues.apache.org/jira/browse/HBASE-11621 > Project: HBase > Issue Type: Task >Reporter: Ted Yu > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: 11621-v1.txt > > > Daryn proposed the following change in HDFS-6773: > {code} > EditLogFileOutputStream.setShouldSkipFsyncForTesting(true); > {code} > With this change in HBaseTestingUtility#startMiniDFSCluster(), runtime for > TestAdmin went from 8:35 min to 7 min -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11621) Make MiniDFSCluster run faster
[ https://issues.apache.org/jira/browse/HBASE-11621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-11621: --- Fix Version/s: 2.0.0 0.98.5 0.99.0 Assignee: Ted Yu > Make MiniDFSCluster run faster > -- > > Key: HBASE-11621 > URL: https://issues.apache.org/jira/browse/HBASE-11621 > Project: HBase > Issue Type: Task >Reporter: Ted Yu >Assignee: Ted Yu > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: 11621-v1.txt > > > Daryn proposed the following change in HDFS-6773: > {code} > EditLogFileOutputStream.setShouldSkipFsyncForTesting(true); > {code} > With this change in HBaseTestingUtility#startMiniDFSCluster(), runtime for > TestAdmin went from 8:35 min to 7 min -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11621) Make MiniDFSCluster run faster
[ https://issues.apache.org/jira/browse/HBASE-11621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080336#comment-14080336 ] Ted Yu commented on HBASE-11621: Here was 0.98 test suite with patch - on the same host as the previous run: {code} [INFO] HBase . SUCCESS [1.685s] [INFO] HBase - Common SUCCESS [20.706s] [INFO] HBase - Protocol .. SUCCESS [0.319s] [INFO] HBase - Client SUCCESS [36.487s] [INFO] HBase - Hadoop Compatibility .. SUCCESS [5.156s] [INFO] HBase - Hadoop Two Compatibility .. SUCCESS [1.628s] [INFO] HBase - Prefix Tree ... SUCCESS [2.665s] [INFO] HBase - Server SUCCESS [1:06:39.770s] [INFO] HBase - Testing Util .. SUCCESS [1.812s] [INFO] HBase - Thrift SUCCESS [2:07.423s] [INFO] HBase - Shell . SUCCESS [2:34.072s] [INFO] HBase - Integration Tests . SUCCESS [0.920s] [INFO] HBase - Examples .. SUCCESS [5.471s] [INFO] HBase - Assembly .. SUCCESS [0.837s] [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 1:12:39.613s [INFO] Finished at: Thu Jul 31 00:48:36 UTC 2014 {code} With patch, hbase-server module went from 66 min to 51 min. > Make MiniDFSCluster run faster > -- > > Key: HBASE-11621 > URL: https://issues.apache.org/jira/browse/HBASE-11621 > Project: HBase > Issue Type: Task >Reporter: Ted Yu >Assignee: Ted Yu > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: 11621-v1.txt > > > Daryn proposed the following change in HDFS-6773: > {code} > EditLogFileOutputStream.setShouldSkipFsyncForTesting(true); > {code} > With this change in HBaseTestingUtility#startMiniDFSCluster(), runtime for > TestAdmin went from 8:35 min to 7 min -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HBASE-11620) Propagate decoder exception to HLogSplitter so that loss of data is avoided
[ https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080325#comment-14080325 ] Andrew Purtell edited comment on HBASE-11620 at 7/31/14 1:01 AM: - Let's not forget the original sin of changing the WAL reader+writer implementation classes after a crash and before a restart. That cannot and should not be acceptable practice. [~tedyu], this could work if you can come up with a way for a codec to reliably tell the difference between EOF and corruption for some other reason. Just propagating EOF to the splitter seems contrary to current expected behavior. Edit: Or, deal with the actual user action here and consider a new optional field in the pbuf WAL header that carries the name of the class that wrote it. A reader can check if it can handle the output of that writer when the header is being read. The error at that point would be unambiguous. was (Author: apurtell): Let's not forget the original sin of changing the WAL reader+writer implementation classes after a crash and before a restart. That cannot and should not be acceptable practice. [~tedyu], this could work if you can come up with a way for a codec to reliably tell the difference between EOF and corruption for some other reason. Just propagating EOF to the splitter seems contrary to current expected behavior. > Propagate decoder exception to HLogSplitter so that loss of data is avoided > --- > > Key: HBASE-11620 > URL: https://issues.apache.org/jira/browse/HBASE-11620 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.4 >Reporter: Ted Yu >Priority: Critical > Attachments: 11620-v1.txt > > > Reported by Kiran in this thread: "HBase file encryption, inconsistencies > observed and data loss" > After step 4 ( i.e disabling of WAL encryption, removing > SecureProtobufReader/Writer and restart), read of encrypted WAL fails mainly > due to EOF exception at Basedecoder. 
This is not considered as error and > these WAL are being moved to /oldWALs. > Following is observed in log files: > {code} > 2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Splitting hlog: > hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017, > length=172 > 2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: DistributedLogReplay = false > 2014-07-30 19:44:29,313 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > util.FSHDFSUtils: Recovering lease on dfs file > hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 > 2014-07-30 19:44:29,315 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > util.FSHDFSUtils: recoverLease=true, attempt=0 on > file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 > after 1ms > 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0] > wal.HLogSplitter: Writer thread > Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0,5,main]: starting > 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1] > wal.HLogSplitter: Writer thread > Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1,5,main]: starting > 2014-07-30 19:44:29,430 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2] > wal.HLogSplitter: Writer thread > Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2,5,main]: starting > 2014-07-30 19:44:29,591 ERROR [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > codec.BaseDecoder: Partial cell read caused by EOF: java.io.IOException: > Premature EOF from inputStream > 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Finishing writing output logs and closing down. 
> 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Waiting for split writer threads to finish > 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Split writers finished > 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Processed 0 edits across 0 regions; log > file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 > is corrupted = false progress failed = false > {code} > To fix this, we need to propagate EOF exception to HLogSplitter. Any > suggestions on the fix? > (end of quote from Kiran) > In BaseDecoder#rethrowEofException() : > {code} > if (!isEof) throw ioEx; > LOG.error("Partial cell read caused by EOF: " + ioEx); > EOFException eofEx = new EOFException("Partial cell read"); > eofEx.initCause(ioEx); > throw eofEx; > {code} > throwing EOFException would not propagate the "Partial cell read" condition > to HLogSplitter which doesn't treat EOFException as an error. > I think IOException should be thrown above - HLogSplitter#getNextLogLine() > would translate the IOEx to CorruptedLogFileException. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11620) Propagate decoder exception to HLogSplitter so that loss of data is avoided
[ https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080325#comment-14080325 ] Andrew Purtell commented on HBASE-11620: Let's not forget the original sin of changing the WAL reader+writer implementation classes after a crash and before a restart. That cannot and should not be acceptable practice. [~tedyu], this could work if you can come up with a way for a codec to reliably tell the difference between EOF and corruption for some other reason. Just propagating EOF to the splitter seems contrary to current expected behavior. > Propagate decoder exception to HLogSplitter so that loss of data is avoided > --- > > Key: HBASE-11620 > URL: https://issues.apache.org/jira/browse/HBASE-11620 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.4 >Reporter: Ted Yu >Priority: Critical > Attachments: 11620-v1.txt > > > Reported by Kiran in this thread: "HBase file encryption, inconsistencies > observed and data loss" > After step 4 ( i.e disabling of WAL encryption, removing > SecureProtobufReader/Writer and restart), read of encrypted WAL fails mainly > due to EOF exception at Basedecoder. This is not considered as error and > these WAL are being moved to /oldWALs. 
> Following is observed in log files: > {code} > 2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Splitting hlog: > hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017, > length=172 > 2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: DistributedLogReplay = false > 2014-07-30 19:44:29,313 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > util.FSHDFSUtils: Recovering lease on dfs file > hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 > 2014-07-30 19:44:29,315 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > util.FSHDFSUtils: recoverLease=true, attempt=0 on > file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 > after 1ms > 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0] > wal.HLogSplitter: Writer thread > Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0,5,main]: starting > 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1] > wal.HLogSplitter: Writer thread > Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1,5,main]: starting > 2014-07-30 19:44:29,430 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2] > wal.HLogSplitter: Writer thread > Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2,5,main]: starting > 2014-07-30 19:44:29,591 ERROR [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > codec.BaseDecoder: Partial cell read caused by EOF: java.io.IOException: > Premature EOF from inputStream > 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Finishing writing output logs and closing down. 
> 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Waiting for split writer threads to finish > 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Split writers finished > 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Processed 0 edits across 0 regions; log > file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 > is corrupted = false progress failed = false > {code} > To fix this, we need to propagate EOF exception to HLogSplitter. Any > suggestions on the fix? > (end of quote from Kiran) > In BaseDecoder#rethrowEofException() : > {code} > if (!isEof) throw ioEx; > LOG.error("Partial cell read caused by EOF: " + ioEx); > EOFException eofEx = new EOFException("Partial cell read"); > eofEx.initCause(ioEx); > throw eofEx; > {code} > throwing EOFException would not propagate the "Partial cell read" condition > to HLogSplitter which doesn't treat EOFException as an error. > I think IOException should be thrown above - HLogSplitter#getNextLogLine() > would translate the IOEx to CorruptedLogFileException. -- This message was sent by Atlassian JIRA (v6.2#6252)
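The masking effect under discussion can be shown with a small self-contained sketch (plain Java with hypothetical names, not the actual BaseDecoder/HLogSplitter code): when the decoder wraps the underlying failure in an EOFException, a caller that treats EOFException as a clean end-of-log reports the file as clean, whereas the bare IOException would surface as corruption.

```java
import java.io.EOFException;
import java.io.IOException;

public class EofMasking {
    // Mirrors the rethrowEofException snippet quoted above: the underlying
    // IOException is wrapped in an EOFException with initCause.
    static void rethrowAsEof(IOException ioEx) throws IOException {
        EOFException eofEx = new EOFException("Partial cell read");
        eofEx.initCause(ioEx);
        throw eofEx;
    }

    // Hypothetical stand-in for the splitter's error handling: EOFException
    // is a normal end of file, anything else is treated as corruption.
    static String classify(IOException ex) {
        if (ex instanceof EOFException) {
            return "clean-eof";   // file moved to /oldWALs; edits silently lost
        }
        return "corrupted";       // would become CorruptedLogFileException
    }

    public static void main(String[] args) {
        IOException underlying = new IOException("Premature EOF from inputStream");
        try {
            rethrowAsEof(underlying);
        } catch (IOException ex) {
            System.out.println(classify(ex));     // prints "clean-eof"
        }
        System.out.println(classify(underlying)); // prints "corrupted"
    }
}
```

This is why the thread argues for throwing the plain IOException: the same failure then lands in the "corrupted" branch instead of being mistaken for a clean end of log.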
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080324#comment-14080324 ] Hudson commented on HBASE-11558: FAILURE: Integrated in hbase-0.96 #414 (See [https://builds.apache.org/job/hbase-0.96/414/]) HBASE-11558 Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ (Ishan Chhabra) (ndimiduk: rev efdbe072ef7e910259360bfb01bc4200eab86a4f) * hbase-server/src/test/java/org/apache/hadoop/hbase/protobuf/TestProtobufUtil.java * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java * hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * hbase-protocol/src/main/protobuf/Client.proto > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch, HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. 
This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11567) Write bulk load events to WAL
[ https://issues.apache.org/jira/browse/HBASE-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080314#comment-14080314 ] Alex Newman commented on HBASE-11567: - OK here is my first pass at it. I talked to stack offline about the jmock changes. I actually think we should break this up into three commits 1. JMock changes to pom 2. Protobuf changes (and generated files) 3. The actual changes But I'll do it as one patch first. > Write bulk load events to WAL > - > > Key: HBASE-11567 > URL: https://issues.apache.org/jira/browse/HBASE-11567 > Project: HBase > Issue Type: Sub-task >Reporter: Enis Soztutar >Assignee: Alex Newman > Attachments: HBASE-11567-v1.patch > > > Similar to writing flush (HBASE-11511), compaction(HBASE-2231) to WAL and > region open/close (HBASE-11512) , we should persist bulk load events to WAL. > This is especially important for secondary region replicas, since we can use > this information to pick up primary regions' files from secondary replicas. > A design doc for secondary replica replication can be found at HBASE-11183. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11567) Write bulk load events to WAL
[ https://issues.apache.org/jira/browse/HBASE-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Newman updated HBASE-11567: Attachment: HBASE-11567-v1.patch > Write bulk load events to WAL > - > > Key: HBASE-11567 > URL: https://issues.apache.org/jira/browse/HBASE-11567 > Project: HBase > Issue Type: Sub-task >Reporter: Enis Soztutar >Assignee: Alex Newman > Attachments: HBASE-11567-v1.patch > > > Similar to writing flush (HBASE-11511), compaction(HBASE-2231) to WAL and > region open/close (HBASE-11512) , we should persist bulk load events to WAL. > This is especially important for secondary region replicas, since we can use > this information to pick up primary regions' files from secondary replicas. > A design doc for secondary replica replication can be found at HBASE-11183. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11567) Write bulk load events to WAL
[ https://issues.apache.org/jira/browse/HBASE-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Newman updated HBASE-11567: Status: Patch Available (was: Open) > Write bulk load events to WAL > - > > Key: HBASE-11567 > URL: https://issues.apache.org/jira/browse/HBASE-11567 > Project: HBase > Issue Type: Sub-task >Reporter: Enis Soztutar >Assignee: Alex Newman > Attachments: HBASE-11567-v1.patch > > > Similar to writing flush (HBASE-11511), compaction(HBASE-2231) to WAL and > region open/close (HBASE-11512) , we should persist bulk load events to WAL. > This is especially important for secondary region replicas, since we can use > this information to pick up primary regions' files from secondary replicas. > A design doc for secondary replica replication can be found at HBASE-11183. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-11558: -- Release Note: TableMapReduceUtil now restores the option to set scanner caching by setting it on the scanner object. The priority order for choosing the scanner caching is as follows: 1. Caching set on the scan object. 2. Caching specified via the config "hbase.client.scanner.caching", which can either be set manually on the conf or via the helper method TableMapReduceUtil.setScannerCaching(). 3. The default value HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING, which is set to 100 currently. > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch, HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. 
> There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
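The three-level priority order described in the release note above can be sketched as follows. This is a simplified illustration, not the actual TableMapReduceUtil code; the method resolveCaching and its parameters are hypothetical, with only the default of 100 taken from the note.

```java
// Sketch of the scanner-caching priority order from the release note.
// Assumption: caching <= 0 on the scan stands for "not set", and a null
// conf value stands for "hbase.client.scanner.caching" being absent.
public class CachingPriority {
    // The note says HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING is 100.
    static final int DEFAULT_HBASE_CLIENT_SCANNER_CACHING = 100;

    static int resolveCaching(int scanCaching, Integer confCaching) {
        if (scanCaching > 0) {
            return scanCaching;   // 1. caching set on the Scan object wins
        }
        if (confCaching != null) {
            return confCaching;   // 2. value from the configuration
        }
        return DEFAULT_HBASE_CLIENT_SCANNER_CACHING; // 3. hard default
    }

    public static void main(String[] args) {
        System.out.println(resolveCaching(500, 200)); // scan object wins: 500
        System.out.println(resolveCaching(0, 200));   // conf wins: 200
        System.out.println(resolveCaching(0, null));  // default: 100
    }
}
```

The same ordering explains option 2 in the issue description: if the scan object cannot carry caching, the conf-level setter becomes the only per-job knob above the default.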
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080304#comment-14080304 ] Ishan Chhabra commented on HBASE-11558: --- Updated release notes. It makes sense to remove the second method. Do you propose to delete the method or mark it as deprecated for now? Which branches should get this patch? I can open a separate JIRA and put in the patch there once the answers are clear. > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch, HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. 
Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-11558: -- Release Note: TableMapReduceUtil now restores the option to set scanner caching by setting it on the Scan object that is passed in. The priority order for choosing the scanner caching is as follows: 1. Caching set on the scan object. 2. Caching specified via the config "hbase.client.scanner.caching", which can either be set manually on the conf or via the helper method TableMapReduceUtil.setScannerCaching(). 3. The default value HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING, which is set to 100 currently. was: TableMapReduceUtil now restores the option to set scanner caching by setting it on the scanner object. The priority order for choosing the scanner caching is as follows: 1. Caching set on the scan object. 2. Caching specified via the config "hbase.client.scanner.caching", which can either be set manually on the conf or via the helper method TableMapReduceUtil.setScannerCaching(). 3. The default value HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING, which is set to 100 currently. > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch, HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. 
This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11609) LoadIncrementalHFiles fails if the namespace is specified
[ https://issues.apache.org/jira/browse/HBASE-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080270#comment-14080270 ] Jonathan Hsieh commented on HBASE-11609: It isn't obvious why it took us this long to catch this bug. Reviewing the code, it is because it only affects the case when we have a bulk load of data that spans multiple regions (e.g. there was a split) and the temp names would then include the illegal ':' chars. Could we add comments in the code about this subtlety? > LoadIncrementalHFiles fails if the namespace is specified > - > > Key: HBASE-11609 > URL: https://issues.apache.org/jira/browse/HBASE-11609 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 1.0.0, 0.98.4, 2.0.0 >Reporter: Matteo Bertozzi >Assignee: Matteo Bertozzi > Fix For: 1.0.0, 0.98.5, 2.0.0 > > Attachments: HBASE-11609-v0.patch, HBASE-11609-v1.patch > > > from Jianshi Huang on the user list > trying to bulk load a table in a namespace, like: > $ hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles test/ > foo:testtb > we get an exception > {code} > 2014-07-29 19:59:53,373 ERROR [main] mapreduce.LoadIncrementalHFiles: > Unexpected execution exception during splitting > java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: > java.net.URISyntaxException: Relative path in absolute URI: > foo:testtb,1.bottom > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:188) > at > org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplitPhase(LoadIncrementalHFiles.java:449) > at > org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:304) > at > org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.run(LoadIncrementalHFiles.java:899) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > {code} > The 
problem is related to the ':' symbol going into the file path. The simple > fix is to replace the current LoadIncrementalHFiles.getUniqueName() -- This message was sent by Atlassian JIRA (v6.2#6252)
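The failure above comes from the namespace separator ':' being used verbatim in a temporary file name ("foo:testtb,1.bottom"), which is illegal in an HDFS path. A minimal sketch of the idea behind such a fix, replacing the separator when deriving the unique name; the helper below is purely illustrative and is not the actual getUniqueName() implementation from the patch:

```java
// Sketch: deriving a filesystem-safe temp name for a split bulk-load item.
// Hypothetical helper; the real fix in LoadIncrementalHFiles may differ.
public class UniqueNameSketch {
    // ':' is valid in a namespaced table name ("foo:testtb") but not in an
    // HDFS path component, so it must not leak into the temp file name.
    static String uniqueName(String tableName, int counter, String suffix) {
        return tableName.replace(':', '_') + "," + counter + "." + suffix;
    }

    public static void main(String[] args) {
        // foo_testtb,1.bottom : no URISyntaxException when used as a Path
        System.out.println(uniqueName("foo:testtb", 1, "bottom"));
    }
}
```

This also matches Jonathan's observation: the temp names only exist when a load spans multiple regions and must be split, which is why plain single-region loads never hit the bug.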
[jira] [Commented] (HBASE-11617) incorrect AgeOfLastAppliedOp and AgeOfLastShippedOp in replication Metrics when no new replication OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080269#comment-14080269 ] Demai Ni commented on HBASE-11617: -- I don't think the failed test cases are related to this patch. The same failures also show up in other JIRAs from recent testing. > incorrect AgeOfLastAppliedOp and AgeOfLastShippedOp in replication Metrics > when no new replication OP > -- > > Key: HBASE-11617 > URL: https://issues.apache.org/jira/browse/HBASE-11617 > Project: HBase > Issue Type: Bug > Components: Replication >Affects Versions: 0.98.2 >Reporter: Demai Ni >Assignee: Demai Ni >Priority: Minor > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: HBASE-11617-master-v1.patch > > > AgeOfLastAppliedOp in MetricsSink.java is meant to indicate the time an edit sat in > the 'replication queue' before it got replicated (aka applied) > {code} > /** >* Set the age of the last applied operation >* >* @param timestamp The timestamp of the last operation applied. >* @return the age that was set >*/ > public long setAgeOfLastAppliedOp(long timestamp) { > lastTimestampForAge = timestamp; > long age = System.currentTimeMillis() - lastTimestampForAge; > rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); > return age; > } > {code} > In the following scenario: > 1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is > set to, for example, 100ms; > 2) and then NO new sink op occurs. > 3) when refreshAgeOfLastAppliedOp() is invoked at 8:00am, instead of > returning the 100ms, the AgeOfLastAppliedOp becomes 1 hour + 100ms. > This is because refreshAgeOfLastAppliedOp() gets invoked periodically by > getStats(). 
> proposed fix: > {code} > --- > hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java > +++ > hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java > @@ -35,6 +35,7 @@ public class MetricsSink { > >private MetricsReplicationSource rms; >private long lastTimestampForAge = System.currentTimeMillis(); > + private long age = 0; > >public MetricsSink() { > rms = > CompatibilitySingletonFactory.getInstance(MetricsReplicationSource.class); > @@ -47,8 +48,12 @@ public class MetricsSink { > * @return the age that was set > */ >public long setAgeOfLastAppliedOp(long timestamp) { > -lastTimestampForAge = timestamp; > -long age = System.currentTimeMillis() - lastTimestampForAge; > +if (lastTimestampForAge != timestamp) { > + lastTimestampForAge = timestamp; > + this.age = System.currentTimeMillis() - lastTimestampForAge; > +} else { > + this.age = 0; > +} > rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); > return age; >} > {code} > detail discussion in [dev@hbase | > http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E > ] -- This message was sent by Atlassian JIRA (v6.2#6252)
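The intent of the proposed diff can be demonstrated with a small stand-in class: a refresh that passes the same timestamp as the last applied op should report an age of 0 instead of growing with wall-clock time. This is a simplified sketch for illustration (no metrics gauge, and the current time is passed in explicitly so the behavior is deterministic), not the real MetricsSink:

```java
// Stand-in for the patched MetricsSink age logic described above.
public class AgeSketch {
    private long lastTimestampForAge = System.currentTimeMillis();
    private long age = 0;

    long setAgeOfLastAppliedOp(long timestamp, long now) {
        if (lastTimestampForAge != timestamp) {
            // A genuinely new op: record its timestamp and compute a real age.
            lastTimestampForAge = timestamp;
            age = now - lastTimestampForAge;
        } else {
            // Periodic refresh with no new op: do not inflate the age.
            age = 0;
        }
        return age;
    }

    public static void main(String[] args) {
        AgeSketch m = new AgeSketch();
        // 7:00am op, applied 100ms after its timestamp: age 100
        System.out.println(m.setAgeOfLastAppliedOp(1000L, 1100L));
        // 8:00am refresh with the same timestamp: age 0, not 1h + 100ms
        System.out.println(m.setAgeOfLastAppliedOp(1000L, 3_601_100L));
    }
}
```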
[jira] [Updated] (HBASE-11438) [Visibility Controller] Support UTF8 character as Visibility Labels
[ https://issues.apache.org/jira/browse/HBASE-11438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-11438: --- Status: Patch Available (was: Open) > [Visibility Controller] Support UTF8 character as Visibility Labels > --- > > Key: HBASE-11438 > URL: https://issues.apache.org/jira/browse/HBASE-11438 > Project: HBase > Issue Type: Improvement > Components: security >Affects Versions: 0.98.4 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan > Fix For: 0.98.5 > > Attachments: HBASE-11438_v1.patch, HBASE-11438_v2.patch, > HBASE-11438_v3.patch, HBASE-11438_v4.patch, HBASE-11438_v5.patch > > > This would be an action item that we would be addressing so that the > visibility labels could have UTF8 characters in them. Also allow the user to > use a client supplied API that allows to specify the visibility labels inside > double quotes such that UTF8 characters and cases like &, |, ! and double > quotes itself could be specified with proper escape sequence. Accumulo too > provides one such API in the client side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file
[ https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-11591: --- Attachment: HBASE-11591.patch Attaching a patch to get feedback. Checking on some more corner cases. > Scanner fails to retrieve KV from bulk loaded file with highest sequence id > than the cell's mvcc in a non-bulk loaded file > --- > > Key: HBASE-11591 > URL: https://issues.apache.org/jira/browse/HBASE-11591 > Project: HBase > Issue Type: Bug >Affects Versions: 0.99.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 0.99.0 > > Attachments: HBASE-11591.patch, TestBulkload.java > > > See discussion in HBASE-11339. > This is the case where the same KVs are in two files, one produced by > flush/compaction and the other through bulk load. > Both files contain some of the same KVs, matching even in timestamp. > Steps: > Add some rows with a specific timestamp and flush the same. > Bulk load a file with the same data. Ensure that the "assign seqnum" property is > set. > The bulk load should use HFileOutputFormat2 (or ensure that we write the > bulk_time_output key). > This would ensure that the bulk loaded file has the highest seq num. > Assume the cell in the flushed/compacted store file is > row1,cf,cq,ts1,value1 and the cell in the bulk loaded file is > row1,cf,cq,ts1,value2 > (There are no parallel scans). > Issue a scan on the table in 0.96. The retrieved value is > row1,cf1,cq,ts1,value2 > But the same in 0.98 will retrieve row1,cf1,cq,ts1,value1. > This is a behaviour change. This is because of this code > {code} > public int compare(KeyValueScanner left, KeyValueScanner right) { > int comparison = compare(left.peek(), right.peek()); > if (comparison != 0) { > return comparison; > } else { > // Since both the keys are exactly the same, we break the tie in favor > // of the key which came latest. 
> long leftSequenceID = left.getSequenceID(); > long rightSequenceID = right.getSequenceID(); > if (leftSequenceID > rightSequenceID) { > return -1; > } else if (leftSequenceID < rightSequenceID) { > return 1; > } else { > return 0; > } > } > } > {code} > Here in the 0.96 case, the mvcc of the cell in both files is 0, so > the comparison falls through to the else condition, where the seq id of the > bulk loaded file is greater and sorts first, ensuring that the scan > happens from that bulk loaded file. > In the 0.98+ case, as we are retaining mvcc+seqid, we are not resetting the > mvcc to 0 (it remains a non-zero positive value). Hence compare() sorts the > cell in the flushed/compacted file first. This means that though we know the > latest file is the bulk loaded file, we don't scan its data. > Seems to be a behaviour change. Will check on other corner cases also, but we > are trying to understand the behaviour of bulk load because we are evaluating if it > can be used for the MOB design. -- This message was sent by Atlassian JIRA (v6.2#6252)
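The behaviour difference described above can be condensed into a toy decision function: when both cells carry mvcc 0 (the 0.96 case), the tie-break on file sequence id picks the bulk loaded file, but when the flushed cell retains a non-zero mvcc (0.98+), the key comparison itself decides and the bulk loaded file never gets a chance. Everything below is an illustrative simplification of the quoted comparator, not HBase code:

```java
// Toy model of the scanner ordering described above. Returns the sequence
// id of the file whose cell the scan would see first.
public class MvccTieSketch {
    static long winningFileSeqId(long leftMvcc, long rightMvcc,
                                 long leftSeqId, long rightSeqId) {
        if (leftMvcc != rightMvcc) {
            // Keys differ by mvcc: higher mvcc sorts first (the 0.98+ path).
            return leftMvcc > rightMvcc ? leftSeqId : rightSeqId;
        }
        // Keys tie (both mvcc 0, as in 0.96): higher file sequence id wins.
        return leftSeqId > rightSeqId ? leftSeqId : rightSeqId;
    }

    public static void main(String[] args) {
        // 0.96: both cells have mvcc 0, so the bulk loaded file (seqid 5) wins
        System.out.println(winningFileSeqId(0, 0, 3, 5)); // 5
        // 0.98+: flushed cell kept mvcc 7, so its file (seqid 3) wins instead
        System.out.println(winningFileSeqId(7, 0, 3, 5)); // 3
    }
}
```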
[jira] [Updated] (HBASE-11438) [Visibility Controller] Support UTF8 character as Visibility Labels
[ https://issues.apache.org/jira/browse/HBASE-11438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-11438: --- Status: Open (was: Patch Available) > [Visibility Controller] Support UTF8 character as Visibility Labels > --- > > Key: HBASE-11438 > URL: https://issues.apache.org/jira/browse/HBASE-11438 > Project: HBase > Issue Type: Improvement > Components: security >Affects Versions: 0.98.4 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan > Fix For: 0.98.5 > > Attachments: HBASE-11438_v1.patch, HBASE-11438_v2.patch, > HBASE-11438_v3.patch, HBASE-11438_v4.patch, HBASE-11438_v5.patch > > > This would be an action item that we would be addressing so that the > visibility labels could have UTF8 characters in them. Also allow the user to > use a client supplied API that allows to specify the visibility labels inside > double quotes such that UTF8 characters and cases like &, |, ! and double > quotes itself could be specified with proper escape sequence. Accumulo too > provides one such API in the client side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11620) Propagate decoder exception to HLogSplitter so that loss of data is avoided
[ https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080229#comment-14080229 ] ramkrishna.s.vasudevan commented on HBASE-11620: bq.I think a new exception type (DecoderException e.g.) should be used above. No it cannot be. I think that EOF case was added for some specific cases where there is a real file with no entry. > Propagate decoder exception to HLogSplitter so that loss of data is avoided > --- > > Key: HBASE-11620 > URL: https://issues.apache.org/jira/browse/HBASE-11620 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.4 >Reporter: Ted Yu >Priority: Critical > Attachments: 11620-v1.txt > > > Reported by Kiran in this thread: "HBase file encryption, inconsistencies > observed and data loss" > After step 4 ( i.e disabling of WAL encryption, removing > SecureProtobufReader/Writer and restart), read of encrypted WAL fails mainly > due to EOF exception at Basedecoder. This is not considered as error and > these WAL are being moved to /oldWALs. 
> Following is observed in log files: > {code} > 2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Splitting hlog: > hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017, > length=172 > 2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: DistributedLogReplay = false > 2014-07-30 19:44:29,313 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > util.FSHDFSUtils: Recovering lease on dfs file > hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 > 2014-07-30 19:44:29,315 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > util.FSHDFSUtils: recoverLease=true, attempt=0 on > file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 > after 1ms > 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0] > wal.HLogSplitter: Writer thread > Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0,5,main]: starting > 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1] > wal.HLogSplitter: Writer thread > Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1,5,main]: starting > 2014-07-30 19:44:29,430 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2] > wal.HLogSplitter: Writer thread > Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2,5,main]: starting > 2014-07-30 19:44:29,591 ERROR [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > codec.BaseDecoder: Partial cell read caused by EOF: java.io.IOException: > Premature EOF from inputStream > 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Finishing writing output logs and closing down. 
> 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Waiting for split writer threads to finish > 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Split writers finished > 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Processed 0 edits across 0 regions; log > file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 > is corrupted = false progress failed = false > {code} > To fix this, we need to propagate EOF exception to HLogSplitter. Any > suggestions on the fix? > (end of quote from Kiran) > In BaseDecoder#rethrowEofException() : > {code} > if (!isEof) throw ioEx; > LOG.error("Partial cell read caused by EOF: " + ioEx); > EOFException eofEx = new EOFException("Partial cell read"); > eofEx.initCause(ioEx); > throw eofEx; > {code} > throwing EOFException would not propagate the "Partial cell read" condition > to HLogSplitter which doesn't treat EOFException as an error. > I think IOException should be thrown above - HLogSplitter#getNextLogLine() > would translate the IOEx to CorruptedLogFileException. -- This message was sent by Atlassian JIRA (v6.2#6252)
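Ted's point is that EOFException is the one IOException the splitter deliberately tolerates as a normal end of log, so wrapping a partial cell read in it silently drops edits. A stand-in classifier illustrates the distinction; the class name and result strings below are hypothetical, and only the exception types mirror the discussion:

```java
import java.io.EOFException;
import java.io.IOException;

// Stand-in for how a log splitter might treat reader exceptions: EOF is a
// normal end of log, while any other IOException marks the file corrupted.
// Wrapping a partial cell read in EOFException therefore hides corruption.
public class SplitterSketch {
    static String classify(IOException ex) {
        if (ex instanceof EOFException) {
            return "clean end of log";   // edits after this point are dropped
        }
        return "corrupted log file";     // splitter can quarantine and report
    }

    public static void main(String[] args) {
        System.out.println(classify(new EOFException("Partial cell read")));
        System.out.println(classify(new IOException("Partial cell read")));
    }
}
```

Note that EOFException extends IOException, so throwing the plain IOException (as suggested) is the only way for the caller's instanceof-style check to take the corrupted-file path.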
[jira] [Updated] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file
[ https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-11591: --- Status: Patch Available (was: Open) > Scanner fails to retrieve KV from bulk loaded file with highest sequence id > than the cell's mvcc in a non-bulk loaded file > --- > > Key: HBASE-11591 > URL: https://issues.apache.org/jira/browse/HBASE-11591 > Project: HBase > Issue Type: Bug >Affects Versions: 0.99.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 0.99.0 > > Attachments: HBASE-11591.patch, TestBulkload.java > > > See discussion in HBASE-11339. > This is the case where the same KVs are in two files, one produced by > flush/compaction and the other through bulk load. > Both files contain some of the same KVs, matching even in timestamp. > Steps: > Add some rows with a specific timestamp and flush the same. > Bulk load a file with the same data. Ensure that the "assign seqnum" property is > set. > The bulk load should use HFileOutputFormat2 (or ensure that we write the > bulk_time_output key). > This would ensure that the bulk loaded file has the highest seq num. > Assume the cell in the flushed/compacted store file is > row1,cf,cq,ts1,value1 and the cell in the bulk loaded file is > row1,cf,cq,ts1,value2 > (There are no parallel scans). > Issue a scan on the table in 0.96. The retrieved value is > row1,cf1,cq,ts1,value2 > But the same in 0.98 will retrieve row1,cf1,cq,ts1,value1. > This is a behaviour change. This is because of this code > {code} > public int compare(KeyValueScanner left, KeyValueScanner right) { > int comparison = compare(left.peek(), right.peek()); > if (comparison != 0) { > return comparison; > } else { > // Since both the keys are exactly the same, we break the tie in favor > // of the key which came latest. 
> long leftSequenceID = left.getSequenceID(); > long rightSequenceID = right.getSequenceID(); > if (leftSequenceID > rightSequenceID) { > return -1; > } else if (leftSequenceID < rightSequenceID) { > return 1; > } else { > return 0; > } > } > } > {code} > Here in the 0.96 case, the mvcc of the cell in both files is 0, so > the comparison falls through to the else condition, where the seq id of the > bulk loaded file is greater and sorts first, ensuring that the scan > happens from that bulk loaded file. > In the 0.98+ case, as we are retaining mvcc+seqid, we are not resetting the > mvcc to 0 (it remains a non-zero positive value). Hence compare() sorts the > cell in the flushed/compacted file first. This means that though we know the > latest file is the bulk loaded file, we don't scan its data. > Seems to be a behaviour change. Will check on other corner cases also, but we > are trying to understand the behaviour of bulk load because we are evaluating if it > can be used for the MOB design. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10834) Better error messaging on issuing grant commands in non-authz mode
[ https://issues.apache.org/jira/browse/HBASE-10834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080195#comment-14080195 ] Hadoop QA commented on HBASE-10834: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658772/HBASE-10834_v3.patch against trunk revision . ATTACHMENT ID: 12658772 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10234//console This message is automatically generated. > Better error messaging on issuing grant commands in non-authz mode > -- > > Key: HBASE-10834 > URL: https://issues.apache.org/jira/browse/HBASE-10834 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 0.94.17 >Reporter: Srikanth Srungarapu >Assignee: Srikanth Srungarapu >Priority: Trivial > Attachments: HBASE-10834.patch, HBASE-10834_v2.patch, > HBASE-10834_v3.patch > > > Running the below sequence of steps should give a better error message > rather than a "table not found" error. > {code} > hbase(main):009:0> grant "test", "RWCXA" > ERROR: Unknown table _acl_! > Here is some help for this command: > Grant users specific rights. > Syntax : grant <user> [<permissions> [<table> [<column family> [<column qualifier>]]] > permissions is either zero or more letters from the set "RWXCA". 
> READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A') > For example: > hbase> grant 'bobsmith', 'RWXCA' > hbase> grant 'bobsmith', 'RW', 't1', 'f1', 'col1' > {code} > Instead of ERROR: Unknown table _acl_!, hbase should give out a warning like > "Command not supported in non-authz mode(as acl table is only created if > authz is turned on)" -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11621) Make MiniDFSCluster run faster
[ https://issues.apache.org/jira/browse/HBASE-11621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080193#comment-14080193 ] Ted Yu commented on HBASE-11621: Test suite for 0.98 passed with patch: {code} [INFO] HBase . SUCCESS [1.834s] [INFO] HBase - Common SUCCESS [23.395s] [INFO] HBase - Protocol .. SUCCESS [0.278s] [INFO] HBase - Client SUCCESS [37.698s] [INFO] HBase - Hadoop Compatibility .. SUCCESS [5.195s] [INFO] HBase - Hadoop Two Compatibility .. SUCCESS [1.769s] [INFO] HBase - Prefix Tree ... SUCCESS [2.986s] [INFO] HBase - Server SUCCESS [51:21.833s] [INFO] HBase - Testing Util .. SUCCESS [1.301s] [INFO] HBase - Thrift SUCCESS [1:56.640s] [INFO] HBase - Shell . SUCCESS [1:32.877s] [INFO] HBase - Integration Tests . SUCCESS [1.012s] [INFO] HBase - Examples .. SUCCESS [5.685s] [INFO] HBase - Assembly .. SUCCESS [0.812s] [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 56:14.023s [INFO] Finished at: Wed Jul 30 23:28:25 UTC 2014 {code} > Make MiniDFSCluster run faster > -- > > Key: HBASE-11621 > URL: https://issues.apache.org/jira/browse/HBASE-11621 > Project: HBase > Issue Type: Task >Reporter: Ted Yu > Attachments: 11621-v1.txt > > > Daryn proposed the following change in HDFS-6773: > {code} > EditLogFileOutputStream.setShouldSkipFsyncForTesting(true); > {code} > With this change in HBaseTestingUtility#startMiniDFSCluster(), runtime for > TestAdmin went from 8:35 min to 7 min -- This message was sent by Atlassian JIRA (v6.2#6252)
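The speedup in the quoted proposal comes from skipping fsync on edit-log writes when running under a test harness. A minimal sketch of that pattern, a static test-only flag consulted on the sync path; this is an illustrative stand-in, not the HDFS EditLogFileOutputStream:

```java
// Sketch of a test-only "skip fsync" switch, the idea behind
// EditLogFileOutputStream.setShouldSkipFsyncForTesting(true).
// Durability syncs dominate MiniDFSCluster runtime, so tests flip the
// flag and the write path skips the expensive force-to-disk call.
public class SkipFsync {
    private static boolean skipFsyncForTesting = false;
    static int syncsIssued = 0; // observable stand-in for real fsyncs

    static void setShouldSkipFsyncForTesting(boolean skip) {
        skipFsyncForTesting = skip;
    }

    static void flushAndSync() {
        // ... write buffered edits to the file channel ...
        if (!skipFsyncForTesting) {
            syncsIssued++; // stands in for the expensive channel.force(true)
        }
    }

    public static void main(String[] args) {
        flushAndSync();                        // normal mode: fsync happens
        setShouldSkipFsyncForTesting(true);
        flushAndSync();                        // test mode: fsync skipped
        System.out.println(syncsIssued);       // 1
    }
}
```

The trade-off is the usual one for such flags: edits are no longer durable across a crash, which is acceptable in a throwaway mini cluster but never in production.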
[jira] [Updated] (HBASE-11621) Make MiniDFSCluster run faster
[ https://issues.apache.org/jira/browse/HBASE-11621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-11621: --- Status: Patch Available (was: Open) > Make MiniDFSCluster run faster > -- > > Key: HBASE-11621 > URL: https://issues.apache.org/jira/browse/HBASE-11621 > Project: HBase > Issue Type: Task >Reporter: Ted Yu > Attachments: 11621-v1.txt > > > Daryn proposed the following change in HDFS-6773: > {code} > EditLogFileOutputStream.setShouldSkipFsyncForTesting(true); > {code} > With this change in HBaseTestingUtility#startMiniDFSCluster(), runtime for > TestAdmin went from 8:35 min to 7 min -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-10834) Better error messaging on issuing grant commands in non-authz mode
[ https://issues.apache.org/jira/browse/HBASE-10834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Srikanth Srungarapu updated HBASE-10834: Attachment: HBASE-10834_v3.patch Sorry for overlooking the case where authorization is turned on. Made changes to the patch, which now issues the error message after ensuring that no _acl_ table exists. > Better error messaging on issuing grant commands in non-authz mode > -- > > Key: HBASE-10834 > URL: https://issues.apache.org/jira/browse/HBASE-10834 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 0.94.17 >Reporter: Srikanth Srungarapu >Assignee: Srikanth Srungarapu >Priority: Trivial > Attachments: HBASE-10834.patch, HBASE-10834_v2.patch, > HBASE-10834_v3.patch > > > Running the below sequence of steps should give a better error message > rather than a "table not found" error. > {code} > hbase(main):009:0> grant "test", "RWCXA" > ERROR: Unknown table _acl_! > Here is some help for this command: > Grant users specific rights. > Syntax : grant <user> [<permissions> [<table> [<column family> [<column qualifier>]]] > permissions is either zero or more letters from the set "RWXCA". > READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A') > For example: > hbase> grant 'bobsmith', 'RWXCA' > hbase> grant 'bobsmith', 'RW', 't1', 'f1', 'col1' > {code} > Instead of ERROR: Unknown table _acl_!, hbase should give out a warning like > "Command not supported in non-authz mode (as the acl table is only created if > authz is turned on)" -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080177#comment-14080177 ] Hudson commented on HBASE-11558: FAILURE: Integrated in HBase-TRUNK #5355 (See [https://builds.apache.org/job/HBase-TRUNK/5355/]) HBASE-11558 Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ (Ishan Chhabra) (ndimiduk: rev 50ac59fa8530bbd35c21cd61cfd64d2bd7d3eb57) * hbase-server/src/test/java/org/apache/hadoop/hbase/protobuf/TestProtobufUtil.java * hbase-protocol/src/main/protobuf/Client.proto * hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch, HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. 
This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11617) incorrect AgeOfLastAppliedOp and AgeOfLastShippedOp in replication Metrics when no new replication OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080161#comment-14080161 ] Hadoop QA commented on HBASE-11617: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658745/HBASE-11617-master-v1.patch against trunk revision . ATTACHMENT ID: 12658745 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.TestIOFencing org.apache.hadoop.hbase.master.TestRestartCluster org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction org.apache.hadoop.hbase.regionserver.TestRegionReplicas org.apache.hadoop.hbase.client.TestReplicasClient org.apache.hadoop.hbase.migration.TestNamespaceUpgrade Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10233//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10233//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10233//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10233//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10233//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10233//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10233//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10233//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10233//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10233//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10233//console This message is automatically generated. 
> incorrect AgeOfLastAppliedOp and AgeOfLastShippedOp in replication Metrics > when no new replication OP > -- > > Key: HBASE-11617 > URL: https://issues.apache.org/jira/browse/HBASE-11617 > Project: HBase > Issue Type: Bug > Components: Replication >Affects Versions: 0.98.2 >Reporter: Demai Ni >Assignee: Demai Ni >Priority: Minor > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: HBASE-11617-master-v1.patch > > > AgeOfLastAppliedOp in MetricsSink.java is to indicate the time an edit sat in > the 'replication queue' before it got replicated(aka applied) > {code} > /** >* Set the age of the last applied operation >* >* @param timestamp The timestamp of the last operation applied. >* @return the age that was set >*/ > public long setAgeOfLastAppliedOp(long timestamp) { > lastTimestampForAge = timestamp; > long age = System.currentTimeMillis() - lastTimestampForAge; > rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); > return age; > } > {code} > In the following scenario: > 1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is > set for example 100ms; > 2) and then NO new Sink op occur.
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080158#comment-14080158 ] Hudson commented on HBASE-11558: SUCCESS: Integrated in HBase-0.98 #425 (See [https://builds.apache.org/job/HBase-0.98/425/]) HBASE-11558 Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ (Ishan Chhabra) (ndimiduk: rev 61de4e47835f98dd7d2cec92bf33641c9de072a8) * hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * hbase-protocol/src/main/protobuf/Client.proto * hbase-server/src/test/java/org/apache/hadoop/hbase/protobuf/TestProtobufUtil.java * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch, HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. 
This will lead to sudden degradation in Scan performance in > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-4744) Remove @Ignore for testLogRollAfterSplitStart
[ https://issues.apache.org/jira/browse/HBASE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080129#comment-14080129 ] Hadoop QA commented on HBASE-4744: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658744/HBASE_4744-v2.patch against trunk revision . ATTACHMENT ID: 12658744 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.master.TestRegionPlacement org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction org.apache.hadoop.hbase.migration.TestNamespaceUpgrade org.apache.hadoop.hbase.regionserver.TestRegionReplicas org.apache.hadoop.hbase.client.TestReplicasClient org.apache.hadoop.hbase.master.TestRestartCluster org.apache.hadoop.hbase.TestRegionRebalancing org.apache.hadoop.hbase.TestIOFencing Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//console This message is automatically generated. 
> Remove @Ignore for testLogRollAfterSplitStart > - > > Key: HBASE-4744 > URL: https://issues.apache.org/jira/browse/HBASE-4744 > Project: HBase > Issue Type: Test >Affects Versions: 0.94.0 >Reporter: Nicolas Spiegelberg >Priority: Critical > Labels: newbie > Attachments: HBASE_4744-v2.patch, HBASE_4744.patch > > > We fixed a data loss bug in HBASE-2312 by adding non-recursive creates to > HDFS. Although a number of HDFS versions have this fix, the official HDFS > 0.20.205 branch currently doesn't, so we needed to mark the test as ignored. > Please revisit before the RC of 0.94, which should have 0.20.205.1 or later & > the necessary HDFS patches. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11616) TestNamespaceUpgrade fails in trunk
[ https://issues.apache.org/jira/browse/HBASE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080119#comment-14080119 ] Jimmy Xiang commented on HBASE-11616: - Thought about removing it. Probably will do it in HBASE-11611. > TestNamespaceUpgrade fails in trunk > --- > > Key: HBASE-11616 > URL: https://issues.apache.org/jira/browse/HBASE-11616 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Jimmy Xiang > Fix For: 2.0.0 > > Attachments: hbase-11616.patch > > > I see the following in test output: > {code} > type="org.apache.hadoop.hbase.client.RetriesExhaustedException">org.apache.hadoop.hbase.client.RetriesExhaustedException: > Can't get the location > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:287) > at > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:132) > at > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:1) > at > org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:179) > at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:287) > at > org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267) > at > org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139) > at > org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134) > at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:814) > at > org.apache.hadoop.hbase.migration.TestNamespaceUpgrade.setUpBeforeClass(TestNamespaceUpgrade.java:147) > ... > Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: No > server address listed in hbase:meta for region hbase:acl,,1376029204842. > 06dfcfc239196403c5f1135b91dedc64. 
containing row > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1233) > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1099) > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:279) > ... 31 more > {code} > The cause for the above error is that the _acl_ table contained in the image > (w.r.t. hbase:meta table) doesn't have server address. > [~jxiang]: What do you think would be proper fix ? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
[ https://issues.apache.org/jira/browse/HBASE-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-11615: Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Integrated into master and branch-1. > TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins > --- > > Key: HBASE-11615 > URL: https://issues.apache.org/jira/browse/HBASE-11615 > Project: HBase > Issue Type: Test > Components: master >Reporter: Mike Drob >Assignee: Jimmy Xiang > Fix For: 1.0.0, 2.0.0 > > Attachments: hbase-11615.patch > > > Failed on branch-1. > Example Failure: > https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-4744) Remove @Ignore for testLogRollAfterSplitStart
[ https://issues.apache.org/jira/browse/HBASE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080113#comment-14080113 ] stack commented on HBASE-4744: -- Patch looks good to me. Will wait on hadoopqa. Get your reason for moving the test. Makes sense. Up to you. I could commit v2 or wait on a v3 where you move it. Good on you Sean. > Remove @Ignore for testLogRollAfterSplitStart > - > > Key: HBASE-4744 > URL: https://issues.apache.org/jira/browse/HBASE-4744 > Project: HBase > Issue Type: Test >Affects Versions: 0.94.0 >Reporter: Nicolas Spiegelberg >Priority: Critical > Labels: newbie > Attachments: HBASE_4744-v2.patch, HBASE_4744.patch > > > We fixed a data loss bug in HBASE-2312 by adding non-recursive creates to > HDFS. Although a number of HDFS versions have this fix, the official HDFS > 0.20.205 branch currently doesn't, so we needed to mark the test as ignored. > Please revisit before the RC of 0.94, which should have 0.20.205.1 or later & > the necessary HDFS patches. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11616) TestNamespaceUpgrade fails in trunk
[ https://issues.apache.org/jira/browse/HBASE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080107#comment-14080107 ] stack commented on HBASE-11616: --- Why not remove TestNamespaceUpgrade in trunk (and all code associated with namespace upgrades?) We don't need it anymore? Otherwise +1 on patch for now. > TestNamespaceUpgrade fails in trunk > --- > > Key: HBASE-11616 > URL: https://issues.apache.org/jira/browse/HBASE-11616 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Jimmy Xiang > Fix For: 2.0.0 > > Attachments: hbase-11616.patch > > > I see the following in test output: > {code} > type="org.apache.hadoop.hbase.client.RetriesExhaustedException">org.apache.hadoop.hbase.client.RetriesExhaustedException: > Can't get the location > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:287) > at > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:132) > at > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:1) > at > org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:179) > at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:287) > at > org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267) > at > org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139) > at > org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134) > at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:814) > at > org.apache.hadoop.hbase.migration.TestNamespaceUpgrade.setUpBeforeClass(TestNamespaceUpgrade.java:147) > ... > Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: No > server address listed in hbase:meta for region hbase:acl,,1376029204842. > 06dfcfc239196403c5f1135b91dedc64. 
containing row > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1233) > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1099) > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:279) > ... 31 more > {code} > The cause for the above error is that the _acl_ table contained in the image > (w.r.t. hbase:meta table) doesn't have server address. > [~jxiang]: What do you think would be proper fix ? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11616) TestNamespaceUpgrade fails in trunk
[ https://issues.apache.org/jira/browse/HBASE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080103#comment-14080103 ] Ted Yu commented on HBASE-11616: lgtm > TestNamespaceUpgrade fails in trunk > --- > > Key: HBASE-11616 > URL: https://issues.apache.org/jira/browse/HBASE-11616 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Jimmy Xiang > Fix For: 2.0.0 > > Attachments: hbase-11616.patch > > > I see the following in test output: > {code} > type="org.apache.hadoop.hbase.client.RetriesExhaustedException">org.apache.hadoop.hbase.client.RetriesExhaustedException: > Can't get the location > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:287) > at > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:132) > at > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:1) > at > org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:179) > at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:287) > at > org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267) > at > org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139) > at > org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134) > at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:814) > at > org.apache.hadoop.hbase.migration.TestNamespaceUpgrade.setUpBeforeClass(TestNamespaceUpgrade.java:147) > ... > Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: No > server address listed in hbase:meta for region hbase:acl,,1376029204842. > 06dfcfc239196403c5f1135b91dedc64. 
containing row > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1233) > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1099) > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:279) > ... 31 more > {code} > The cause for the above error is that the _acl_ table contained in the image > (w.r.t. hbase:meta table) doesn't have server address. > [~jxiang]: What do you think would be proper fix ? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
[ https://issues.apache.org/jira/browse/HBASE-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080095#comment-14080095 ] stack commented on HBASE-11615: --- +1 Failures are apache infra related I believe. > TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins > --- > > Key: HBASE-11615 > URL: https://issues.apache.org/jira/browse/HBASE-11615 > Project: HBase > Issue Type: Test > Components: master >Reporter: Mike Drob >Assignee: Jimmy Xiang > Fix For: 1.0.0, 2.0.0 > > Attachments: hbase-11615.patch > > > Failed on branch-1. > Example Failure: > https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HBASE-11621) Make MiniDFSCluster run faster
Ted Yu created HBASE-11621: -- Summary: Make MiniDFSCluster run faster Key: HBASE-11621 URL: https://issues.apache.org/jira/browse/HBASE-11621 Project: HBase Issue Type: Task Reporter: Ted Yu Attachments: 11621-v1.txt Daryn proposed the following change in HDFS-6773: {code} EditLogFileOutputStream.setShouldSkipFsyncForTesting(true); {code} With this change in HBaseTestingUtility#startMiniDFSCluster(), runtime for TestAdmin went from 8:35 min to 7 min -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11621) Make MiniDFSCluster run faster
[ https://issues.apache.org/jira/browse/HBASE-11621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-11621: --- Attachment: 11621-v1.txt Tentative patch. Running test suite to see which test(s) break. > Make MiniDFSCluster run faster > -- > > Key: HBASE-11621 > URL: https://issues.apache.org/jira/browse/HBASE-11621 > Project: HBase > Issue Type: Task >Reporter: Ted Yu > Attachments: 11621-v1.txt > > > Daryn proposed the following change in HDFS-6773: > {code} > EditLogFileOutputStream.setShouldSkipFsyncForTesting(true); > {code} > With this change in HBaseTestingUtility#startMiniDFSCluster(), runtime for > TestAdmin went from 8:35 min to 7 min -- This message was sent by Atlassian JIRA (v6.2#6252)
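The mechanism behind the proposed speed-up can be illustrated with a self-contained sketch (this is not HDFS code; FsyncCostDemo and writeEdits are hypothetical names): forcing every write to disk is what makes the NameNode edit log slow in tests, and skipping the force leaves data in the OS cache, which is acceptable for throwaway mini clusters.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Sketch of why skipping fsync speeds up test clusters: per-write fsync
// (out.getFD().sync()) waits for the disk; buffered writes do not.
public class FsyncCostDemo {
    static long writeEdits(File f, int n, boolean sync) throws IOException {
        long start = System.nanoTime();
        try (FileOutputStream out = new FileOutputStream(f)) {
            byte[] edit = new byte[64]; // stand-in for one edit-log record
            for (int i = 0; i < n; i++) {
                out.write(edit);
                if (sync) {
                    out.getFD().sync(); // force the record to stable storage
                }
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("editlog", ".bin");
        f.deleteOnExit();
        long synced = writeEdits(f, 200, true);
        long skipped = writeEdits(f, 200, false);
        System.out.printf("synced=%dms skipped=%dms%n",
                synced / 1_000_000, skipped / 1_000_000);
    }
}
```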
[jira] [Commented] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
[ https://issues.apache.org/jira/browse/HBASE-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080050#comment-14080050 ] Hadoop QA commented on HBASE-11615: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658742/hbase-11615.patch against trunk revision . ATTACHMENT ID: 12658742 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.procedure.TestProcedureManager org.apache.hadoop.hbase.ipc.TestIPC org.apache.hadoop.hbase.master.TestClockSkewDetection Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//console This message is automatically generated. 
> TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins > --- > > Key: HBASE-11615 > URL: https://issues.apache.org/jira/browse/HBASE-11615 > Project: HBase > Issue Type: Test > Components: master >Reporter: Mike Drob >Assignee: Jimmy Xiang > Fix For: 1.0.0, 2.0.0 > > Attachments: hbase-11615.patch > > > Failed on branch-1. > Example Failure: > https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-3270) When we create the .version file, we should create it in a tmp location and then move it into place
[ https://issues.apache.org/jira/browse/HBASE-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-3270: --- Labels: newbie (was: ) > When we create the .version file, we should create it in a tmp location and > then move it into place > --- > > Key: HBASE-3270 > URL: https://issues.apache.org/jira/browse/HBASE-3270 > Project: HBase > Issue Type: Improvement > Components: master >Reporter: stack >Priority: Minor > Labels: newbie > Fix For: 0.99.0, 0.98.5, 2.0.0 > > > Todd suggests over in HBASE-3258 that when writing hbase.version, we should write > it to a /tmp location and then move it into place after writing, to > protect against the case where the file writer crashes between creation and write. -- This message was sent by Atlassian JIRA (v6.2#6252)
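The write-then-rename pattern suggested above can be sketched as follows. This uses java.nio.file as a stand-in; the actual fix would go through the Hadoop FileSystem API on HDFS, and the temp-file name below is illustrative.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Write the version file to a temporary path first, then move it into
// place, so readers never observe a half-written hbase.version file.
public class VersionFileWriter {
    static void writeVersion(Path dir, String version) throws IOException {
        Path tmp = dir.resolve(".version.tmp"); // illustrative temp name
        Path dest = dir.resolve("hbase.version");
        Files.write(tmp, version.getBytes(StandardCharsets.UTF_8));
        try {
            Files.move(tmp, dest, StandardCopyOption.ATOMIC_MOVE,
                       StandardCopyOption.REPLACE_EXISTING);
        } catch (AtomicMoveNotSupportedException e) {
            // Fall back to a plain move where atomic rename is unavailable.
            Files.move(tmp, dest, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("hbase-root");
        writeVersion(dir, "8");
        System.out.println(new String(
                Files.readAllBytes(dir.resolve("hbase.version")),
                StandardCharsets.UTF_8)); // prints 8
    }
}
```

A crash before the move leaves only the temp file behind, which a later startup can simply discard and rewrite.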
[jira] [Commented] (HBASE-11616) TestNamespaceUpgrade fails in trunk
[ https://issues.apache.org/jira/browse/HBASE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080028#comment-14080028 ] Hadoop QA commented on HBASE-11616: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658727/hbase-11616.patch against trunk revision . ATTACHMENT ID: 12658727 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.TestIOFencing org.apache.hadoop.hbase.master.TestRestartCluster org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction org.apache.hadoop.hbase.regionserver.TestRegionReplicas org.apache.hadoop.hbase.client.TestReplicasClient org.apache.hadoop.hbase.master.TestMasterOperationsForRegionReplicas Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//console This message is automatically generated. 
> TestNamespaceUpgrade fails in trunk > --- > > Key: HBASE-11616 > URL: https://issues.apache.org/jira/browse/HBASE-11616 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Jimmy Xiang > Fix For: 2.0.0 > > Attachments: hbase-11616.patch > > > I see the following in test output: > {code} > type="org.apache.hadoop.hbase.client.RetriesExhaustedException">org.apache.hadoop.hbase.client.RetriesExhaustedException: > Can't get the location > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:287) > at > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:132) > at > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:1) > at > org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:179) > at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:287) > at > org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267) > at > org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139) > at > org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134) > at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:814) > at > org.apache.hadoop.hbase.migration.TestNamespaceUpgrade.setUpBe
[jira] [Commented] (HBASE-11617) incorrect AgeOfLastAppliedOp and AgeOfLastShippedOp in replication Metrics when no new replication OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080026#comment-14080026 ] Demai Ni commented on HBASE-11617: -- btw, with this patch, I am not sure what the purpose of MetricsSink.refreshAgeOfLastAppliedOp() is, since it will be ignored and will always return age = 0. > incorrect AgeOfLastAppliedOp and AgeOfLastShippedOp in replication Metrics > when no new replication OP > -- > > Key: HBASE-11617 > URL: https://issues.apache.org/jira/browse/HBASE-11617 > Project: HBase > Issue Type: Bug > Components: Replication >Affects Versions: 0.98.2 >Reporter: Demai Ni >Assignee: Demai Ni >Priority: Minor > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: HBASE-11617-master-v1.patch > > > AgeOfLastAppliedOp in MetricsSink.java indicates the time an edit sat in > the 'replication queue' before it got replicated (aka applied) > {code} > /** >* Set the age of the last applied operation >* >* @param timestamp The timestamp of the last operation applied. >* @return the age that was set >*/ > public long setAgeOfLastAppliedOp(long timestamp) { > lastTimestampForAge = timestamp; > long age = System.currentTimeMillis() - lastTimestampForAge; > rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); > return age; > } > {code} > Consider the following scenario: > 1) at 7:00am a sink op is applied, and SINK_AGE_OF_LAST_APPLIED_OP is > set to, for example, 100ms; > 2) then NO new sink op occurs; > 3) when refreshAgeOfLastAppliedOp() is invoked at 8:00am, instead of > returning the 100ms, the AgeOfLastAppliedOp becomes 1 hour + 100ms. > This is because refreshAgeOfLastAppliedOp() gets invoked periodically by > getStats(). 
> proposed fix: > {code} > --- > hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java > +++ > hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java > @@ -35,6 +35,7 @@ public class MetricsSink { > >private MetricsReplicationSource rms; >private long lastTimestampForAge = System.currentTimeMillis(); > + private long age = 0; > >public MetricsSink() { > rms = > CompatibilitySingletonFactory.getInstance(MetricsReplicationSource.class); > @@ -47,8 +48,12 @@ public class MetricsSink { > * @return the age that was set > */ >public long setAgeOfLastAppliedOp(long timestamp) { > -lastTimestampForAge = timestamp; > -long age = System.currentTimeMillis() - lastTimestampForAge; > +if (lastTimestampForAge != timestamp) { > + lastTimestampForAge = timestamp; > + this.age = System.currentTimeMillis() - lastTimestampForAge; > +} else { > + this.age = 0; > +} > rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); > return age; >} > {code} > detail discussion in [dev@hbase | > http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E > ] -- This message was sent by Atlassian JIRA (v6.2#6252)
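The proposed diff above can be illustrated with a minimal standalone sketch (hypothetical class name, no HBase or metrics dependencies; the rms.setGauge call is omitted): the age is recomputed only when a new operation timestamp arrives, and a periodic refresh that passes the same timestamp again reports 0 instead of a steadily growing value.

```java
// Hypothetical standalone sketch of the proposed MetricsSink change.
// Class and method names are made up for illustration; the gauge update
// present in the real patch is omitted.
public class AgeGauge {
    private long lastTimestampForAge = System.currentTimeMillis();
    private long age = 0;

    public long setAgeOfLastAppliedOp(long timestamp) {
        if (lastTimestampForAge != timestamp) {
            // A genuinely new op: measure how long it sat in the queue.
            lastTimestampForAge = timestamp;
            age = System.currentTimeMillis() - lastTimestampForAge;
        } else {
            // Same op as last time, nothing new is waiting: report no lag.
            age = 0;
        }
        return age;
    }

    public static void main(String[] args) {
        AgeGauge g = new AgeGauge();
        long ts = System.currentTimeMillis() - 100; // op applied ~100ms ago
        long first = g.setAgeOfLastAppliedOp(ts);   // measured once: >= 100
        long refresh = g.setAgeOfLastAppliedOp(ts); // periodic refresh: 0
        System.out.println(first >= 100 && refresh == 0); // prints true
    }
}
```

With this shape, the 8:00am refresh from the scenario above passes the unchanged 7:00am timestamp and lands in the else branch, so the gauge no longer grows while the sink is idle.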
[jira] [Commented] (HBASE-11143) Improve replication metrics
[ https://issues.apache.org/jira/browse/HBASE-11143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080018#comment-14080018 ] Demai Ni commented on HBASE-11143: -- thanks to [~lhofhansl]'s suggestion, the patch is uploaded in [HBASE-11617 | https://issues.apache.org/jira/browse/HBASE-11617] > Improve replication metrics > --- > > Key: HBASE-11143 > URL: https://issues.apache.org/jira/browse/HBASE-11143 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl > Fix For: 0.99.0, 0.94.20, 0.98.3 > > Attachments: 11143-0.94-v2.txt, 11143-0.94-v3.txt, 11143-0.94.txt, > 11143-trunk.txt > > > We are trying to report on replication lag and find that there is no good > single metric to do that. > ageOfLastShippedOp is close, but unfortunately it is increased even when > there is nothing to ship on a particular RegionServer. > I would like to discuss a few options here: > Add a new metric: replicationQueueTime (or something) with the above meaning. > I.e. if we have something to ship we set the age of that last shipped edit; > if we fail we increment that last time (just like we do now). But if there is > nothing to replicate we set it to the current time (and hence that metric is > reported as close to 0). > Alternatively we could change the meaning of ageOfLastShippedOp to do exactly > that. That might lead to surprises, but the current behavior is clearly weird > when there is nothing to replicate. > Comments? [~jdcryans], [~stack]. > If the approach sounds good, I'll make a patch for all branches. > Edit: Also adds a new shippedKBs metric to track the amount of data that is > shipped via replication. -- This message was sent by Atlassian JIRA (v6.2#6252)
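The source-side option described in the issue above ("if there is nothing to replicate we set it to current time") can be sketched like this (hypothetical class and method names, not the actual HBase MetricsSource API):

```java
// Hypothetical sketch of the proposed source-side behavior. When the
// replication queue is empty, the "last shipped" timestamp is moved up to
// the current time, so the reported age collapses to roughly 0 instead of
// growing while there is simply nothing to ship.
public class ShippedAgeGauge {
    private long lastShippedTs = System.currentTimeMillis();

    // Called after an edit with the given wall-clock timestamp is shipped.
    public long onShipped(long editTs) {
        lastShippedTs = editTs;
        return System.currentTimeMillis() - lastShippedTs;
    }

    // Called when there is nothing left to replicate: report ~0 lag.
    public long onQueueEmpty() {
        lastShippedTs = System.currentTimeMillis();
        return System.currentTimeMillis() - lastShippedTs;
    }

    public static void main(String[] args) {
        ShippedAgeGauge g = new ShippedAgeGauge();
        System.out.println(g.onShipped(System.currentTimeMillis() - 200) >= 200);
        System.out.println(g.onQueueEmpty() <= 50);
    }
}
```

The design trade-off discussed in the issue is whether this behavior should live in a new metric (replicationQueueTime) or replace the existing ageOfLastShippedOp semantics.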
[jira] [Updated] (HBASE-11617) incorrect AgeOfLastAppliedOp and AgeOfLastShippedOp in replication Metrics when no new replication OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Demai Ni updated HBASE-11617: - Status: Patch Available (was: In Progress) [~lhofhansl], would you please take a look at the patch and see whether it matches your take? thanks -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11620) Propagate decoder exception to HLogSplitter so that loss of data is avoided
[ https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-11620: --- Affects Version/s: 0.98.4 > Propagate decoder exception to HLogSplitter so that loss of data is avoided > --- > > Key: HBASE-11620 > URL: https://issues.apache.org/jira/browse/HBASE-11620 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.4 >Reporter: Ted Yu >Priority: Critical > Attachments: 11620-v1.txt > > > Reported by Kiran in this thread: "HBase file encryption, inconsistencies > observed and data loss" > After step 4 ( i.e disabling of WAL encryption, removing > SecureProtobufReader/Writer and restart), read of encrypted WAL fails mainly > due to EOF exception at Basedecoder. This is not considered as error and > these WAL are being moved to /oldWALs. > Following is observed in log files: > {code} > 2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Splitting hlog: > hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017, > length=172 > 2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: DistributedLogReplay = false > 2014-07-30 19:44:29,313 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > util.FSHDFSUtils: Recovering lease on dfs file > hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 > 2014-07-30 19:44:29,315 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > util.FSHDFSUtils: recoverLease=true, attempt=0 on > file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 > after 1ms > 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0] > wal.HLogSplitter: Writer thread > Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0,5,main]: starting > 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1] > wal.HLogSplitter: Writer 
thread > Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1,5,main]: starting > 2014-07-30 19:44:29,430 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2] > wal.HLogSplitter: Writer thread > Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2,5,main]: starting > 2014-07-30 19:44:29,591 ERROR [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > codec.BaseDecoder: Partial cell read caused by EOF: java.io.IOException: > Premature EOF from inputStream > 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Finishing writing output logs and closing down. > 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Waiting for split writer threads to finish > 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Split writers finished > 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] > wal.HLogSplitter: Processed 0 edits across 0 regions; log > file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 > is corrupted = false progress failed = false > {code} > To fix this, we need to propagate EOF exception to HLogSplitter. Any > suggestions on the fix? > (end of quote from Kiran) > In BaseDecoder#rethrowEofException() : > {code} > if (!isEof) throw ioEx; > LOG.error("Partial cell read caused by EOF: " + ioEx); > EOFException eofEx = new EOFException("Partial cell read"); > eofEx.initCause(ioEx); > throw eofEx; > {code} > throwing EOFException would not propagate the "Partial cell read" condition > to HLogSplitter which doesn't treat EOFException as an error. > I think IOException should be thrown above - HLogSplitter#getNextLogLine() > would translate the IOEx to CorruptedLogFileException. -- This message was sent by Atlassian JIRA (v6.2#6252)
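To illustrate why the choice of exception type matters here, below is a simplified, hypothetical sketch (not the real HLogSplitter/BaseDecoder code): the log-line reader swallows EOFException as a normal end of log, while any other IOException is surfaced as a corrupted-log error, which is exactly what the report argues the partial-cell case should trigger.

```java
import java.io.EOFException;
import java.io.IOException;

// Simplified, hypothetical sketch of the control flow described above.
// An EOFException is treated as a normal end of the WAL and the remaining
// data is silently skipped; any other IOException is translated into a
// corrupted-log error. Throwing a plain IOException from the decoder
// therefore makes the splitter notice the partial cell instead of
// dropping it.
public class SplitterSketch {
    static class CorruptedLogFileException extends IOException {
        CorruptedLogFileException(String msg, Throwable cause) {
            super(msg);
            initCause(cause);
        }
    }

    // Mirrors the described HLogSplitter#getNextLogLine behavior: returns
    // null on EOF, throws CorruptedLogFileException on any other IO error.
    static String getNextLogLine(boolean decoderWrapsInEof)
            throws CorruptedLogFileException {
        try {
            IOException cause = new IOException("Premature EOF from inputStream");
            if (decoderWrapsInEof) {
                EOFException eof = new EOFException("Partial cell read");
                eof.initCause(cause);
                throw eof; // current behavior: looks like a clean EOF
            }
            throw cause;   // proposed behavior: plain IOException
        } catch (EOFException e) {
            return null;   // EOF means "end of log"; edits so far are kept
        } catch (IOException e) {
            throw new CorruptedLogFileException("corrupt WAL entry", e);
        }
    }

    public static void main(String[] args) {
        try {
            System.out.println(getNextLogLine(true));  // prints null
            getNextLogLine(false);
        } catch (CorruptedLogFileException e) {
            System.out.println("corruption detected"); // proposed outcome
        }
    }
}
```

Under the current behavior (the true branch) the splitter finishes "successfully" with 0 edits, matching the "Processed 0 edits ... is corrupted = false" log lines quoted above.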
[jira] [Updated] (HBASE-11617) incorrect AgeOfLastAppliedOp and AgeOfLastShippedOp in replication Metrics when no new replication OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Demai Ni updated HBASE-11617: - Summary: incorrect AgeOfLastAppliedOp and AgeOfLastShippedOp in replication Metrics when no new replication OP (was: AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11617) AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Demai Ni updated HBASE-11617: - Attachment: HBASE-11617-master-v1.patch uploaded the patch for both AgeOfLastAppliedOp and AgeOfLastShippedOp (from [HBASE-11143 | https://issues.apache.org/jira/browse/HBASE-11143]) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Work started] (HBASE-11617) AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-11617 started by Demai Ni. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-4744) Remove @Ignore for testLogRollAfterSplitStart
[ https://issues.apache.org/jira/browse/HBASE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-4744: --- Attachment: HBASE_4744-v2.patch missed an instance of trailing whitespace in the first patch. this one should be ready for review. > Remove @Ignore for testLogRollAfterSplitStart > - > > Key: HBASE-4744 > URL: https://issues.apache.org/jira/browse/HBASE-4744 > Project: HBase > Issue Type: Test >Affects Versions: 0.94.0 >Reporter: Nicolas Spiegelberg >Priority: Critical > Labels: newbie > Attachments: HBASE_4744-v2.patch, HBASE_4744.patch > > > We fixed a data loss bug in HBASE-2312 by adding non-recursive creates to > HDFS. Although a number of HDFS versions have this fix, the official HDFS > 0.20.205 branch currently doesn't, so we needed to mark the test as ignored. > Please revisit before the RC of 0.94, which should have 0.20.205.1 or later & > the necessary HDFS patches. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
[ https://issues.apache.org/jira/browse/HBASE-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-11615: Attachment: hbase-11615.patch > TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins > --- > > Key: HBASE-11615 > URL: https://issues.apache.org/jira/browse/HBASE-11615 > Project: HBase > Issue Type: Test > Components: master >Reporter: Mike Drob >Assignee: Jimmy Xiang > Fix For: 1.0.0, 2.0.0 > > Attachments: hbase-11615.patch > > > Failed on branch-1. > Example Failure: > https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
[ https://issues.apache.org/jira/browse/HBASE-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-11615: Fix Version/s: 2.0.0 1.0.0 Status: Patch Available (was: Open) > TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins > --- > > Key: HBASE-11615 > URL: https://issues.apache.org/jira/browse/HBASE-11615 > Project: HBase > Issue Type: Test > Components: master >Reporter: Mike Drob >Assignee: Jimmy Xiang > Fix For: 1.0.0, 2.0.0 > > Attachments: hbase-11615.patch > > > Failed on branch-1. > Example Failure: > https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-4744) Remove @Ignore for testLogRollAfterSplitStart
[ https://issues.apache.org/jira/browse/HBASE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-4744: --- Status: Patch Available (was: Open) > Remove @Ignore for testLogRollAfterSplitStart > - > > Key: HBASE-4744 > URL: https://issues.apache.org/jira/browse/HBASE-4744 > Project: HBase > Issue Type: Test >Affects Versions: 0.94.0 >Reporter: Nicolas Spiegelberg >Priority: Critical > Labels: newbie > Attachments: HBASE_4744.patch > > > We fixed a data loss bug in HBASE-2312 by adding non-recursive creates to > HDFS. Although a number of HDFS versions have this fix, the official HDFS > 0.20.205 branch currently doesn't, so we needed to mark the test as ignored. > Please revisit before the RC of 0.94, which should have 0.20.205.1 or later & > the necessary HDFS patches. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10674) HBCK should be updated to do replica related checks
[ https://issues.apache.org/jira/browse/HBASE-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079994#comment-14079994 ] Hadoop QA commented on HBASE-10674: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658715/10674-1.2.txt against trunk revision .

ATTACHMENT ID: 12658715
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
org.apache.hadoop.hbase.migration.TestNamespaceUpgrade
org.apache.hadoop.hbase.master.TestMasterOperationsForRegionReplicas
org.apache.hadoop.hbase.regionserver.TestRegionReplicas
org.apache.hadoop.hbase.client.TestReplicasClient
org.apache.hadoop.hbase.master.TestRestartCluster
org.apache.hadoop.hbase.TestRegionRebalancing
org.apache.hadoop.hbase.TestIOFencing

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//console

This message is automatically generated.
> HBCK should be updated to do replica related checks > --- > > Key: HBASE-10674 > URL: https://issues.apache.org/jira/browse/HBASE-10674 > Project: HBase > Issue Type: Sub-task >Reporter: Devaraj Das >Assignee: Devaraj Das > Attachments: 10674-1.2.txt, 10674-1.txt > > > HBCK should be updated to have a check for whether the replicas are assigned > to the right machines (default and non-default replicas ideally should not be > in the same server if there is more than one server in the cluster and such > scenarios). [~jmhsieh] suggested this in HBASE-10362. -- This message was sent by Atlassian JIRA (v6.2#6252)