[jira] [Commented] (HBASE-9295) Allow test-patch.sh to detect TreeMap keyed by byte[] which doesn't use proper comparator
[ https://issues.apache.org/jira/browse/HBASE-9295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816414#comment-13816414 ] Ted Yu commented on HBASE-9295: --- Integrated to trunk. Thanks for the review, Jesse. Allow test-patch.sh to detect TreeMap keyed by byte[] which doesn't use proper comparator - Key: HBASE-9295 URL: https://issues.apache.org/jira/browse/HBASE-9295 Project: HBase Issue Type: Task Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.0 Attachments: 9295-v1.txt, 9295-v2.txt There were two recent bug fixes (HBASE-9285 and HBASE-9238) for the case where the TreeMap keyed by byte[] doesn't use proper comparator: {code} new TreeMap<byte[], ...>() {code} test-patch.sh should be able to detect this situation and report accordingly. -- This message was sent by Atlassian JIRA (v6.1#6144)
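To make the target concrete, here is a minimal sketch of the pattern in question (the map contents are hypothetical; Bytes.BYTES_COMPARATOR is the standard HBase comparator for byte[] keys):
{code}
import java.util.Map;
import java.util.TreeMap;
import org.apache.hadoop.hbase.util.Bytes;

public class ByteArrayMapExample {
  public static void main(String[] args) {
    // Buggy pattern test-patch.sh should flag: byte[] is not Comparable,
    // so a TreeMap without an explicit comparator fails at runtime
    // (ClassCastException on put) or misbehaves with a wrong comparator.
    // Map<byte[], String> broken = new TreeMap<byte[], String>();

    // Correct pattern: always pass Bytes.BYTES_COMPARATOR for byte[] keys.
    Map<byte[], String> families = new TreeMap<byte[], String>(Bytes.BYTES_COMPARATOR);
    families.put(Bytes.toBytes("cf1"), "first family");

    // An equal-but-distinct key array now finds the entry as expected.
    System.out.println(families.get(Bytes.toBytes("cf1"))); // first family
  }
}
{code}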
[jira] [Commented] (HBASE-9808) org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation
[ https://issues.apache.org/jira/browse/HBASE-9808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816419#comment-13816419 ] Gustavo Anatoly commented on HBASE-9808: Now, I can see white spaces. I will fix it. Thanks. org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation Key: HBASE-9808 URL: https://issues.apache.org/jira/browse/HBASE-9808 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Gustavo Anatoly Attachments: HBASE-9808-v1.patch, HBASE-9808-v2.patch, HBASE-9808-v3.patch, HBASE-9808.patch Here is a list of JIRAs whose fixes might have gone into rest.PerformanceEvaluation: {code} r1527817 | mbertozzi | 2013-09-30 15:57:44 -0700 (Mon, 30 Sep 2013) | 1 line HBASE-9663 PerformanceEvaluation does not properly honor specified table name parameter r1526452 | mbertozzi | 2013-09-26 04:58:50 -0700 (Thu, 26 Sep 2013) | 1 line HBASE-9662 PerformanceEvaluation input do not handle tags properties r1525269 | ramkrishna | 2013-09-21 11:01:32 -0700 (Sat, 21 Sep 2013) | 3 lines HBASE-8496 - Implement tags and the internals of how a tag should look like (Ram) r1524985 | nkeywal | 2013-09-20 06:02:54 -0700 (Fri, 20 Sep 2013) | 1 line HBASE-9558 PerformanceEvaluation is in hbase-server, and creates a dependency to MiniDFSCluster r1523782 | nkeywal | 2013-09-16 13:07:13 -0700 (Mon, 16 Sep 2013) | 1 line HBASE-9521 clean clearBufferOnFail behavior and deprecate it r1518341 | jdcryans | 2013-08-28 12:46:55 -0700 (Wed, 28 Aug 2013) | 2 lines HBASE-9330 Refactor PE to create HTable the correct way {code} Long term, we may consider consolidating the two PerformanceEvaluation classes so that such maintenance work can be reduced. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-3787) Increment is non-idempotent but client retries RPC
[ https://issues.apache.org/jira/browse/HBASE-3787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816423#comment-13816423 ] Sergey Shelukhin commented on HBASE-3787: - yeah, I updated https://reviews.apache.org/r/10965/ with latest patch Increment is non-idempotent but client retries RPC -- Key: HBASE-3787 URL: https://issues.apache.org/jira/browse/HBASE-3787 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.94.4, 0.95.2 Reporter: dhruba borthakur Assignee: Sergey Shelukhin Priority: Blocker Attachments: HBASE-3787-partial.patch, HBASE-3787-v0.patch, HBASE-3787-v1.patch, HBASE-3787-v2.patch, HBASE-3787-v3.patch, HBASE-3787-v4.patch, HBASE-3787-v5.patch, HBASE-3787-v5.patch, HBASE-3787-v6.patch, HBASE-3787-v7.patch, HBASE-3787-v8.patch The HTable.increment() operation is non-idempotent. The client retries the increment RPC a few times (as specified by configuration) before throwing an error to the application. This makes it possible that the same increment call be applied twice at the server. For increment operations, is it better to use HConnectionManager.getRegionServerWithoutRetries()? Another option would be to enhance the IPC module to make the RPC server correctly identify if the RPC is a retry attempt and handle accordingly. -- This message was sent by Atlassian JIRA (v6.1#6144)
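To see why a blind retry of a non-idempotent RPC loses correctness, consider this self-contained simulation (plain Java, not HBase code; the server and client here are hypothetical stand-ins): if the server applies the increment but the response is lost, the naive retry loop applies it a second time.
{code}
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

public class IncrementRetrySimulation {
  static final AtomicLong cell = new AtomicLong(0); // stand-in for a server-side cell
  static boolean dropNextResponse = true;

  // The "server" applies the increment, then may lose the response on the wire.
  static long serverIncrement(long delta) throws IOException {
    long result = cell.addAndGet(delta);        // the mutation has already happened
    if (dropNextResponse) {
      dropNextResponse = false;
      throw new IOException("response lost");   // client never learns it succeeded
    }
    return result;
  }

  public static void main(String[] args) {
    // Naive client retry loop, analogous to HTable retrying the increment RPC.
    for (int attempt = 0; attempt < 3; attempt++) {
      try {
        serverIncrement(1);
        break;
      } catch (IOException e) {
        // Retrying is safe for idempotent ops, wrong for increments.
      }
    }
    System.out.println(cell.get()); // prints 2, not 1
  }
}
{code}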
[jira] [Updated] (HBASE-9818) NPE in HFileBlock#AbstractFSReader#readAtOffset
[ https://issues.apache.org/jira/browse/HBASE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-9818: --- Attachment: trunk-9818.patch Attached trunk-9818.patch which makes sure all scanners are closed if a store file is closed. NPE in HFileBlock#AbstractFSReader#readAtOffset --- Key: HBASE-9818 URL: https://issues.apache.org/jira/browse/HBASE-9818 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Ted Yu Attachments: 9818-trial.txt, 9818-v2.txt, 9818-v3.txt, 9818-v4.txt, 9818-v5.txt, trunk-9818.patch HFileBlock#istream seems to be null. I was wondering should we hide FSDataInputStreamWrapper#useHBaseChecksum. By the way, this happened when online schema change is enabled (encoding) {noformat} 2013-10-22 10:58:43,321 ERROR [RpcServer.handler=28,port=36020] regionserver.HRegionServer: java.lang.NullPointerException at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1200) at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1436) at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1318) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:359) at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:503) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:553) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:245) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:166) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:361) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:336) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:293) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:258) at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:603) at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:476) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:129) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3546) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3616) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3494) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3485) at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3079) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110) at java.lang.Thread.run(Thread.java:724) 2013-10-22 10:58:43,665 ERROR [RpcServer.handler=23,port=36020] regionserver.HRegionServer: 
org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 53438 But the nextCallSeq got from client: 53437; request=scanner_id: 1252577470624375060 number_of_rows: 100 close_scanner: false next_call_seq: 53437 at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3030) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110) at java.lang.Thread.run(Thread.java:724) {noformat} --
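A hypothetical, much-simplified sketch of the lifecycle problem the attached trunk-9818.patch is described as fixing (class names here are illustrative, not the HBase classes): if closing the store file nulls the stream without closing the scanners created from it, the next read dereferences null.
{code}
import java.util.ArrayList;
import java.util.List;

class StoreFileSketch {
  private Object istream = new Object();              // stand-in for the input stream
  private final List<ScannerSketch> scanners = new ArrayList<ScannerSketch>();

  ScannerSketch newScanner() {
    ScannerSketch s = new ScannerSketch(this);
    scanners.add(s);
    return s;
  }

  Object stream() { return istream; }

  void closeUnsafe() { istream = null; }              // scanners can still read -> NPE

  void closeSafe() {                                  // shape of the fix: close scanners first
    for (ScannerSketch s : scanners) s.close();
    istream = null;
  }
}

class ScannerSketch {
  private StoreFileSketch file;
  ScannerSketch(StoreFileSketch file) { this.file = file; }

  void read() {
    if (file == null) throw new IllegalStateException("scanner closed");
    file.stream().hashCode();                         // NullPointerException after closeUnsafe()
  }

  void close() { file = null; }
}
{code}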
[jira] [Updated] (HBASE-9818) NPE in HFileBlock#AbstractFSReader#readAtOffset
[ https://issues.apache.org/jira/browse/HBASE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-9818: --- Status: Patch Available (was: Open) NPE in HFileBlock#AbstractFSReader#readAtOffset --- Key: HBASE-9818 URL: https://issues.apache.org/jira/browse/HBASE-9818 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Ted Yu Attachments: 9818-trial.txt, 9818-v2.txt, 9818-v3.txt, 9818-v4.txt, 9818-v5.txt, trunk-9818.patch HFileBlock#istream seems to be null. I was wondering should we hide FSDataInputStreamWrapper#useHBaseChecksum. By the way, this happened when online schema change is enabled (encoding) {noformat} 2013-10-22 10:58:43,321 ERROR [RpcServer.handler=28,port=36020] regionserver.HRegionServer: java.lang.NullPointerException at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1200) at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1436) at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1318) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:359) at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:503) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:553) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:245) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:166) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:361) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:336) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:293) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:258) at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:603) at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:476) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:129) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3546) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3616) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3494) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3485) at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3079) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110) at java.lang.Thread.run(Thread.java:724) 2013-10-22 10:58:43,665 ERROR [RpcServer.handler=23,port=36020] regionserver.HRegionServer: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 53438 But the nextCallSeq got from client: 
53437; request=scanner_id: 1252577470624375060 number_of_rows: 100 close_scanner: false next_call_seq: 53437 at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3030) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110) at java.lang.Thread.run(Thread.java:724) {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9890) MR jobs are not working if started by a delegated user
[ https://issues.apache.org/jira/browse/HBASE-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816460#comment-13816460 ] Gary Helmling commented on HBASE-9890: -- Reviewed v3. In mapred.TableMapReduceUtil, looks like there is a slight regression from v2:
{code}
 public static void initCredentials(JobConf job) throws IOException {
   UserProvider userProvider = UserProvider.instantiate(job);
-  // login the server principal (if using secure Hadoop)
   if (userProvider.isHBaseSecurityEnabled()) {
+    // propagate delegation related props from launcher job to MR job
+    if (System.getenv(HADOOP_TOKEN_FILE_LOCATION) != null) {
+      job.set("mapreduce.job.credentials.binary", System.getenv(HADOOP_TOKEN_FILE_LOCATION));
+    }
{code}
Shouldn't the check and use of HADOOP_TOKEN_FILE_LOCATION be pulled out of the userProvider.isHBaseSecurityEnabled() block and conditioned on userProvider.isHadoopSecurityEnabled()? The same applies to mapreduce.TableMapReduceUtil.initCredentials(), though I didn't spot the issue there in earlier versions. Let me know if I somehow have that wrong. The rest of the patch looks good. MR jobs are not working if started by a delegated user -- Key: HBASE-9890 URL: https://issues.apache.org/jira/browse/HBASE-9890 Project: HBase Issue Type: Bug Components: mapreduce, security Affects Versions: 0.98.0, 0.94.12, 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 0.98.0, 0.94.13, 0.96.1 Attachments: HBASE-9890-94-v0.patch, HBASE-9890-94-v1.patch, HBASE-9890-v0.patch, HBASE-9890-v1.patch, HBASE-9890-v2.patch, HBASE-9890-v3.patch If Map-Reduce jobs are started by a proxy user that already has the delegation tokens, we get an exception when obtaining a token, since the proxy user doesn't have the kerberos auth. For example: * If we use oozie to execute RowCounter - oozie will get the tokens required (HBASE_AUTH_TOKEN) and it will start the RowCounter. Once the RowCounter tries to obtain the token, it will get an exception. * If we use oozie to execute LoadIncrementalHFiles - oozie will get the tokens required (HDFS_DELEGATION_TOKEN) and it will start the LoadIncrementalHFiles. Once the LoadIncrementalHFiles tries to obtain the token, it will get an exception.
{code} org.apache.hadoop.hbase.security.AccessDeniedException: Token generation only allowed for Kerberos authenticated clients at org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87) {code} {code} org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token can be issued only with kerberos or web authentication at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:783) at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:868) at org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:509) at org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:487) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:130) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:111) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:85) at org.apache.hadoop.filecache.TrackerDistributedCacheManager.getDelegationTokens(TrackerDistributedCacheManager.java:949) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:854) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:743) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945) at org.apache.hadoop.mapreduce.Job.submit(Job.java:566) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596) at org.apache.hadoop.hbase.mapreduce.RowCounter.main(RowCounter.java:173) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
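A sketch of the restructuring Gary is asking for (UserProvider.isHadoopSecurityEnabled() and isHBaseSecurityEnabled() are real UserProvider methods; the constant name follows the patch, and the token-obtaining code is elided rather than reproduced):
{code}
import java.io.IOException;
import org.apache.hadoop.hbase.security.UserProvider;
import org.apache.hadoop.mapred.JobConf;

public class InitCredentialsSketch {
  private static final String HADOOP_TOKEN_FILE_LOCATION = "HADOOP_TOKEN_FILE_LOCATION";

  public static void initCredentials(JobConf job) throws IOException {
    UserProvider userProvider = UserProvider.instantiate(job);

    // Token-file propagation depends only on Hadoop security, so it lives
    // outside the HBase-security check, as suggested in the review.
    if (userProvider.isHadoopSecurityEnabled()
        && System.getenv(HADOOP_TOKEN_FILE_LOCATION) != null) {
      job.set("mapreduce.job.credentials.binary",
          System.getenv(HADOOP_TOKEN_FILE_LOCATION));
    }

    if (userProvider.isHBaseSecurityEnabled()) {
      // ... obtain the HBASE_AUTH_TOKEN for the submitting user, as before ...
    }
  }
}
{code}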
[jira] [Updated] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-9915: - Attachment: (was: 9915-0.94-v2.txt) Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-9915: - Status: Open (was: Patch Available) Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816463#comment-13816463 ] Lars Hofhansl commented on HBASE-9915: -- Checked again. My first patch should be safer. block == null should be equivalent to seeker == null, but at other places we check block != null, so it's better to do this here as well. Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9905) Enable using seqId as timestamp
[ https://issues.apache.org/jira/browse/HBASE-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816464#comment-13816464 ] Sergey Shelukhin commented on HBASE-9905: - My 2c. I think relying on the clock for any notion of consistency and versioning is an extremely bad idea, unless you are Spanner :). We already have loads of potential timing issues in HBase with the ts that we are using. The system clock can move backwards, and skew between servers when a region is moved or fails over can be significant, leading to all kinds of weird behavior. So I agree with Enis that we should deprecate it. I don't think we need so many modes though. There are 3 ts sources - system-always, user-always, mixed. For system-always (rejecting client timestamps), I am +1 on seqNum-mode, as discussed in the meetup, and above. Maybe make it the default for new tables in 98-1.0? And migrate meta to it, probably out of band with the upgrade via a script, so it would be optional. For the others, I think /server/ should *never* use the clock for ts. Never-ever. For user-always, it's not a problem by definition; for mixed mode, we /could use/ seqNum-as-ts (the client can make the derived ts-es that [~jmspaggi] mentions based on seqNum-ts just as well as based on clock-ts; if the client wants to store time as data (not version), a column is a proper place for it), but unfortunately due to backward compat it is not viable to make a complete switch. So what we should do imho is have the client library, instead of the server, generate the TS if the user doesn't supply one (there'd have to be some relatively easy backward compat). That way, the semantics are the same, but if the user screws up clocks, it's no longer an HBase consistency problem - if you want bulletproof clock-based versions (ts-es), manage your own, or manage your clock sync. For user-always vs mixed in this case, there's no need for a special flag; from the server's point of view they look the same. Optionally we might want to create a mixed mode with seqNums, or a special mode that would enforce user-always and reject requests without ts, but I don't think this is strictly necessary. If this is contentious, I am +1 on just adding seqNum mode and not doing anything else also :) Enable using seqId as timestamp Key: HBASE-9905 URL: https://issues.apache.org/jira/browse/HBASE-9905 Project: HBase Issue Type: New Feature Reporter: Enis Soztutar Fix For: 0.98.0 This has been discussed previously, and Lars H. mentioned an idea of having the client declare explicitly whether timestamps are used or not. The problem is that, for data models not using timestamps, we are still relying on clocks to order the updates. Clock skew, same-millisecond puts after deletes, etc. can cause unexpected behavior and data not being visible. We should have a table descriptor / family property which would declare that the data model does not use timestamps. Then we can populate this dimension with the seqId, so that the global ordering of edits is not affected by the wall clock. For example, META will use this. Once we have something like this, we can think of making it the default for new tables, so that an unknowing user will not shoot herself in the foot. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816465#comment-13816465 ] Sergey Shelukhin commented on HBASE-9915: - looks good! Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9890) MR jobs are not working if started by a delegated user
[ https://issues.apache.org/jira/browse/HBASE-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-9890: --- Attachment: HBASE-9890-v4.patch MR jobs are not working if started by a delegated user -- Key: HBASE-9890 URL: https://issues.apache.org/jira/browse/HBASE-9890 Project: HBase Issue Type: Bug Components: mapreduce, security Affects Versions: 0.98.0, 0.94.12, 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 0.98.0, 0.94.13, 0.96.1 Attachments: HBASE-9890-94-v0.patch, HBASE-9890-94-v1.patch, HBASE-9890-v0.patch, HBASE-9890-v1.patch, HBASE-9890-v2.patch, HBASE-9890-v3.patch, HBASE-9890-v4.patch If Map-Reduce jobs are started by a proxy user that already has the delegation tokens, we get an exception when obtaining a token, since the proxy user doesn't have the kerberos auth. For example: * If we use oozie to execute RowCounter - oozie will get the tokens required (HBASE_AUTH_TOKEN) and it will start the RowCounter. Once the RowCounter tries to obtain the token, it will get an exception. * If we use oozie to execute LoadIncrementalHFiles - oozie will get the tokens required (HDFS_DELEGATION_TOKEN) and it will start the LoadIncrementalHFiles. Once the LoadIncrementalHFiles tries to obtain the token, it will get an exception. {code} org.apache.hadoop.hbase.security.AccessDeniedException: Token generation only allowed for Kerberos authenticated clients at org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87) {code} {code} org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token can be issued only with kerberos or web authentication at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:783) at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:868) at org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:509) at org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:487) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:130) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:111) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:85) at org.apache.hadoop.filecache.TrackerDistributedCacheManager.getDelegationTokens(TrackerDistributedCacheManager.java:949) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:854) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:743) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945) at org.apache.hadoop.mapreduce.Job.submit(Job.java:566) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596) at org.apache.hadoop.hbase.mapreduce.RowCounter.main(RowCounter.java:173) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9904) Solve skipping data in HTable scans
[ https://issues.apache.org/jira/browse/HBASE-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816472#comment-13816472 ] Hadoop QA commented on HBASE-9904: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612671/TestScanRetries.java against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7781//console This message is automatically generated. Solve skipping data in HTable scans --- Key: HBASE-9904 URL: https://issues.apache.org/jira/browse/HBASE-9904 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.89-fb Reporter: Manukranth Kolloju Priority: Critical Fix For: 0.89-fb, 0.98.0 Attachments: TestScanRetries.java, scan.diff The HTable client cannot retry a scan operation in the getRegionServerWithRetries code path. This will result in the client missing data. This can be worked around by setting hbase.client.retries.number to 1. The whole problem is that Callable knows nothing about retries, and the protocol it dances to doesn't support retries either. This fix will keep the Callable protocol (an ugly thing worth merciless refactoring) intact but will change ScannerCallable to anticipate retries. What we want is to make failed operations identities for the outside world: N1 , N2 , F3 , N3 , F4 , F4 , N4 ... = N1 , N2 , N3 , N4 ... where Nk are successful operations and Fk are failed operations. -- This message was sent by Atlassian JIRA (v6.1#6144)
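The identity property in the description can be illustrated with a small retry wrapper (an illustrative sketch only, not the attached scan.diff; the Scan/ScanOpener interfaces are hypothetical): remembering the last successfully returned row lets a failed next() be retried from the same position, so failures neither skip nor duplicate rows.
{code}
import java.io.IOException;

interface Scan {
  String next() throws IOException; // null at end of scan
  void close();
}

interface ScanOpener {
  Scan openAfter(String row) throws IOException; // rows strictly after 'row' (null = from start)
}

class RetryingScanner {
  private final ScanOpener opener;
  private final int maxRetries;
  private String lastReturned;
  private Scan current;

  RetryingScanner(ScanOpener opener, int maxRetries) {
    this.opener = opener;
    this.maxRetries = maxRetries;
  }

  // N1, N2, F3, N3 ... collapses to N1, N2, N3 ...: a failure reopens the
  // scan just after lastReturned, making the failed call an identity.
  String next() throws IOException {
    IOException lastFailure = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        if (current == null) current = opener.openAfter(lastReturned);
        String row = current.next();
        if (row != null) lastReturned = row;
        return row;
      } catch (IOException e) {
        lastFailure = e;
        if (current != null) { current.close(); current = null; }
      }
    }
    throw lastFailure;
  }
}
{code}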
[jira] [Updated] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-9915: - Attachment: 9915-trunk-v2.txt Trunk patch that again checks for block != null instead of seeker != null (in accordance with EncodedScannerV2.assertValidSeek(), which also verifies the block, not the seeker). Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk-v2.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
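In outline, the fix amounts to overriding isSeeked() so it consults the state the encoded scanner actually maintains. A toy model of the bug shape (not the HBase class hierarchy; names are simplified):
{code}
abstract class AbstractScannerSketch {
  protected Object blockBuffer; // used by the unencoded scanner only

  public boolean isSeeked() { return blockBuffer != null; }

  public void reseekTo(String key) {
    if (isSeeked()) {
      // fast path: check whether 'key' still falls in the current block
    } else {
      // slow path: consult the index blocks again -- what HBASE-9915 hit
      // on every reseek for encoded scanners
    }
  }
}

class EncodedScannerSketch extends AbstractScannerSketch {
  private Object block; // encoded scanners track their position here instead

  void seekTo(String key) { block = new Object(); }

  // The fix: report seeked state from the field this scanner actually uses,
  // consistent with assertValidSeek(), which also validates 'block'.
  @Override
  public boolean isSeeked() { return block != null; }
}
{code}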
[jira] [Comment Edited] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816463#comment-13816463 ] Lars Hofhansl edited comment on HBASE-9915 at 11/7/13 9:58 PM: --- Checked again. My first patch should be safer. block == null should be equivalent to seeker == null, but at other places we check block != null, so it's better to do this here as well. was (Author: lhofhansl): Checked again. My first patch should be safer. block == null should be equivalent to seeker == null, but at other places we change block != null, so it's better to do this here as well. Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk-v2.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-9915: - Status: Patch Available (was: Open) Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk-v2.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9890) MR jobs are not working if started by a delegated user
[ https://issues.apache.org/jira/browse/HBASE-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816506#comment-13816506 ] Gary Helmling commented on HBASE-9890: -- +1 on v4 patch. Let's run it through HadoopQA one more time. MR jobs are not working if started by a delegated user -- Key: HBASE-9890 URL: https://issues.apache.org/jira/browse/HBASE-9890 Project: HBase Issue Type: Bug Components: mapreduce, security Affects Versions: 0.98.0, 0.94.12, 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 0.98.0, 0.94.13, 0.96.1 Attachments: HBASE-9890-94-v0.patch, HBASE-9890-94-v1.patch, HBASE-9890-v0.patch, HBASE-9890-v1.patch, HBASE-9890-v2.patch, HBASE-9890-v3.patch, HBASE-9890-v4.patch If Map-Reduce jobs are started by a proxy user that already has the delegation tokens, we get an exception when obtaining a token, since the proxy user doesn't have the kerberos auth. For example: * If we use oozie to execute RowCounter - oozie will get the tokens required (HBASE_AUTH_TOKEN) and it will start the RowCounter. Once the RowCounter tries to obtain the token, it will get an exception. * If we use oozie to execute LoadIncrementalHFiles - oozie will get the tokens required (HDFS_DELEGATION_TOKEN) and it will start the LoadIncrementalHFiles. Once the LoadIncrementalHFiles tries to obtain the token, it will get an exception. {code} org.apache.hadoop.hbase.security.AccessDeniedException: Token generation only allowed for Kerberos authenticated clients at org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87) {code} {code} org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token can be issued only with kerberos or web authentication at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:783) at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:868) at org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:509) at org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:487) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:130) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:111) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:85) at org.apache.hadoop.filecache.TrackerDistributedCacheManager.getDelegationTokens(TrackerDistributedCacheManager.java:949) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:854) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:743) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945) at org.apache.hadoop.mapreduce.Job.submit(Job.java:566) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596) at org.apache.hadoop.hbase.mapreduce.RowCounter.main(RowCounter.java:173) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9890) MR jobs are not working if started by a delegated user
[ https://issues.apache.org/jira/browse/HBASE-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling updated HBASE-9890: - Status: Patch Available (was: Open) MR jobs are not working if started by a delegated user -- Key: HBASE-9890 URL: https://issues.apache.org/jira/browse/HBASE-9890 Project: HBase Issue Type: Bug Components: mapreduce, security Affects Versions: 0.96.0, 0.94.12, 0.98.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 0.98.0, 0.94.13, 0.96.1 Attachments: HBASE-9890-94-v0.patch, HBASE-9890-94-v1.patch, HBASE-9890-v0.patch, HBASE-9890-v1.patch, HBASE-9890-v2.patch, HBASE-9890-v3.patch, HBASE-9890-v4.patch If Map-Reduce jobs are started by a proxy user that already has the delegation tokens, we get an exception when obtaining a token, since the proxy user doesn't have the kerberos auth. For example: * If we use oozie to execute RowCounter - oozie will get the tokens required (HBASE_AUTH_TOKEN) and it will start the RowCounter. Once the RowCounter tries to obtain the token, it will get an exception. * If we use oozie to execute LoadIncrementalHFiles - oozie will get the tokens required (HDFS_DELEGATION_TOKEN) and it will start the LoadIncrementalHFiles. Once the LoadIncrementalHFiles tries to obtain the token, it will get an exception. {code} org.apache.hadoop.hbase.security.AccessDeniedException: Token generation only allowed for Kerberos authenticated clients at org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87) {code} {code} org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token can be issued only with kerberos or web authentication at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:783) at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:868) at org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:509) at org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:487) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:130) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:111) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:85) at org.apache.hadoop.filecache.TrackerDistributedCacheManager.getDelegationTokens(TrackerDistributedCacheManager.java:949) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:854) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:743) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945) at org.apache.hadoop.mapreduce.Job.submit(Job.java:566) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596) at org.apache.hadoop.hbase.mapreduce.RowCounter.main(RowCounter.java:173) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9890) MR jobs are not working if started by a delegated user
[ https://issues.apache.org/jira/browse/HBASE-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling updated HBASE-9890: - Status: Open (was: Patch Available) MR jobs are not working if started by a delegated user -- Key: HBASE-9890 URL: https://issues.apache.org/jira/browse/HBASE-9890 Project: HBase Issue Type: Bug Components: mapreduce, security Affects Versions: 0.96.0, 0.94.12, 0.98.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 0.98.0, 0.94.13, 0.96.1 Attachments: HBASE-9890-94-v0.patch, HBASE-9890-94-v1.patch, HBASE-9890-v0.patch, HBASE-9890-v1.patch, HBASE-9890-v2.patch, HBASE-9890-v3.patch, HBASE-9890-v4.patch If Map-Reduce jobs are started by a proxy user that already has the delegation tokens, we get an exception when obtaining a token, since the proxy user doesn't have the kerberos auth. For example: * If we use oozie to execute RowCounter - oozie will get the tokens required (HBASE_AUTH_TOKEN) and it will start the RowCounter. Once the RowCounter tries to obtain the token, it will get an exception. * If we use oozie to execute LoadIncrementalHFiles - oozie will get the tokens required (HDFS_DELEGATION_TOKEN) and it will start the LoadIncrementalHFiles. Once the LoadIncrementalHFiles tries to obtain the token, it will get an exception. {code} org.apache.hadoop.hbase.security.AccessDeniedException: Token generation only allowed for Kerberos authenticated clients at org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87) {code} {code} org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token can be issued only with kerberos or web authentication at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:783) at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:868) at org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:509) at org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:487) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:130) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:111) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:85) at org.apache.hadoop.filecache.TrackerDistributedCacheManager.getDelegationTokens(TrackerDistributedCacheManager.java:949) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:854) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:743) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945) at org.apache.hadoop.mapreduce.Job.submit(Job.java:566) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596) at org.apache.hadoop.hbase.mapreduce.RowCounter.main(RowCounter.java:173) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9904) Solve skipping data in HTable scans
[ https://issues.apache.org/jira/browse/HBASE-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-9904: - Priority: Major (was: Critical) Fix Version/s: (was: 0.98.0) Thanks for taking a look [~jxiang] Solve skipping data in HTable scans --- Key: HBASE-9904 URL: https://issues.apache.org/jira/browse/HBASE-9904 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.89-fb Reporter: Manukranth Kolloju Fix For: 0.89-fb Attachments: TestScanRetries.java, scan.diff The HTable client cannot retry a scan operation in the getRegionServerWithRetries code path. This will result in the client missing data. This can be worked around by setting hbase.client.retries.number to 1. The whole problem is that Callable knows nothing about retries, and the protocol it dances to doesn't support retries either. This fix will keep the Callable protocol (an ugly thing worth merciless refactoring) intact but will change ScannerCallable to anticipate retries. What we want is to make failed operations identities for the outside world: N1 , N2 , F3 , N3 , F4 , F4 , N4 ... = N1 , N2 , N3 , N4 ... where Nk are successful operations and Fk are failed operations. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9808) org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation
[ https://issues.apache.org/jira/browse/HBASE-9808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816773#comment-13816773 ] Nick Dimiduk commented on HBASE-9808: - I just use whitespace mode in emacs ;) org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation Key: HBASE-9808 URL: https://issues.apache.org/jira/browse/HBASE-9808 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Gustavo Anatoly Attachments: HBASE-9808-v1.patch, HBASE-9808-v2.patch, HBASE-9808-v3.patch, HBASE-9808.patch Here is a list of JIRAs whose fixes might have gone into rest.PerformanceEvaluation: {code} r1527817 | mbertozzi | 2013-09-30 15:57:44 -0700 (Mon, 30 Sep 2013) | 1 line HBASE-9663 PerformanceEvaluation does not properly honor specified table name parameter r1526452 | mbertozzi | 2013-09-26 04:58:50 -0700 (Thu, 26 Sep 2013) | 1 line HBASE-9662 PerformanceEvaluation input do not handle tags properties r1525269 | ramkrishna | 2013-09-21 11:01:32 -0700 (Sat, 21 Sep 2013) | 3 lines HBASE-8496 - Implement tags and the internals of how a tag should look like (Ram) r1524985 | nkeywal | 2013-09-20 06:02:54 -0700 (Fri, 20 Sep 2013) | 1 line HBASE-9558 PerformanceEvaluation is in hbase-server, and creates a dependency to MiniDFSCluster r1523782 | nkeywal | 2013-09-16 13:07:13 -0700 (Mon, 16 Sep 2013) | 1 line HBASE-9521 clean clearBufferOnFail behavior and deprecate it r1518341 | jdcryans | 2013-08-28 12:46:55 -0700 (Wed, 28 Aug 2013) | 2 lines HBASE-9330 Refactor PE to create HTable the correct way {code} Long term, we may consider consolidating the two PerformanceEvaluation classes so that such maintenance work can be reduced. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-9916) Fix javadoc warning in StoreFileManager.java
Ted Yu created HBASE-9916: - Summary: Fix javadoc warning in StoreFileManager.java Key: HBASE-9916 URL: https://issues.apache.org/jira/browse/HBASE-9916 Project: HBase Issue Type: Task Reporter: Ted Yu Priority: Minor From https://builds.apache.org/job/PreCommit-HBASE-Build/7779/artifact/trunk/patchprocess/patchJavadocWarnings.txt : {code} [WARNING] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileManager.java:53: warning - @param argument sf is not a parameter name. {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
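For reference, the class of warning being fixed looks like this (a hypothetical method; the real offender is at line 53 of StoreFileManager.java):
{code}
interface JavadocParamExample {

  // Warning case: javadoc's @param must name an actual parameter, but
  // this method's parameter is called "storeFile", not "sf".
  /** @param sf the file to add */
  void addBad(Object storeFile);

  // Fixed: the @param name matches the declared parameter.
  /** @param storeFile the file to add */
  void addGood(Object storeFile);
}
{code}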
[jira] [Commented] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816788#comment-13816788 ] Enis Soztutar commented on HBASE-9906: -- bq. Can the 20 ms sleep start counting from the call to MetaEditor.deleteRegions() ? I thought about doing this inside RestoreSnapshotHelper, but encapsulating the whole thing in MetaEditor.overwriteRegions() seems cleaner, and there might be other users for overwriting region data in meta. bq. Would 17ms sleep be good enough ? Let's keep some buffer. Thanks for the reviews. Restore snapshot fails to restore the meta edits sporadically --- Key: HBASE-9906 URL: https://issues.apache.org/jira/browse/HBASE-9906 Project: HBase Issue Type: New Feature Components: snapshots Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: hbase-9906-0.94_v1.patch, hbase-9906_v1.patch After snapshot restore, we see failures to find the table in meta: {code} disable 'tablefour' restore_snapshot 'snapshot_tablefour' enable 'tablefour' ERROR: Table tablefour does not exist.' {code} This is quite subtle. From the looks of it, we successfully restore the snapshot, do the meta updates, and return the status to the client. The client then tries to do an operation on the table (like enable table, or scan in the test outputs), which fails because the meta entry for the region seems to be gone (in the case of a single region, the table will be reported missing). Subsequent attempts to create the table will also fail because the table directories will be there, but not the meta entries. For restoring meta entries, we are doing a delete and then a put to the same region: {code} 2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: 76d0e2b7ec3291afcaa82e18a56ccc30 2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: fa41edf43fe3ee131db4a34b848ff432 ... 2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted [{ENCODED = fa41edf43fe3ee131db4a34b848ff432, NAME = 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY = '', ENDKEY = ''}, {ENCODED = 76d0e2b7ec3291afcaa82e18a56ccc30, NAME = 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE 2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added 1 {code} The root cause of this sporadic failure is that the delete and the subsequent put will have the same timestamp if they execute in the same ms. The delete will override the put at the same ts, even though the put came later. See: HBASE-9905, HBASE-8770 Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
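The tie-breaking rule at the heart of this can be shown with the client API directly (a sketch with a hypothetical table and an explicit timestamp; Delete.deleteColumns and Put.add are the 0.94-era methods): a Put at the same timestamp as an earlier Delete stays masked until the delete marker is compacted away.
{code}
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class DeleteMasksPutSketch {
  public static void demo(HTable table) throws Exception {
    byte[] row = Bytes.toBytes("row1");
    byte[] cf  = Bytes.toBytes("info");
    byte[] q   = Bytes.toBytes("regioninfo");
    long ts = 1000L; // stands in for "delete and put land in the same ms"

    Delete d = new Delete(row);
    d.deleteColumns(cf, q, ts);            // delete marker covering ts and older
    table.delete(d);

    Put p = new Put(row);
    p.add(cf, q, ts, Bytes.toBytes("x"));  // put at the SAME timestamp
    table.put(p);

    // Deletes win ties: the later put is invisible, which is exactly how
    // the restored region's meta row "disappears".
    System.out.println(table.get(new Get(row)).isEmpty()); // true
  }
}
{code}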
[jira] [Commented] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816809#comment-13816809 ] Andrew Purtell commented on HBASE-9915: --- Wow Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk-v2.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9913) weblogic deployment project implementation under the mapreduce hbase reported a NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-9913: Labels: (was: patch) weblogic deployment project implementation under the mapreduce hbase reported a NullPointerException Key: HBASE-9913 URL: https://issues.apache.org/jira/browse/HBASE-9913 Project: HBase Issue Type: Bug Components: hadoop2, mapreduce Affects Versions: 0.94.10 Environment: weblogic windows Reporter: 刘泓 Attachments: TableMapReduceUtil.class, TableMapReduceUtil.java java.lang.NullPointerException at java.io.File.<init>(File.java:222) at java.util.zip.ZipFile.<init>(ZipFile.java:75) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.updateMap(TableMapReduceUtil.java:617) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.findOrCreateJar(TableMapReduceUtil.java:597) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:557) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:518) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:144) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:221) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:87) at com.easymap.ezserver6.map.source.hbase.convert.HBaseMapMerge.beginMerge(HBaseMapMerge.java:163) at com.easymap.ezserver6.app.servlet.EzMapToHbaseService.doPost(EzMapToHbaseService.java:32) at javax.servlet.http.HttpServlet.service(HttpServlet.java:727) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227) at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125) at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:292) at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:175) at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3594) at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321) at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:121) at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2202) at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2108) at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1432) at weblogic.work.ExecuteThread.execute(ExecuteThread.java:201) at weblogic.work.ExecuteThread.run(ExecuteThread.java:173) By inspecting the HBase source code under WebLogic, we found that the string returned by TableMapReduceUtil.findOrCreateJar is null. Under Tomcat, the jar file's URL protocol is "jar", but under WebLogic it is "zip", and the findOrCreateJar method can't resolve the "zip" type, so we should add a check for the "zip" type as well. -- This message was sent by Atlassian JIRA (v6.1#6144)
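A sketch of the protocol check the reporter is proposing (simplified; the real findOrCreateJar also packages a jar when the class was loaded from a directory, which is omitted here):
{code}
import java.net.URL;

public class FindJarSketch {
  // Resolve the archive a class was loaded from. Tomcat exposes "jar:" URLs,
  // while WebLogic exposes "zip:" URLs for the same archive -- which is why a
  // jar-only check returns null and later triggers the NullPointerException.
  static String jarPathFor(Class<?> klass) {
    String resource = klass.getName().replace('.', '/') + ".class";
    URL url = klass.getClassLoader().getResource(resource);
    if (url == null) return null;
    String protocol = url.getProtocol();
    if ("jar".equals(protocol) || "zip".equals(protocol)) { // accept both
      String path = url.getPath();
      int bang = path.indexOf('!');
      return bang >= 0 ? path.substring(0, bang) : path;
    }
    return null; // e.g. "file:" when running from unpacked classes
  }

  public static void main(String[] args) {
    // Prints the containing archive when run from a jar/zip, null otherwise.
    System.out.println(jarPathFor(FindJarSketch.class));
  }
}
{code}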
[jira] [Updated] (HBASE-9913) weblogic deployment project implementation under the mapreduce hbase reported a NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-9913: Fix Version/s: (was: 0.94.10) weblogic deployment project implementation under the mapreduce hbase reported a NullPointerException Key: HBASE-9913 URL: https://issues.apache.org/jira/browse/HBASE-9913 Project: HBase Issue Type: Bug Components: hadoop2, mapreduce Affects Versions: 0.94.10 Environment: weblogic windows Reporter: 刘泓 Attachments: TableMapReduceUtil.class, TableMapReduceUtil.java java.lang.NullPointerException at java.io.File.<init>(File.java:222) at java.util.zip.ZipFile.<init>(ZipFile.java:75) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.updateMap(TableMapReduceUtil.java:617) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.findOrCreateJar(TableMapReduceUtil.java:597) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:557) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:518) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:144) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:221) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:87) at com.easymap.ezserver6.map.source.hbase.convert.HBaseMapMerge.beginMerge(HBaseMapMerge.java:163) at com.easymap.ezserver6.app.servlet.EzMapToHbaseService.doPost(EzMapToHbaseService.java:32) at javax.servlet.http.HttpServlet.service(HttpServlet.java:727) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227) at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125) at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:292) at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:175) at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3594) at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321) at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:121) at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2202) at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2108) at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1432) at weblogic.work.ExecuteThread.execute(ExecuteThread.java:201) at weblogic.work.ExecuteThread.run(ExecuteThread.java:173) By inspecting the HBase source code under WebLogic, we found that the string returned by TableMapReduceUtil.findOrCreateJar is null. Under Tomcat, the jar file's URL protocol is "jar", but under WebLogic it is "zip", and the findOrCreateJar method can't resolve the "zip" type, so we should add a check for the "zip" type as well. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816811#comment-13816811 ] Andrew Purtell commented on HBASE-9915: --- +1 Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk-v2.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
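A minimal sketch of the kind of fix the description implies. The field name below is assumed for illustration and is not taken from the attached patches; the idea is only that EncodedScannerV2 must report its seek state from state it actually maintains, rather than from the blockBuffer it never sets:
{code}
// In EncodedScannerV2 (sketch): derive "am I positioned in a block?" from the
// encoded scanner's own state, so AbstractScannerV2.reseekTo() can take the
// cheap in-block path instead of consulting the block index on every reseek.
@Override
public boolean isSeeked() {
  return this.curBlock != null; // field name assumed for illustration
}
{code}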
[jira] [Reopened] (HBASE-9913) weblogic deployment project implementation under the mapreduce hbase reported a NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk reopened HBASE-9913: - Reopening ticket. Nothing has been committed so nothing has been fixed. weblogic deployment project implementation under the mapreduce hbase reported a NullPointerException Key: HBASE-9913 URL: https://issues.apache.org/jira/browse/HBASE-9913 Project: HBase Issue Type: Bug Components: hadoop2, mapreduce Affects Versions: 0.94.10 Environment: weblogic windows Reporter: 刘泓 Attachments: TableMapReduceUtil.class, TableMapReduceUtil.java java.lang.NullPointerException at java.io.File.<init>(File.java:222) at java.util.zip.ZipFile.<init>(ZipFile.java:75) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.updateMap(TableMapReduceUtil.java:617) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.findOrCreateJar(TableMapReduceUtil.java:597) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:557) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:518) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:144) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:221) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:87) at com.easymap.ezserver6.map.source.hbase.convert.HBaseMapMerge.beginMerge(HBaseMapMerge.java:163) at com.easymap.ezserver6.app.servlet.EzMapToHbaseService.doPost(EzMapToHbaseService.java:32) at javax.servlet.http.HttpServlet.service(HttpServlet.java:727) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227) at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125) at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:292) at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:175) at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3594) at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321) at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:121) at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2202) at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2108) at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1432) at weblogic.work.ExecuteThread.execute(ExecuteThread.java:201) at weblogic.work.ExecuteThread.run(ExecuteThread.java:173) By inspecting the HBase source code while running under WebLogic, we found that the string returned by TableMapReduceUtil.findOrCreateJar is null: under Tomcat a jar file's URL protocol is "jar", but under WebLogic it is "zip", and the findOrCreateJar method cannot resolve the "zip" type, so a check for the zip protocol should be added. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9913) weblogic deployment project implementation under the mapreduce hbase reported a NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-9913: Component/s: mapreduce weblogic deployment project implementation under the mapreduce hbase reported a NullPointerException Key: HBASE-9913 URL: https://issues.apache.org/jira/browse/HBASE-9913 Project: HBase Issue Type: Bug Components: hadoop2, mapreduce Affects Versions: 0.94.10 Environment: weblogic windows Reporter: 刘泓 Attachments: TableMapReduceUtil.class, TableMapReduceUtil.java java.lang.NullPointerException at java.io.File.<init>(File.java:222) at java.util.zip.ZipFile.<init>(ZipFile.java:75) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.updateMap(TableMapReduceUtil.java:617) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.findOrCreateJar(TableMapReduceUtil.java:597) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:557) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:518) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:144) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:221) at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:87) at com.easymap.ezserver6.map.source.hbase.convert.HBaseMapMerge.beginMerge(HBaseMapMerge.java:163) at com.easymap.ezserver6.app.servlet.EzMapToHbaseService.doPost(EzMapToHbaseService.java:32) at javax.servlet.http.HttpServlet.service(HttpServlet.java:727) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227) at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125) at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:292) at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:175) at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3594) at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321) at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:121) at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2202) at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2108) at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1432) at weblogic.work.ExecuteThread.execute(ExecuteThread.java:201) at weblogic.work.ExecuteThread.run(ExecuteThread.java:173) By inspecting the HBase source code while running under WebLogic, we found that the string returned by TableMapReduceUtil.findOrCreateJar is null: under Tomcat a jar file's URL protocol is "jar", but under WebLogic it is "zip", and the findOrCreateJar method cannot resolve the "zip" type, so a check for the zip protocol should be added. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9047) Tool to handle finishing replication when the cluster is offline
[ https://issues.apache.org/jira/browse/HBASE-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816817#comment-13816817 ] Demai Ni commented on HBASE-9047: - [~jdcryans], could you please review the new patch again? Thanks. The patch looks clean this time. The javadoc warning is unrelated (logged in HBASE-9916). I am still not sure about the 'mvn site goal to fail' though. Demai Tool to handle finishing replication when the cluster is offline Key: HBASE-9047 URL: https://issues.apache.org/jira/browse/HBASE-9047 Project: HBase Issue Type: New Feature Affects Versions: 0.96.0 Reporter: Jean-Daniel Cryans Assignee: Demai Ni Fix For: 0.98.0 Attachments: HBASE-9047-0.94.9-v0.PATCH, HBASE-9047-trunk-v0.patch, HBASE-9047-trunk-v1.patch, HBASE-9047-trunk-v2.patch, HBASE-9047-trunk-v3.patch, HBASE-9047-trunk-v4.patch, HBASE-9047-trunk-v4.patch We're having a discussion on the mailing list about replicating the data on a cluster that was shut down in an offline fashion. The motivation could be that you don't want to bring HBase back up but still need that data on the slave. So I have this idea of a tool that would be running on the master cluster while it is down, although it could also run at any time. Basically it would be able to read the replication state of each master region server, finish replicating what's missing to all the slaves, and then clear that state in zookeeper. The code that handles replication does most of that already, see ReplicationSourceManager and ReplicationSource. Basically when ReplicationSourceManager.init() is called, it will check all the queues in ZK and try to grab those that aren't attached to a region server. If the whole cluster is down, it will grab all of them. The beautiful thing here is that you could start that tool on all your machines and the load will be spread out, but that might not be a big concern if replication wasn't lagging since it would take a few seconds to finish replicating the missing data for each region server. I'm guessing when starting ReplicationSourceManager you'd give it a fake region server ID, and you'd tell it not to start its own source. FWIW the main difference in how replication is handled between Apache's HBase and Facebook's is that the latter is always done separately from HBase itself. This jira isn't about doing that. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9890) MR jobs are not working if started by a delegated user
[ https://issues.apache.org/jira/browse/HBASE-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-9890: - Fix Version/s: (was: 0.94.13) 0.94.14 MR jobs are not working if started by a delegated user -- Key: HBASE-9890 URL: https://issues.apache.org/jira/browse/HBASE-9890 Project: HBase Issue Type: Bug Components: mapreduce, security Affects Versions: 0.98.0, 0.94.12, 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: HBASE-9890-94-v0.patch, HBASE-9890-94-v1.patch, HBASE-9890-v0.patch, HBASE-9890-v1.patch, HBASE-9890-v2.patch, HBASE-9890-v3.patch, HBASE-9890-v4.patch If Map-Reduce jobs are started by a proxy user that already has the delegation tokens, we get an exception on obtaining the token since the proxy user doesn't have the kerberos auth. For example: * If we use oozie to execute RowCounter - oozie will get the tokens required (HBASE_AUTH_TOKEN) and it will start the RowCounter. Once the RowCounter tries to obtain the token, it will get an exception. * If we use oozie to execute LoadIncrementalHFiles - oozie will get the tokens required (HDFS_DELEGATION_TOKEN) and it will start the LoadIncrementalHFiles. Once the LoadIncrementalHFiles tries to obtain the token, it will get an exception. {code} org.apache.hadoop.hbase.security.AccessDeniedException: Token generation only allowed for Kerberos authenticated clients at org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87) {code} {code} org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token can be issued only with kerberos or web authentication at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:783) at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:868) at org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:509) at org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:487) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:130) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:111) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:85) at org.apache.hadoop.filecache.TrackerDistributedCacheManager.getDelegationTokens(TrackerDistributedCacheManager.java:949) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:854) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:743) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945) at org.apache.hadoop.mapreduce.Job.submit(Job.java:566) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596) at org.apache.hadoop.hbase.mapreduce.RowCounter.main(RowCounter.java:173) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
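The usual shape of this kind of fix can be sketched as follows. The helper names below are hypothetical, not HBase API, and this is not the attached patch; the point is only that a token the submitting user already carries should be honored before attempting a fresh kerberos-only token request:
{code}
// Hypothetical sketch: lookupExistingToken() and obtainTokenViaKerberos()
// are illustrative names. Honor a delegation token already present in the
// user's credentials; only fall back to requesting a fresh one - which
// requires kerberos auth and would fail for a proxy user - when absent.
Token<? extends TokenIdentifier> token = lookupExistingToken(user, clusterId);
if (token == null) {
  token = obtainTokenViaKerberos(conf, user);
}
job.getCredentials().addToken(token.getService(), token);
{code}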
[jira] [Commented] (HBASE-9775) Client write path perf issues
[ https://issues.apache.org/jira/browse/HBASE-9775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816831#comment-13816831 ] stack commented on HBASE-9775: -- Playing w/ YCSB, I see that as is, we are pretty well-behaved now. A single client will grow its threads to just under two per server and would hold there roughly. Sometimes it will expand a little beyond this but these are threads that are just waiting to be dropped by the pool. On my small cluster of 5 nodes, upping the clients to 8 on a 16 core CPU, I was doing about 600% burn and the throughput was at just over twice the single thread. To be continued. I made this change in YCSB to see if more connections would get me more throughput:
{code}
@@ -117,7 +122,8 @@ public class HBaseClient extends com.yahoo.ycsb.DB
   public void getHTable(String table) throws IOException {
     synchronized (tableLock) {
-      _hTable = new HTable(config, table);
+      _hConnection = HConnectionManager.createConnection(config);
+      _hTable = _hConnection.getTable(table);
       //2 suggestions from http://ryantwopointoh.blogspot.com/2009/01/performance-of-hbase-importing.html
       _hTable.setAutoFlush(false);
       _hTable.setWriteBufferSize(1024*1024*12);
{code}
What I saw was that my client now had 2k+ threads running in it. All but a few were just idle waiting, doing nothing. The burn was up, around 800%. Didn't bother checking throughput. So for Elliott's test above, he probably had a sane number of threads in his test. But if folks follow our new recipe where they do createConnection().getTable(TableName) then they will have clients w/ at least 256 threads just sitting there hanging out. Let me fix that one. Client write path perf issues - Key: HBASE-9775 URL: https://issues.apache.org/jira/browse/HBASE-9775 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.96.0 Reporter: Elliott Clark Priority: Critical Attachments: 9775.rig.txt, 9775.rig.v2.patch, 9775.rig.v3.patch, Charts Search Cloudera Manager - ITBLL.png, Charts Search Cloudera Manager.png, hbase-9775.patch, job_run.log, short_ycsb.png, ycsb.png, ycsb_insert_94_vs_96.png Testing on larger clusters has not shown the desired throughput increases. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-9917) Fix it so Default Connection Pool does not spin up max threads even when not needed
stack created HBASE-9917: Summary: Fix it so Default Connection Pool does not spin up max threads even when not needed Key: HBASE-9917 URL: https://issues.apache.org/jira/browse/HBASE-9917 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: stack Fix For: 0.98.0, 0.96.1 Testing, I noticed that if we use the HConnection executor service as opposed to the executor service that is created when you create an HTable without passing in a connection: i.e. HConnectionManager.createConnection(config).getTable(tableName) vs HTable(config, tableName) ... then we will spin up the max 256 threads and they will just hang out though not being used. We are encouraging HConnection#getTable over new HTable so worth fixing. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9917) Fix it so Default Connection Pool does not spin up max threads even when not needed
[ https://issues.apache.org/jira/browse/HBASE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-9917: - Status: Patch Available (was: Open) Fix it so Default Connection Pool does not spin up max threads even when not needed --- Key: HBASE-9917 URL: https://issues.apache.org/jira/browse/HBASE-9917 Project: HBase Issue Type: Sub-task Components: Client Reporter: stack Assignee: stack Fix For: 0.98.0, 0.96.1 Attachments: pool.txt Testing, I noticed that if we use the HConnection executor service as opposed to the executor service that is created when you create an HTable without passing in a connection: i.e. HConnectionManager.createConnection(config).getTable(tableName) vs HTable(config, tableName) ... then we will spin up the max 256 threads and they will just hang out though not being used. We are encouraging HConnection#getTable over new HTable so worth fixing. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9917) Fix it so Default Connection Pool does not spin up max threads even when not needed
[ https://issues.apache.org/jira/browse/HBASE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-9917: - Attachment: pool.txt Small patch. Adds a core thread count and a smaller keep-alive. Also gives the pool threads a (shorter) name tied to the HConnection instance. Finally, queue size is tied to max thread size. Fix it so Default Connection Pool does not spin up max threads even when not needed --- Key: HBASE-9917 URL: https://issues.apache.org/jira/browse/HBASE-9917 Project: HBase Issue Type: Sub-task Components: Client Reporter: stack Assignee: stack Fix For: 0.98.0, 0.96.1 Attachments: pool.txt Testing, I noticed that if we use the HConnection executor service as opposed to the executor service that is created when you create an HTable without passing in a connection: i.e. HConnectionManager.createConnection(config).getTable(tableName) vs HTable(config, tableName) ... then we will spin up the max 256 threads and they will just hang out though not being used. We are encouraging HConnection#getTable over new HTable so worth fixing. -- This message was sent by Atlassian JIRA (v6.1#6144)
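The patch summary above maps onto a standard ThreadPoolExecutor configuration. A minimal sketch follows; the parameter values and property defaults are assumptions for illustration, not the committed defaults from pool.txt:
{code}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ConnectionPoolSketch {
  public static ThreadPoolExecutor createPool(final String connectionName) {
    int maxThreads = 256;   // assumed ceiling, not the committed default
    int coreThreads = 4;    // assumed small core count
    long keepAliveSec = 10; // assumed short keep-alive
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        coreThreads, maxThreads, keepAliveSec, TimeUnit.SECONDS,
        // queue capacity tied to the max thread count, per the description
        new LinkedBlockingQueue<Runnable>(maxThreads),
        new ThreadFactory() {
          private int count = 0;
          public synchronized Thread newThread(Runnable r) {
            // shorter thread name tied to the HConnection instance
            Thread t = new Thread(r, connectionName + "-pool-" + (count++));
            t.setDaemon(true);
            return t;
          }
        });
    // let idle threads (even core ones) exit instead of hanging around
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }
}
{code}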
[jira] [Created] (HBASE-9918) MasterAddressTracker ZKNamespaceManager ZK listeners are missed after master recovery
Jeffrey Zhong created HBASE-9918: Summary: MasterAddressTracker ZKNamespaceManager ZK listeners are missed after master recovery Key: HBASE-9918 URL: https://issues.apache.org/jira/browse/HBASE-9918 Project: HBase Issue Type: Bug Reporter: Jeffrey Zhong TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry always failed at the following verification for me in my dev env (you have to run the single test, not the whole TestZooKeeper suite, to reproduce): {code} assertEquals("Number of rows should be equal to number of puts.", numberOfPuts, numberOfRows); {code} We missed two ZK listeners after master recovery: MasterAddressTracker and ZKNamespaceManager. My current patch fixes the JIRA issue, while I'm wondering if we should remove the master failover implementation for ZK session expiry entirely, because it reinitializes HMaster only partially, which is error prone and not a clean state to start from. -- This message was sent by Atlassian JIRA (v6.1#6144)
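A conceptual sketch of the fix described, with the wiring assumed rather than taken from the attached patch: after a ZK session expiry the master rebuilds its ZooKeeperWatcher, so listeners that were registered only in the startup path have to be re-registered in the recovery path as well.
{code}
// Conceptual sketch only: re-attach the two missing listeners against the
// rebuilt watcher during recovery, mirroring what the startup path does.
this.masterAddressTracker = new MasterAddressTracker(this.zooKeeper, this);
this.masterAddressTracker.start();  // re-registers the ZK listener
this.tableNamespaceManager.start(); // re-creates/starts ZKNamespaceManager
{code}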
[jira] [Created] (HBASE-9919) Fix resource leak in MultiTableInputFormatBase
Ted Yu created HBASE-9919: - Summary: Fix resource leak in MultiTableInputFormatBase Key: HBASE-9919 URL: https://issues.apache.org/jira/browse/HBASE-9919 Project: HBase Issue Type: Bug Reporter: Ted Yu In MultiTableInputFormatBase#createRecordReader(), table is not closed. {code} HTable table = new HTable(context.getConfiguration(), tSplit.getTableName()); {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
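One plausible shape of the fix, sketched under the assumption that the record reader takes ownership of the table; this is not necessarily what the eventual patch does:
{code}
// Sketch: close the HTable if record reader setup fails; otherwise hand it
// to a reader whose own close() is responsible for closing the table.
HTable table = new HTable(context.getConfiguration(), tSplit.getTableName());
boolean handedOff = false;
try {
  TableRecordReader trr = new TableRecordReader();
  Scan sc = tSplit.getScan();
  sc.setStartRow(tSplit.getStartRow());
  sc.setStopRow(tSplit.getEndRow());
  trr.setScan(sc);
  trr.setHTable(table); // reader now owns the table
  handedOff = true;
  return trr;
} finally {
  if (!handedOff) {
    table.close(); // don't leak the table (and its resources) on failure
  }
}
{code}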
[jira] [Commented] (HBASE-9890) MR jobs are not working if started by a delegated user
[ https://issues.apache.org/jira/browse/HBASE-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816844#comment-13816844 ] Hadoop QA commented on HBASE-9890: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612715/HBASE-9890-v4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7783//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7783//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7783//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7783//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7783//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7783//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7783//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7783//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7783//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7783//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7783//console This message is automatically generated. 
MR jobs are not working if started by a delegated user -- Key: HBASE-9890 URL: https://issues.apache.org/jira/browse/HBASE-9890 Project: HBase Issue Type: Bug Components: mapreduce, security Affects Versions: 0.98.0, 0.94.12, 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: HBASE-9890-94-v0.patch, HBASE-9890-94-v1.patch, HBASE-9890-v0.patch, HBASE-9890-v1.patch, HBASE-9890-v2.patch, HBASE-9890-v3.patch, HBASE-9890-v4.patch If Map-Reduce jobs are started by a proxy user that already has the delegation tokens, we get an exception on obtaining the token since the proxy user doesn't have the kerberos auth. For example: * If we use oozie to execute RowCounter - oozie will get the tokens required (HBASE_AUTH_TOKEN) and it will start the RowCounter. Once the RowCounter tries to obtain the token, it will get an exception. * If we use oozie to execute LoadIncrementalHFiles - oozie will get the tokens required (HDFS_DELEGATION_TOKEN) and it will start the LoadIncrementalHFiles. Once the LoadIncrementalHFiles tries to obtain the token, it will get an exception. {code} org.apache.hadoop.hbase.security.AccessDeniedException: Token generation only allowed for Kerberos authenticated clients at org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87) {code} {code} org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token can be issued only with kerberos or web authentication at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:783) at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:868) at org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:509) at org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:487) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:130) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:111) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:85) at org.apache.hadoop.filecache.TrackerDistributedCacheManager.getDelegationTokens(TrackerDistributedCacheManager.java:949) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:854) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:743) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945) at org.apache.hadoop.mapreduce.Job.submit(Job.java:566) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596) at org.apache.hadoop.hbase.mapreduce.RowCounter.main(RowCounter.java:173) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9900) Fix unintended byte[].toString in AccessController
[ https://issues.apache.org/jira/browse/HBASE-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-9900: -- Resolution: Fixed Fix Version/s: 0.96.1 0.98.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) It is keyed by String. Committed to trunk and 0.96 branch. Fix unintended byte[].toString in AccessController -- Key: HBASE-9900 URL: https://issues.apache.org/jira/browse/HBASE-9900 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 0.96.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.98.0, 0.96.1 Attachments: 9900.patch Found while running FindBugs for another change. -- This message was sent by Atlassian JIRA (v6.1#6144)
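For context, the bug class being fixed here is the classic one illustrated below. This is a minimal, self-contained illustration of the pattern (not the patch itself), using HBase's Bytes utility:
{code}
import org.apache.hadoop.hbase.util.Bytes;

public class ByteArrayToStringExample {
  public static void main(String[] args) {
    byte[] family = Bytes.toBytes("f1");
    // Object#toString() on a byte[] prints an identity hash like "[B@1a2b3c",
    // not the contents - the unintended conversion FindBugs flags.
    String wrong = family.toString();
    // The intended conversion decodes the bytes back into the string "f1".
    String right = Bytes.toString(family);
    System.out.println(wrong + " vs " + right);
  }
}
{code}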
[jira] [Updated] (HBASE-9890) MR jobs are not working if started by a delegated user
[ https://issues.apache.org/jira/browse/HBASE-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-9890: --- Attachment: HBASE-9890-94-v4.patch MR jobs are not working if started by a delegated user -- Key: HBASE-9890 URL: https://issues.apache.org/jira/browse/HBASE-9890 Project: HBase Issue Type: Bug Components: mapreduce, security Affects Versions: 0.98.0, 0.94.12, 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: HBASE-9890-94-v0.patch, HBASE-9890-94-v1.patch, HBASE-9890-94-v4.patch, HBASE-9890-v0.patch, HBASE-9890-v1.patch, HBASE-9890-v2.patch, HBASE-9890-v3.patch, HBASE-9890-v4.patch If Map-Reduce jobs are started by a proxy user that already has the delegation tokens, we get an exception on obtaining the token since the proxy user doesn't have the kerberos auth. For example: * If we use oozie to execute RowCounter - oozie will get the tokens required (HBASE_AUTH_TOKEN) and it will start the RowCounter. Once the RowCounter tries to obtain the token, it will get an exception. * If we use oozie to execute LoadIncrementalHFiles - oozie will get the tokens required (HDFS_DELEGATION_TOKEN) and it will start the LoadIncrementalHFiles. Once the LoadIncrementalHFiles tries to obtain the token, it will get an exception. {code} org.apache.hadoop.hbase.security.AccessDeniedException: Token generation only allowed for Kerberos authenticated clients at org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87) {code} {code} org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token can be issued only with kerberos or web authentication at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:783) at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:868) at org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:509) at org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:487) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:130) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:111) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:85) at org.apache.hadoop.filecache.TrackerDistributedCacheManager.getDelegationTokens(TrackerDistributedCacheManager.java:949) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:854) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:743) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945) at org.apache.hadoop.mapreduce.Job.submit(Job.java:566) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596) at org.apache.hadoop.hbase.mapreduce.RowCounter.main(RowCounter.java:173) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9818) NPE in HFileBlock#AbstractFSReader#readAtOffset
[ https://issues.apache.org/jira/browse/HBASE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-9818: --- Attachment: trunk-9818_v1.1.patch Attached v1.1, which fixes some unit test failures. NPE in HFileBlock#AbstractFSReader#readAtOffset --- Key: HBASE-9818 URL: https://issues.apache.org/jira/browse/HBASE-9818 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Ted Yu Attachments: 9818-trial.txt, 9818-v2.txt, 9818-v3.txt, 9818-v4.txt, 9818-v5.txt, trunk-9818.patch, trunk-9818_v1.1.patch HFileBlock#istream seems to be null. I was wondering whether we should hide FSDataInputStreamWrapper#useHBaseChecksum. By the way, this happened when online schema change is enabled (encoding) {noformat} 2013-10-22 10:58:43,321 ERROR [RpcServer.handler=28,port=36020] regionserver.HRegionServer: java.lang.NullPointerException at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1200) at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1436) at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1318) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:359) at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:503) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:553) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:245) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:166) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:361) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:336) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:293) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:258) at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:603) at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:476) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:129) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3546) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3616) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3494) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3485) at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3079) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110) at java.lang.Thread.run(Thread.java:724) 2013-10-22 10:58:43,665 ERROR [RpcServer.handler=23,port=36020] regionserver.HRegionServer: 
org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 53438 But the nextCallSeq got from client: 53437; request=scanner_id: 1252577470624375060 number_of_rows: 100 close_scanner: false next_call_seq: 53437 at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3030) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110) at java.lang.Thread.run(Thread.java:724) {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9890) MR jobs are not working if started by a delegated user
[ https://issues.apache.org/jira/browse/HBASE-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816854#comment-13816854 ] Gary Helmling commented on HBASE-9890: -- +1 on 94 v4 patch as well. MR jobs are not working if started by a delegated user -- Key: HBASE-9890 URL: https://issues.apache.org/jira/browse/HBASE-9890 Project: HBase Issue Type: Bug Components: mapreduce, security Affects Versions: 0.98.0, 0.94.12, 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: HBASE-9890-94-v0.patch, HBASE-9890-94-v1.patch, HBASE-9890-94-v4.patch, HBASE-9890-v0.patch, HBASE-9890-v1.patch, HBASE-9890-v2.patch, HBASE-9890-v3.patch, HBASE-9890-v4.patch If Map-Reduce jobs are started by a proxy user that already has the delegation tokens, we get an exception on obtaining the token since the proxy user doesn't have the kerberos auth. For example: * If we use oozie to execute RowCounter - oozie will get the tokens required (HBASE_AUTH_TOKEN) and it will start the RowCounter. Once the RowCounter tries to obtain the token, it will get an exception. * If we use oozie to execute LoadIncrementalHFiles - oozie will get the tokens required (HDFS_DELEGATION_TOKEN) and it will start the LoadIncrementalHFiles. Once the LoadIncrementalHFiles tries to obtain the token, it will get an exception. {code} org.apache.hadoop.hbase.security.AccessDeniedException: Token generation only allowed for Kerberos authenticated clients at org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87) {code} {code} org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token can be issued only with kerberos or web authentication at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:783) at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:868) at org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:509) at org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:487) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:130) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:111) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:85) at org.apache.hadoop.filecache.TrackerDistributedCacheManager.getDelegationTokens(TrackerDistributedCacheManager.java:949) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:854) at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:743) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945) at org.apache.hadoop.mapreduce.Job.submit(Job.java:566) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596) at org.apache.hadoop.hbase.mapreduce.RowCounter.main(RowCounter.java:173) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-9920) Lower OK_FINDBUGS_WARNINGS in test-patch.properties
Ted Yu created HBASE-9920: - Summary: Lower OK_FINDBUGS_WARNINGS in test-patch.properties Key: HBASE-9920 URL: https://issues.apache.org/jira/browse/HBASE-9920 Project: HBase Issue Type: Task Reporter: Ted Yu HBASE-9903 removed generated classes from findbugs checking. OK_FINDBUGS_WARNINGS in test-patch.properties should be lowered. According to https://builds.apache.org/job/PreCommit-HBASE-Build/7776/artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html , there were 3 warnings for org.apache.hadoop.hbase.generated classes and 19 warnings for org.apache.hadoop.hbase.tmpl classes. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9854) initial documentation for stripe compactions
[ https://issues.apache.org/jira/browse/HBASE-9854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816864#comment-13816864 ] Sergey Shelukhin commented on HBASE-9854: - [~stack] what do you think of the above comment? initial documentation for stripe compactions Key: HBASE-9854 URL: https://issues.apache.org/jira/browse/HBASE-9854 Project: HBase Issue Type: Sub-task Components: Compaction Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Initial documentation for stripe compactions (distill from attached docs, make up to date, put somewhere like book) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9854) initial documentation for stripe compactions
[ https://issues.apache.org/jira/browse/HBASE-9854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816863#comment-13816863 ] Sergey Shelukhin commented on HBASE-9854: - Here's a preview... wdyt? h3. Introduction Stripe compactions is an experimental feature added in HBase 0.98 which aims to improve compactions for large regions or non-uniformly distributed row keys. In order to achieve smaller and/or more granular compactions, the store files within a region are maintained separately for several row-key sub-ranges, or stripes, of the region. The division is not visible to the higher levels of the system, so externally each region functions as before. This feature is fully compatible with default compactions - it can be enabled for existing tables, and the table will continue to operate normally if it's disabled later. h3. When to use You might want to consider using this feature if you have: * large regions (in that case, you can get the positive effect of much smaller regions without additional memstore and region management overhead); or * non-uniform row keys, e.g. time dimension in a key (in that case, only the stripes receiving the new keys will keep compacting - old data will not compact as much, or at all). According to the perf testing performed, in these cases read performance can improve somewhat, and the read and write performance variability due to compactions is greatly reduced. There's an overall perf improvement on large, non-uniform row key regions (hash-prefixed timestamp key) over the long term. All of these performance gains are best realized when the table is already large. In the future, the perf improvement might also extend to region splits. h3. How to enable To use stripe compactions for a table or a column family, you should set its {{hbase.hstore.engine.class}} to {{org.apache.hadoop.hbase.regionserver.StripeStoreEngine}}. Due to the nature of compactions, you also need to set the blocking file count to a high number (100 is a good default, which is 10 times the normal default of 10). If changing the existing table, you should do it when it is disabled. Examples:
{code}
alter 'orders_table', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.StripeStoreEngine', 'hbase.hstore.blockingStoreFiles' => '100'}
alter 'orders_table', {NAME => 'blobs_cf', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.StripeStoreEngine', 'hbase.hstore.blockingStoreFiles' => '100'}}
create 'orders_table', 'blobs_cf', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.StripeStoreEngine', 'hbase.hstore.blockingStoreFiles' => '100'}
{code}
Then, you can configure the other options if needed (see below) and enable the table. To switch back to default compactions, set {{hbase.hstore.engine.class}} to nil to unset it; or set it explicitly to {{org.apache.hadoop.hbase.regionserver.DefaultStoreEngine}} (this also needs to be done on a disabled table). When you enable a large table after changing the store engine either way, a major compaction will likely be performed on most regions. This is not a problem with new tables. h3. How to configure All of the settings described below are best set on table/cf level (with the table disabled first, for the settings to apply), similar to the above, e.g.
{code}
alter 'orders_table', CONFIGURATION => {'key' => 'value', ..., 'key' => 'value'}
{code}
h4. 
Region and stripe sizing Based on your region sizing, you might want to also change your stripe sizing. By default, your new regions will start with one stripe. When the stripe is too big (16 memstore flushes in size), on the next compaction it will be split into two stripes. Stripe splitting will continue in a similar manner as the region grows, until the region itself is big enough to split (region split will work the same as with default compactions). You can improve this pattern for your data. You should generally aim at a stripe size of at least 1Gb, and about 8-12 stripes for uniform row keys - so, for example, if your regions are 30Gb, 12 x 2.5Gb stripes might be a good idea. The settings are as follows:
||Setting||Notes||
|{{hbase.store.stripe.}} {{initialStripeCount}}|Initial stripe count to create. You can use it as follows: * for relatively uniform row keys, if you know the approximate target number of stripes from the above, you can avoid some splitting overhead by starting w/several stripes (2, 5, 10...). Note that if the early data is not representative of the overall row key distribution, this will not be as efficient. * for existing tables with lots of data, you can use this to pre-split stripes. * for e.g. hash-prefixed sequential keys, with more than one hash prefix per region, you know that some pre-splitting makes sense.|
|{{hbase.store.stripe.}} {{sizeToSplit}}|Maximum stripe size before it's
[jira] [Commented] (HBASE-9885) Avoid some Result creation in protobuf conversions
[ https://issues.apache.org/jira/browse/HBASE-9885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816867#comment-13816867 ] Hudson commented on HBASE-9885: --- SUCCESS: Integrated in hbase-0.96-hadoop2 #115 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/115/]) HBASE-9885 Avoid some Result creation in protobuf conversions (nkeywal: rev 1539691) * /hbase/branches/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/branches/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java Avoid some Result creation in protobuf conversions -- Key: HBASE-9885 URL: https://issues.apache.org/jira/browse/HBASE-9885 Project: HBase Issue Type: Bug Components: Client, Protobufs, regionserver Affects Versions: 0.98.0, 0.96.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.98.0, 0.96.1 Attachments: 9885.v1.patch, 9885.v2, 9885.v2.patch, 9885.v3.patch, 9885.v3.patch, 9885.v4.patch We create a lot of Result objects that we could avoid, as they contain nothing other than a boolean value. We sometimes create a protobuf builder on this path as well; this can be avoided. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9909) TestHFilePerformance should not be a unit test, but a tool
[ https://issues.apache.org/jira/browse/HBASE-9909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816866#comment-13816866 ] Hudson commented on HBASE-9909: --- SUCCESS: Integrated in hbase-0.96-hadoop2 #115 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/115/]) HBASE-9909 TestHFilePerformance should not be a unit test, but a tool (enis: rev 1539767) * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java TestHFilePerformance should not be a unit test, but a tool -- Key: HBASE-9909 URL: https://issues.apache.org/jira/browse/HBASE-9909 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.98.0, 0.96.1 Attachments: hbase-9909_v1.patch TestHFilePerformance is a very old test which does not test anything; it is really a perf evaluation tool. It is not clear to me whether there is any utility in keeping it, but it should at least be converted to be a tool. Note that TestHFile already covers the unit test cases (writing hfile with none and gz compression). We do not need to test SequenceFile. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9792) Region states should update last assignments when a region is opened.
[ https://issues.apache.org/jira/browse/HBASE-9792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816865#comment-13816865 ] Hudson commented on HBASE-9792: --- SUCCESS: Integrated in hbase-0.96-hadoop2 #115 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/115/]) HBASE-9792 Region states should update last assignments when a region is opened (jxiang: rev 1539731) * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerOnCluster.java Region states should update last assignments when a region is opened. - Key: HBASE-9792 URL: https://issues.apache.org/jira/browse/HBASE-9792 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.98.0, 0.96.1 Attachments: trunk-9792.patch, trunk-9792_v2.patch, trunk-9792_v3.1.patch, trunk-9792_v3.patch Currently, we update a region's last assignment region server when the region is online. We should do this sooner, when the region is moved to OPEN state. CM could kill this region server before we delete the znode and online the region. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9808) org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation
[ https://issues.apache.org/jira/browse/HBASE-9808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gustavo Anatoly updated HBASE-9808: --- Attachment: (was: HBASE-9808-v3.patch) org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation Key: HBASE-9808 URL: https://issues.apache.org/jira/browse/HBASE-9808 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Gustavo Anatoly Attachments: HBASE-9808-v1.patch, HBASE-9808-v2.patch, HBASE-9808.patch Here is a list of JIRAs whose fixes might have gone into rest.PerformanceEvaluation : {code} r1527817 | mbertozzi | 2013-09-30 15:57:44 -0700 (Mon, 30 Sep 2013) | 1 line HBASE-9663 PerformanceEvaluation does not properly honor specified table name parameter r1526452 | mbertozzi | 2013-09-26 04:58:50 -0700 (Thu, 26 Sep 2013) | 1 line HBASE-9662 PerformanceEvaluation input do not handle tags properties r1525269 | ramkrishna | 2013-09-21 11:01:32 -0700 (Sat, 21 Sep 2013) | 3 lines HBASE-8496 - Implement tags and the internals of how a tag should look like (Ram) r1524985 | nkeywal | 2013-09-20 06:02:54 -0700 (Fri, 20 Sep 2013) | 1 line HBASE-9558 PerformanceEvaluation is in hbase-server, and creates a dependency to MiniDFSCluster r1523782 | nkeywal | 2013-09-16 13:07:13 -0700 (Mon, 16 Sep 2013) | 1 line HBASE-9521 clean clearBufferOnFail behavior and deprecate it r1518341 | jdcryans | 2013-08-28 12:46:55 -0700 (Wed, 28 Aug 2013) | 2 lines HBASE-9330 Refactor PE to create HTable the correct way {code} Long term, we may consider consolidating the two PerformanceEvaluation classes so that such maintenance work can be reduced. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9918) MasterAddressTracker ZKNamespaceManager ZK listeners are missed after master recovery
[ https://issues.apache.org/jira/browse/HBASE-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-9918: - Attachment: HBase-9918.patch This patch also refactors the TestZooKeeper test suite so that each individual test case inside won't affect the others. MasterAddressTracker ZKNamespaceManager ZK listeners are missed after master recovery --- Key: HBASE-9918 URL: https://issues.apache.org/jira/browse/HBASE-9918 Project: HBase Issue Type: Bug Reporter: Jeffrey Zhong Attachments: HBase-9918.patch TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry always failed at the following verification for me in my dev env (you have to run the single test, not the whole TestZooKeeper suite, to reproduce): {code} assertEquals("Number of rows should be equal to number of puts.", numberOfPuts, numberOfRows); {code} We missed two ZK listeners after master recovery: MasterAddressTracker and ZKNamespaceManager. My current patch fixes the JIRA issue, while I'm wondering if we should remove the master failover implementation for ZK session expiry entirely, because it reinitializes HMaster only partially, which is error prone and not a clean state to start from. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-3787) Increment is non-idempotent but client retries RPC
[ https://issues.apache.org/jira/browse/HBASE-3787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816873#comment-13816873 ] Sergey Shelukhin commented on HBASE-3787: - [~ndimiduk] asked me... some review guidelines (of course you can review in any order): 1) Client nonce generator, as well as additions to testMultiParallel and test log replay, to get the idea of the feature. 2) The nonce manager and its test, to see server nonce handling and how it works. 3) Plumbing (most of the patch); unfortunately there isn't any good order to review plumbing... perhaps: a) protobuf and client changes. b) server and log replay changes. Increment is non-idempotent but client retries RPC -- Key: HBASE-3787 URL: https://issues.apache.org/jira/browse/HBASE-3787 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.94.4, 0.95.2 Reporter: dhruba borthakur Assignee: Sergey Shelukhin Priority: Blocker Attachments: HBASE-3787-partial.patch, HBASE-3787-v0.patch, HBASE-3787-v1.patch, HBASE-3787-v2.patch, HBASE-3787-v3.patch, HBASE-3787-v4.patch, HBASE-3787-v5.patch, HBASE-3787-v5.patch, HBASE-3787-v6.patch, HBASE-3787-v7.patch, HBASE-3787-v8.patch The HTable.increment() operation is non-idempotent. The client retries the increment RPC a few times (as specified by configuration) before throwing an error to the application. This makes it possible that the same increment call be applied twice at the server. For increment operations, is it better to use HConnectionManager.getRegionServerWithoutRetries()? Another option would be to enhance the IPC module to make the RPC server correctly identify if the RPC is a retry attempt and handle accordingly. -- This message was sent by Atlassian JIRA (v6.1#6144)
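For readers new to the feature, the core idea behind the nonce approach is small enough to sketch. The names below are hypothetical and purely conceptual, not the patch's actual API:
{code}
// Conceptual sketch: the client attaches a (nonceGroup, nonce) pair to each
// non-idempotent call. A retry re-sends the same pair, so the server can
// recognize the duplicate and return the outcome of the first attempt
// instead of applying the increment twice.
long nonceGroup = clientId;              // stable per client process (assumed)
long nonce = nonceGenerator.newNonce();  // unique per operation (assumed)
rpcBuilder.setNonceGroup(nonceGroup).setNonce(nonce); // illustrative wiring
{code}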
[jira] [Commented] (HBASE-9909) TestHFilePerformance should not be a unit test, but a tool
[ https://issues.apache.org/jira/browse/HBASE-9909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816878#comment-13816878 ] Hudson commented on HBASE-9909: --- SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #831 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/831/]) HBASE-9909 TestHFilePerformance should not be a unit test, but a tool (enis: rev 1539766) * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java TestHFilePerformance should not be a unit test, but a tool -- Key: HBASE-9909 URL: https://issues.apache.org/jira/browse/HBASE-9909 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.98.0, 0.96.1 Attachments: hbase-9909_v1.patch TestHFilePerformance is a very old test which does not test anything; it is really a perf evaluation tool. It is not clear to me whether there is any utility in keeping it, but it should at least be converted to be a tool. Note that TestHFile already covers the unit test cases (writing hfile with none and gz compression). We do not need to test SequenceFile. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9808) org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation
[ https://issues.apache.org/jira/browse/HBASE-9808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gustavo Anatoly updated HBASE-9808: --- Attachment: HBASE-9808-v3.patch org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation Key: HBASE-9808 URL: https://issues.apache.org/jira/browse/HBASE-9808 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Gustavo Anatoly Attachments: HBASE-9808-v1.patch, HBASE-9808-v2.patch, HBASE-9808-v3.patch, HBASE-9808.patch Here is a list of JIRAs whose fixes might have gone into rest.PerformanceEvaluation : {code} r1527817 | mbertozzi | 2013-09-30 15:57:44 -0700 (Mon, 30 Sep 2013) | 1 line HBASE-9663 PerformanceEvaluation does not properly honor specified table name parameter r1526452 | mbertozzi | 2013-09-26 04:58:50 -0700 (Thu, 26 Sep 2013) | 1 line HBASE-9662 PerformanceEvaluation input do not handle tags properties r1525269 | ramkrishna | 2013-09-21 11:01:32 -0700 (Sat, 21 Sep 2013) | 3 lines HBASE-8496 - Implement tags and the internals of how a tag should look like (Ram) r1524985 | nkeywal | 2013-09-20 06:02:54 -0700 (Fri, 20 Sep 2013) | 1 line HBASE-9558 PerformanceEvaluation is in hbase-server, and creates a dependency to MiniDFSCluster r1523782 | nkeywal | 2013-09-16 13:07:13 -0700 (Mon, 16 Sep 2013) | 1 line HBASE-9521 clean clearBufferOnFail behavior and deprecate it r1518341 | jdcryans | 2013-08-28 12:46:55 -0700 (Wed, 28 Aug 2013) | 2 lines HBASE-9330 Refactor PE to create HTable the correct way {code} Long term, we may consider consolidating the two PerformanceEvaluation classes so that such maintenance work can be reduced. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9295) Allow test-patch.sh to detect TreeMap keyed by byte[] which doesn't use proper comparator
[ https://issues.apache.org/jira/browse/HBASE-9295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816879#comment-13816879 ] Hudson commented on HBASE-9295: --- SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #831 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/831/]) HBASE-9295 Allow test-patch.sh to detect TreeMap keyed by byte[] which doesn't use proper comparator (tedyu: rev 1539812) * /hbase/trunk/dev-support/test-patch.sh Allow test-patch.sh to detect TreeMap keyed by byte[] which doesn't use proper comparator - Key: HBASE-9295 URL: https://issues.apache.org/jira/browse/HBASE-9295 Project: HBase Issue Type: Task Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.0 Attachments: 9295-v1.txt, 9295-v2.txt There were two recent bug fixes (HBASE-9285 and HBASE-9238) for the case where the TreeMap keyed by byte[] doesn't use proper comparator: {code} new TreeMap<byte[], ...>() {code} test-patch.sh should be able to detect this situation and report accordingly. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9808) org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation
[ https://issues.apache.org/jira/browse/HBASE-9808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816881#comment-13816881 ] Gustavo Anatoly commented on HBASE-9808: Interesting, Nick :). The patch (v3) has been attached again, with the whitespace removed. [https://reviews.apache.org/r/15327/] Thanks for the tip. org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation Key: HBASE-9808 URL: https://issues.apache.org/jira/browse/HBASE-9808 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Gustavo Anatoly Attachments: HBASE-9808-v1.patch, HBASE-9808-v2.patch, HBASE-9808-v3.patch, HBASE-9808.patch Here is a list of JIRAs whose fixes might have gone into rest.PerformanceEvaluation : {code} r1527817 | mbertozzi | 2013-09-30 15:57:44 -0700 (Mon, 30 Sep 2013) | 1 line HBASE-9663 PerformanceEvaluation does not properly honor specified table name parameter r1526452 | mbertozzi | 2013-09-26 04:58:50 -0700 (Thu, 26 Sep 2013) | 1 line HBASE-9662 PerformanceEvaluation input do not handle tags properties r1525269 | ramkrishna | 2013-09-21 11:01:32 -0700 (Sat, 21 Sep 2013) | 3 lines HBASE-8496 - Implement tags and the internals of how a tag should look like (Ram) r1524985 | nkeywal | 2013-09-20 06:02:54 -0700 (Fri, 20 Sep 2013) | 1 line HBASE-9558 PerformanceEvaluation is in hbase-server, and creates a dependency to MiniDFSCluster r1523782 | nkeywal | 2013-09-16 13:07:13 -0700 (Mon, 16 Sep 2013) | 1 line HBASE-9521 clean clearBufferOnFail behavior and deprecate it r1518341 | jdcryans | 2013-08-28 12:46:55 -0700 (Wed, 28 Aug 2013) | 2 lines HBASE-9330 Refactor PE to create HTable the correct way {code} Long term, we may consider consolidating the two PerformanceEvaluation classes so that such maintenance work can be reduced. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9919) Fix resource leak in MultiTableInputFormatBase
[ https://issues.apache.org/jira/browse/HBASE-9919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-9919: Status: Patch Available (was: Open) Fix resource leak in MultiTableInputFormatBase -- Key: HBASE-9919 URL: https://issues.apache.org/jira/browse/HBASE-9919 Project: HBase Issue Type: Bug Reporter: Ted Yu Attachments: HBASE-9919.00.patch In MultiTableInputFormatBase#createRecordReader(), table is not closed. {code} HTable table = new HTable(context.getConfiguration(), tSplit.getTableName()); {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
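One way to plug this kind of leak, sketched under the assumption that the reader can own the table: close the HTable when the reader that wraps it is closed. TableClosingRecordReader is a made-up name and the committed patch may be structured differently; TableRecordReader is the existing mapreduce-package reader.
{code}
import java.io.IOException;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableRecordReader;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

/** Sketch: a reader that owns its HTable and closes it in close(). */
class TableClosingRecordReader
    extends RecordReader<ImmutableBytesWritable, Result> {
  private final TableRecordReader delegate;
  private final HTable table;

  TableClosingRecordReader(TableRecordReader delegate, HTable table) {
    this.delegate = delegate;
    this.table = table;
  }

  @Override
  public void initialize(InputSplit split, TaskAttemptContext context)
      throws IOException, InterruptedException {
    delegate.initialize(split, context);
  }

  @Override
  public boolean nextKeyValue() throws IOException, InterruptedException {
    return delegate.nextKeyValue();
  }

  @Override
  public ImmutableBytesWritable getCurrentKey()
      throws IOException, InterruptedException {
    return delegate.getCurrentKey();
  }

  @Override
  public Result getCurrentValue() throws IOException, InterruptedException {
    return delegate.getCurrentValue();
  }

  @Override
  public float getProgress() throws IOException, InterruptedException {
    return delegate.getProgress();
  }

  @Override
  public void close() throws IOException {
    delegate.close();
    table.close(); // previously leaked
  }
}
{code}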
[jira] [Updated] (HBASE-9919) Fix resource leak in MultiTableInputFormatBase
[ https://issues.apache.org/jira/browse/HBASE-9919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-9919: Attachment: HBASE-9919.00.patch Fix resource leak in MultiTableInputFormatBase -- Key: HBASE-9919 URL: https://issues.apache.org/jira/browse/HBASE-9919 Project: HBase Issue Type: Bug Reporter: Ted Yu Attachments: HBASE-9919.00.patch In MultiTableInputFormatBase#createRecordReader(), table is not closed. {code} HTable table = new HTable(context.getConfiguration(), tSplit.getTableName()); {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816892#comment-13816892 ] Lars Hofhansl commented on HBASE-9915: -- Will commit if HadoopQA does not find any problems. Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk-v2.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
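A minimal sketch of the shape of the fix: override isSeeked() in the encoded scanner so it consults state the encoded path actually maintains, instead of the blockBuffer it never sets. The field name seeker below is hypothetical; the committed patch may check different state.
{code}
// Sketch only; 'seeker' stands in for whatever per-block decoding
// state EncodedScannerV2 really tracks (blockBuffer stays null for it).
@Override
public boolean isSeeked() {
  return this.seeker != null;
}
{code}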
[jira] [Commented] (HBASE-9295) Allow test-patch.sh to detect TreeMap keyed by byte[] which doesn't use proper comparator
[ https://issues.apache.org/jira/browse/HBASE-9295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816904#comment-13816904 ] Hudson commented on HBASE-9295: --- SUCCESS: Integrated in HBase-TRUNK #4672 (See [https://builds.apache.org/job/HBase-TRUNK/4672/]) HBASE-9295 Allow test-patch.sh to detect TreeMap keyed by byte[] which doesn't use proper comparator (tedyu: rev 1539812) * /hbase/trunk/dev-support/test-patch.sh Allow test-patch.sh to detect TreeMap keyed by byte[] which doesn't use proper comparator - Key: HBASE-9295 URL: https://issues.apache.org/jira/browse/HBASE-9295 Project: HBase Issue Type: Task Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.0 Attachments: 9295-v1.txt, 9295-v2.txt There were two recent bug fixes (HBASE-9285 and HBASE-9238) for the case where the TreeMap keyed by byte[] doesn't use proper comparator: {code} new TreeMap<byte[], ...>() {code} test-patch.sh should be able to detect this situation and report accordingly. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9885) Avoid some Result creation in protobuf conversions
[ https://issues.apache.org/jira/browse/HBASE-9885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816905#comment-13816905 ] Hudson commented on HBASE-9885: --- SUCCESS: Integrated in HBase-TRUNK #4672 (See [https://builds.apache.org/job/HBase-TRUNK/4672/]) HBASE-9885 Avoid some Result creation in protobuf conversions (nkeywal: rev 1539692) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java Avoid some Result creation in protobuf conversions -- Key: HBASE-9885 URL: https://issues.apache.org/jira/browse/HBASE-9885 Project: HBase Issue Type: Bug Components: Client, Protobufs, regionserver Affects Versions: 0.98.0, 0.96.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.98.0, 0.96.1 Attachments: 9885.v1.patch, 9885.v2, 9885.v2.patch, 9885.v3.patch, 9885.v3.patch, 9885.v4.patch We create a lot of Result objects that we could avoid, as they contain nothing other than a boolean value. We sometimes create a protobuf builder on this path as well; this can be avoided. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9792) Region states should update last assignments when a region is opened.
[ https://issues.apache.org/jira/browse/HBASE-9792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816901#comment-13816901 ] Hudson commented on HBASE-9792: --- SUCCESS: Integrated in HBase-TRUNK #4672 (See [https://builds.apache.org/job/HBase-TRUNK/4672/]) HBASE-9792 Region states should update last assignments when a region is opened (jxiang: rev 1539728) * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerOnCluster.java Region states should update last assignments when a region is opened. - Key: HBASE-9792 URL: https://issues.apache.org/jira/browse/HBASE-9792 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.98.0, 0.96.1 Attachments: trunk-9792.patch, trunk-9792_v2.patch, trunk-9792_v3.1.patch, trunk-9792_v3.patch Currently, we update a region's last assignment region server when the region is online. We should do this sooner, when the region is moved to OPEN state. CM could kill this region server before we delete the znode and online the region. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-8741) Scope sequenceid to the region rather than regionserver (WAS: Mutations on Regions in recovery mode might have same sequenceIDs)
[ https://issues.apache.org/jira/browse/HBASE-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816903#comment-13816903 ] Hudson commented on HBASE-8741: --- SUCCESS: Integrated in HBase-TRUNK #4672 (See [https://builds.apache.org/job/HBase-TRUNK/4672/]) HBASE-8741 Scope sequenceid to the region rather than regionserver (WAS: Mutations on Regions in recovery mode might have same sequenceIDs) (stack: rev 1539743) * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtil.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestWALObserver.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogPerformanceEvaluation.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRollingNoCluster.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALActionsListener.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationHLogReaderManager.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java Scope sequenceid to the region rather than regionserver (WAS: Mutations on Regions in recovery mode might have same sequenceIDs) Key: HBASE-8741 URL: https://issues.apache.org/jira/browse/HBASE-8741 Project: HBase Issue Type: Bug Components: MTTR Affects Versions: 0.95.1 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Fix For: 0.98.0 Attachments: HBASE-8741-trunk-v6.1-rebased.patch, HBASE-8741-trunk-v6.2.1.patch, HBASE-8741-trunk-v6.2.2.patch, HBASE-8741-trunk-v6.2.2.patch, HBASE-8741-trunk-v6.3.patch, HBASE-8741-trunk-v6.4.patch, HBASE-8741-trunk-v6.patch, HBASE-8741-v0.patch, HBASE-8741-v2.patch, HBASE-8741-v3.patch, HBASE-8741-v4-again.patch, HBASE-8741-v4-again.patch, HBASE-8741-v4.patch, HBASE-8741-v5-again.patch, HBASE-8741-v5.patch Currently, when opening a region, we find the maximum sequence ID from all its HFiles and then set the LogSequenceId of the log (in case the latter is at a smaller value). This works well in the recovered.edits case, as we are not writing to the region until we have replayed all of its previous edits. With distributed log replay, if we want to enable writes while a region is under recovery, we need to make sure that the logSequenceId is greater than the maximum logSequenceId of the old regionserver. Otherwise, we might have a situation where new edits have the same (or smaller) sequenceIds. 
We can store region level information in the WALTrailer; then this scenario could be avoided by: a) reading the trailer of the last completed file, i.e., the last WAL file which has a trailer, and b) completely reading the last WAL file (this file would not have a trailer, so it needs to be read completely). In the future, if we switch to multiple WAL files, we could read the trailers of all completed WAL files and completely read the remaining incomplete files. -- This message was sent by Atlassian JIRA (v6.1#6144)
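As a toy illustration of the invariant under discussion (not HBase code; all names are made up): a region-scoped sequence id must be advanced past the highest id recovered from the old regionserver before new writes are admitted, so new edits can never reuse an old id.
{code}
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of a region-scoped, monotonically increasing sequence id. */
class RegionSequenceId {
  private final AtomicLong seqId = new AtomicLong(0);

  /** Called during recovery: never move backwards. */
  void advanceTo(long maxSeqIdFromOldServer) {
    long cur;
    while ((cur = seqId.get()) < maxSeqIdFromOldServer
        && !seqId.compareAndSet(cur, maxSeqIdFromOldServer)) {
      // retry on contention
    }
  }

  /** Called for each new edit once writes are enabled. */
  long next() {
    return seqId.incrementAndGet();
  }
}
{code}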
[jira] [Commented] (HBASE-9003) TableMapReduceUtil should not rely on org.apache.hadoop.util.JarFinder#getJar
[ https://issues.apache.org/jira/browse/HBASE-9003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816906#comment-13816906 ] Hudson commented on HBASE-9003: --- SUCCESS: Integrated in HBase-TRUNK #4672 (See [https://builds.apache.org/job/HBase-TRUNK/4672/]) HBASE-9003 Remove the jamon generated classes from the findbugs analysis (nkeywal: rev 1539599) * /hbase/trunk/dev-support/findbugs-exclude.xml * /hbase/trunk/dev-support/test-patch.properties TableMapReduceUtil should not rely on org.apache.hadoop.util.JarFinder#getJar - Key: HBASE-9003 URL: https://issues.apache.org/jira/browse/HBASE-9003 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.92.2, 0.95.1, 0.94.9 Reporter: Esteban Gutierrez This is the problem: {{TableMapReduceUtil#addDependencyJars}} relies on {{org.apache.hadoop.util.JarFinder}} if available to call {{getJar()}}. However {{getJar()}} uses File.createTempFile() to create a temporary file under {{hadoop.tmp.dir}}{{/target/test-dir}}. Due to HADOOP-9737, the created jar and its content are not purged after the JVM is destroyed. Since most configurations point {{hadoop.tmp.dir}} under {{/tmp}}, the generated jar files get purged by {{tmpwatch}} or a similar tool, but boxes that have {{hadoop.tmp.dir}} pointing to a different location not monitored by {{tmpwatch}} will pile up a collection of jars, causing all kinds of issues. Since {{JarFinder#getJar}} is not a public API from Hadoop (see [~tucu00]'s comment on HADOOP-9737), we shouldn't use that as part of {{TableMapReduceUtil}} in order to avoid this kind of issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
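For context, a sketch of the typical job setup that ends up on this code path. The job name is made up; TableMapReduceUtil.addDependencyJars(Job) is the API under discussion, and the comment describes the leak reported above.
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

public class JobSetupExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "example");
    // Ships the job's dependency jars to the cluster. For a class whose
    // jar cannot be found on the classpath, this may fall through to
    // JarFinder#getJar, which writes a temporary jar under hadoop.tmp.dir
    // that is never cleaned up: the pile-up described in this issue.
    TableMapReduceUtil.addDependencyJars(job);
  }
}
{code}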
[jira] [Commented] (HBASE-9909) TestHFilePerformance should not be a unit test, but a tool
[ https://issues.apache.org/jira/browse/HBASE-9909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816902#comment-13816902 ] Hudson commented on HBASE-9909: --- SUCCESS: Integrated in HBase-TRUNK #4672 (See [https://builds.apache.org/job/HBase-TRUNK/4672/]) HBASE-9909 TestHFilePerformance should not be a unit test, but a tool (enis: rev 1539766) * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java TestHFilePerformance should not be a unit test, but a tool -- Key: HBASE-9909 URL: https://issues.apache.org/jira/browse/HBASE-9909 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.98.0, 0.96.1 Attachments: hbase-9909_v1.patch TestHFilePerformance is a very old test which does not actually test anything; it is a perf evaluation tool. It is not clear to me whether there is any utility in keeping it, but it should at least be converted into a tool. Note that TestHFile already covers the unit test cases (writing hfile with none and gz compression). We do not need to test SequenceFile. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9909) TestHFilePerformance should not be a unit test, but a tool
[ https://issues.apache.org/jira/browse/HBASE-9909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816908#comment-13816908 ] Hudson commented on HBASE-9909: --- FAILURE: Integrated in hbase-0.96 #182 (See [https://builds.apache.org/job/hbase-0.96/182/]) HBASE-9909 TestHFilePerformance should not be a unit test, but a tool (enis: rev 1539767) * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java TestHFilePerformance should not be a unit test, but a tool -- Key: HBASE-9909 URL: https://issues.apache.org/jira/browse/HBASE-9909 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.98.0, 0.96.1 Attachments: hbase-9909_v1.patch TestHFilePerformance is a very old test which does not actually test anything; it is a perf evaluation tool. It is not clear to me whether there is any utility in keeping it, but it should at least be converted into a tool. Note that TestHFile already covers the unit test cases (writing hfile with none and gz compression). We do not need to test SequenceFile. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9792) Region states should update last assignments when a region is opened.
[ https://issues.apache.org/jira/browse/HBASE-9792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816907#comment-13816907 ] Hudson commented on HBASE-9792: --- FAILURE: Integrated in hbase-0.96 #182 (See [https://builds.apache.org/job/hbase-0.96/182/]) HBASE-9792 Region states should update last assignments when a region is opened (jxiang: rev 1539731) * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerOnCluster.java Region states should update last assignments when a region is opened. - Key: HBASE-9792 URL: https://issues.apache.org/jira/browse/HBASE-9792 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.98.0, 0.96.1 Attachments: trunk-9792.patch, trunk-9792_v2.patch, trunk-9792_v3.1.patch, trunk-9792_v3.patch Currently, we update a region's last assignment region server when the region is online. We should do this sooner, when the region is moved to OPEN state. CM could kill this region server before we delete the znode and online the region. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9885) Avoid some Result creation in protobuf conversions
[ https://issues.apache.org/jira/browse/HBASE-9885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816909#comment-13816909 ] Hudson commented on HBASE-9885: --- FAILURE: Integrated in hbase-0.96 #182 (See [https://builds.apache.org/job/hbase-0.96/182/]) HBASE-9885 Avoid some Result creation in protobuf conversions (nkeywal: rev 1539691) * /hbase/branches/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/branches/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java Avoid some Result creation in protobuf conversions -- Key: HBASE-9885 URL: https://issues.apache.org/jira/browse/HBASE-9885 Project: HBase Issue Type: Bug Components: Client, Protobufs, regionserver Affects Versions: 0.98.0, 0.96.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.98.0, 0.96.1 Attachments: 9885.v1.patch, 9885.v2, 9885.v2.patch, 9885.v3.patch, 9885.v3.patch, 9885.v4.patch We create a lot of Result objects that we could avoid, as they contain nothing other than a boolean value. We sometimes create a protobuf builder on this path as well; this can be avoided. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816911#comment-13816911 ] Lars Hofhansl commented on HBASE-9915: -- Some numbers with Phoenix: 5m rows, 5 long columns, 8 byte rowkeys, FAST_DIFF encoding, table fully flushed and major compacted, everything in the blockcache. (Some weirdly named columns; this was a preexisting table that I mapped into Phoenix - with CREATE TABLE.)
||Query||Without Patch||With Patch||
|select count\(*) from my5|12.8s|9.7s|
|select count\(*) from my5 where 3 = 1|23.5s|11.8s|
|select count\(*) from my5 where 3 > 1|34.8s|15.6s|
|select avg(3) from my5|35.6s|17.4s|
|select avg(0), avg(3) from my5|36.5s|20.2s|
|select avg(0), avg(3) from my5 where 4 = 1|31.8s|15.4s|
|select avg(0), avg(3) from my5 where 4 > 1|46.4s|25.1s|
Note that Phoenix adds a fake column to each row (so each row has a known KV for things like COUNT) and (almost) always uses the ExplicitColumnTracker. Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk-v2.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-9915: - Attachment: 9915-trunk-v2.txt Sigh... Reattaching to get another run. Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk-v2.txt, 9915-trunk-v2.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816913#comment-13816913 ] Lars Hofhansl commented on HBASE-9915: -- HadoopQA just failed due to an environment issue. Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk-v2.txt, 9915-trunk-v2.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-9921) stripe compaction - findbugs and javadoc issues, some improvements
Sergey Shelukhin created HBASE-9921: --- Summary: stripe compaction - findbugs and javadoc issues, some improvements Key: HBASE-9921 URL: https://issues.apache.org/jira/browse/HBASE-9921 Project: HBase Issue Type: Task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9918) MasterAddressTracker & ZKNamespaceManager ZK listeners are missed after master recovery
[ https://issues.apache.org/jira/browse/HBASE-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816916#comment-13816916 ] Jeffrey Zhong commented on HBASE-9918: -- [~toffer] Is it all right to always call initNamespace() during master initialization, even in the master failover case? Thanks. MasterAddressTracker & ZKNamespaceManager ZK listeners are missed after master recovery --- Key: HBASE-9918 URL: https://issues.apache.org/jira/browse/HBASE-9918 Project: HBase Issue Type: Bug Reporter: Jeffrey Zhong Attachments: HBase-9918.patch TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry always failed at the following verification for me in my dev env (you have to run the single test, not the whole TestZooKeeper suite, to reproduce): {code} assertEquals("Number of rows should be equal to number of puts.", numberOfPuts, numberOfRows); {code} We missed two ZK listeners after master recovery: MasterAddressTracker & ZKNamespaceManager. My current patch fixes the JIRA issue, but I'm wondering if we should remove the master failover implementation for the ZK-session-expired case entirely, because it partially reinitializes HMaster, which is error prone and not a clean state to start from. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9895) 0.96 Import utility can't import an exported file from 0.94
[ https://issues.apache.org/jira/browse/HBASE-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-9895: - Attachment: hbase-9895.patch Re-attaching because the QA run errors seem unrelated to this patch. 0.96 Import utility can't import an exported file from 0.94 --- Key: HBASE-9895 URL: https://issues.apache.org/jira/browse/HBASE-9895 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.96.0 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Attachments: hbase-9895.patch Basically we PBed org.apache.hadoop.hbase.client.Result, so a 0.96 cluster cannot import 0.94 exported files. This issue is annoying because a user can't import his old archive files after an upgrade, or archives from others who are using 0.94. The ideal way is to catch the deserialization error and then fall back to the 0.94 format for importing. -- This message was sent by Atlassian JIRA (v6.1#6144)
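A sketch of the fallback idea described above. The two decoder methods are hypothetical stand-ins; the real Import code paths and codec classes are not shown here.
{code}
import java.io.IOException;
import org.apache.hadoop.hbase.client.Result;

// Hypothetical sketch of "try PB first, fall back to Writable".
// parsePb96() and parseWritable94() stand in for the real 0.96
// protobuf and 0.94 Writable decoders.
abstract class ExportedResultParser {
  abstract Result parsePb96(byte[] bytes) throws IOException;
  abstract Result parseWritable94(byte[] bytes) throws IOException;

  Result parseExportedValue(byte[] bytes) throws IOException {
    try {
      return parsePb96(bytes);        // 0.96+: protobuf-encoded Result
    } catch (IOException pbFailed) {
      return parseWritable94(bytes);  // pre-0.96: Writable-encoded Result
    }
  }
}
{code}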
[jira] [Updated] (HBASE-9895) 0.96 Import utility can't import an exported file from 0.94
[ https://issues.apache.org/jira/browse/HBASE-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-9895: - Attachment: (was: hbase-9895.patch) 0.96 Import utility can't import an exported file from 0.94 --- Key: HBASE-9895 URL: https://issues.apache.org/jira/browse/HBASE-9895 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.96.0 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Attachments: hbase-9895.patch Basically we PBed org.apache.hadoop.hbase.client.Result, so a 0.96 cluster cannot import 0.94 exported files. This issue is annoying because a user can't import his old archive files after an upgrade, or archives from others who are using 0.94. The ideal way is to catch the deserialization error and then fall back to the 0.94 format for importing. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816919#comment-13816919 ] Hadoop QA commented on HBASE-9915: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612716/9915-trunk-v2.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7784//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7784//console This message is automatically generated. Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk-v2.txt, 9915-trunk-v2.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. 
The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9921) stripe compaction - findbugs and javadoc issues, some improvements
[ https://issues.apache.org/jira/browse/HBASE-9921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-9921: Attachment: HBASE-9921.patch What the JIRA title says, plus some tiny changes: in one place an incorrect method was used (noticed while running something); the file counts for compactions can be derived from the default ones in most cases (realized while writing the documentation); and some dead code remained from a more complicated version of the patch. stripe compaction - findbugs and javadoc issues, some improvements -- Key: HBASE-9921 URL: https://issues.apache.org/jira/browse/HBASE-9921 Project: HBase Issue Type: Task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HBASE-9921.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9892) Add info port to ServerName to support multi instances in a node
[ https://issues.apache.org/jira/browse/HBASE-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816927#comment-13816927 ] Liu Shaohui commented on HBASE-9892: Thanks. [~stack] [~enish] Please help to review the patch for 0.94 in RB again. If there are no problems, I will start to port it to trunk. Add info port to ServerName to support multi instances in a node Key: HBASE-9892 URL: https://issues.apache.org/jira/browse/HBASE-9892 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Attachments: HBASE-9892-0.94-v1.diff, HBASE-9892-0.94-v2.diff, HBASE-9892-0.94-v3.diff The full GC time of a regionserver with a big heap (> 30G) usually cannot be kept under 30s. At the same time, servers with 64G memory are normal. So we try to deploy multiple RS instances (2-3) on a single node, with the heap of each RS at about 20G ~ 24G. Most things work fine, except the hbase web ui: the master gets the RS info port from the conf, which is not suitable for this situation of multiple RS instances on a node. So we add the info port to ServerName:
a. At startup, the RS reports its info port to HMaster.
b. For the root region, the RS writes the servername with info port to the zookeeper root-region-server node.
c. For meta regions, the RS writes the servername with info port to the root region.
d. For user regions, the RS writes the servername with info port to the meta regions.
So HMaster and clients can get the info port from the servername. To test this feature, I changed the RS count from 1 to 3 in standalone mode, so we can test it in standalone mode. I think Hoya (HBase on YARN) will encounter the same problem. Does anyone know how Hoya handles this? PS: There are different formats for the servername in the ZK node and the meta table; I think we need to unify them and refactor the code. -- This message was sent by Atlassian JIRA (v6.1#6144)
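A toy illustration of why carrying the info port in the servername helps (the field layout and the /rs-status path below are illustrative; the patch's actual encoding is not shown here): each RS instance on a host can then advertise its own UI port instead of everyone resolving the single value from conf.
{code}
/** Toy model; the real ServerName class lives in org.apache.hadoop.hbase. */
class ServerNameWithInfoPort {
  final String host;
  final int rpcPort;
  final long startCode;
  final int infoPort; // the proposed addition

  ServerNameWithInfoPort(String host, int rpcPort, long startCode,
      int infoPort) {
    this.host = host;
    this.rpcPort = rpcPort;
    this.startCode = startCode;
    this.infoPort = infoPort;
  }

  /** With the info port carried along, UI links no longer come from conf. */
  String webUiLink() {
    return "http://" + host + ":" + infoPort + "/rs-status";
  }
}
{code}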
[jira] [Updated] (HBASE-9921) stripe compaction - findbugs and javadoc issues, some improvements
[ https://issues.apache.org/jira/browse/HBASE-9921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-9921: Status: Patch Available (was: Open) stripe compaction - findbugs and javadoc issues, some improvements -- Key: HBASE-9921 URL: https://issues.apache.org/jira/browse/HBASE-9921 Project: HBase Issue Type: Task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HBASE-9921.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816929#comment-13816929 ] Andrew Purtell commented on HBASE-9915: --- The 0.94 patch passed a private Jenkins run. Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk-v2.txt, 9915-trunk-v2.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-8465) Auto-drop rollback snapshot for snapshot restore
[ https://issues.apache.org/jira/browse/HBASE-8465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816930#comment-13816930 ] Enis Soztutar commented on HBASE-8465: -- I was reading the restore code for something else, and was very surprised by the automatic taking of snapshots. We also do not delete that snapshot even after the restore is successful. +1 on introducing an API where taking a snapshot before restore is configurable (through arg passing, not conf). Also, in case the snapshot restore succeeds, we should delete the previous snapshot. The arg name for this (dropRollbackSnapshot) might be confusing, since the user might think that it will drop the original snapshot, not the one for the rollback. Auto-drop rollback snapshot for snapshot restore Key: HBASE-8465 URL: https://issues.apache.org/jira/browse/HBASE-8465 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.0, 0.96.1 Attachments: 8465-trunk-v1.txt, 8465-trunk-v2.txt Below is an excerpt from the snapshot restore javadoc: {code} * Restore the specified snapshot on the original table. (The table must be disabled) * Before restoring the table, a new snapshot with the current table state is created. * In case of failure, the table will be rolled back to the its original state. {code} We can improve the handling of rollbackSnapshot in two ways: 1. give a better name to the rollbackSnapshot (adding {code}'-for-rollback-'{code}). Currently the name is of the form: String rollbackSnapshot = snapshotName + "-" + EnvironmentEdgeManager.currentTimeMillis(); 2. drop rollbackSnapshot at the end of restoreSnapshot() if the restore is successful. We can introduce a new config param, named 'hbase.snapshot.restore.drop.rollback', to keep compatibility with the current behavior. -- This message was sent by Atlassian JIRA (v6.1#6144)
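A sketch combining the two suggested improvements, guarded by the config key proposed in the issue. The wrapper class and method are illustrative, not the patch itself; HBaseAdmin's snapshot/restoreSnapshot/deleteSnapshot and EnvironmentEdgeManager.currentTimeMillis() are the existing APIs.
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.EnvironmentEdgeManager;

class RestoreWithRollback {
  // Illustrative only: clearer rollback-snapshot naming plus conditional
  // cleanup once the restore has succeeded.
  static String restore(HBaseAdmin admin, Configuration conf,
      String snapshotName, String tableName) throws IOException {
    String rollbackSnapshot = snapshotName + "-for-rollback-"
        + EnvironmentEdgeManager.currentTimeMillis();
    admin.snapshot(rollbackSnapshot, tableName); // safety copy of current state
    admin.restoreSnapshot(snapshotName);
    if (conf.getBoolean("hbase.snapshot.restore.drop.rollback", false)) {
      admin.deleteSnapshot(rollbackSnapshot);    // restore succeeded; drop it
    }
    return rollbackSnapshot;
  }
}
{code}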
[jira] [Updated] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-9915: - Status: Open (was: Patch Available) Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk-v2.txt, 9915-trunk-v2.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816934#comment-13816934 ] Lars Hofhansl commented on HBASE-9915: -- Thanks Andy. Looks like HadoopQA was OK after all. Will check the Javadoc warning. Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk-v2.txt, 9915-trunk-v2.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9915) Severe performance bug: isSeeked() in EncodedScannerV2 is always false
[ https://issues.apache.org/jira/browse/HBASE-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816938#comment-13816938 ] Lars Hofhansl commented on HBASE-9915: -- The Javadoc warning is not from this patch. Going to commit a bit later. Nice improvement from a 10-line change :) Severe performance bug: isSeeked() in EncodedScannerV2 is always false -- Key: HBASE-9915 URL: https://issues.apache.org/jira/browse/HBASE-9915 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: 9915-0.94.txt, 9915-trunk-v2.txt, 9915-trunk-v2.txt, 9915-trunk.txt, profile.png While debugging why reseek is so slow I found that it is quite broken for encoded scanners. The problem is this: AbstractScannerV2.reseekTo(...) calls isSeeked() to check whether the scanner was seeked or not. If it was, it checks whether the KV we want to seek to is in the current block; if not, it always consults the index blocks again. isSeeked checks the blockBuffer member, which is not used by EncodedScannerV2 and thus always returns false, which in turn causes an index lookup for each reseek. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-9906: - Issue Type: Bug (was: New Feature) Restore snapshot fails to restore the meta edits sporadically --- Key: HBASE-9906 URL: https://issues.apache.org/jira/browse/HBASE-9906 Project: HBase Issue Type: Bug Components: snapshots Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: hbase-9906-0.94_v1.patch, hbase-9906_v1.patch After snapshot restore, we see failures to find the table in meta: {code} disable 'tablefour' restore_snapshot 'snapshot_tablefour' enable 'tablefour' ERROR: Table tablefour does not exist. {code} This is quite subtle. From the looks of it, we successfully restore the snapshot, do the meta updates, and return the status to the client. The client then tries to do an operation on the table (like enable table, or scan in the test outputs) which fails because the meta entry for the region seems to be gone (in the case of a single region, the table will be reported missing). Subsequent attempts to create the table will also fail because the table directories will be there, but not the meta entries. For restoring meta entries, we are doing a delete and then a put to the same region: {code} 2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: 76d0e2b7ec3291afcaa82e18a56ccc30 2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: fa41edf43fe3ee131db4a34b848ff432 ... 2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted [{ENCODED = fa41edf43fe3ee131db4a34b848ff432, NAME = 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY = '', ENDKEY = ''}, {ENCODED = 76d0e2b7ec3291afcaa82e18a56ccc30, NAME = 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE 2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added 1 {code} The root cause of this sporadic failure is that the delete and the subsequent put will have the same timestamp if they execute in the same ms. The delete will override the put at the same ts, even though the put came later. See: HBASE-9905, HBASE-8770 Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
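A minimal client-side illustration of the underlying semantics (the table handle, row, family, and qualifier names are made up): a Delete followed by a Put that lands in the same millisecond carries the same server-assigned timestamp, and the delete marker masks the put.
{code}
import java.io.IOException;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

class SameMillisecondMasking {
  static boolean rowVisible(HTableInterface table) throws IOException {
    byte[] row = Bytes.toBytes("r1");
    table.delete(new Delete(row));            // server assigns timestamp T
    Put p = new Put(row);
    p.add(Bytes.toBytes("f"), Bytes.toBytes("q"),
        Bytes.toBytes("v"));                  // also T if in the same ms
    table.put(p);
    // If both operations landed in the same millisecond, the delete marker
    // masks the put at timestamp T and the row reads back as empty.
    return !table.get(new Get(row)).isEmpty();
  }
}
{code}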
[jira] [Commented] (HBASE-9854) initial documentation for stripe compactions
[ https://issues.apache.org/jira/browse/HBASE-9854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816942#comment-13816942 ] Enis Soztutar commented on HBASE-9854: -- This looks excellent. +1. initial documentation for stripe compactions Key: HBASE-9854 URL: https://issues.apache.org/jira/browse/HBASE-9854 Project: HBase Issue Type: Sub-task Components: Compaction Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Initial documentation for stripe compactions (distill from attached docs, make up to date, put somewhere like book) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9920) Lower OK_FINDBUGS_WARNINGS in test-patch.properties
[ https://issues.apache.org/jira/browse/HBASE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-9920: -- Status: Patch Available (was: Open) Lower OK_FINDBUGS_WARNINGS in test-patch.properties --- Key: HBASE-9920 URL: https://issues.apache.org/jira/browse/HBASE-9920 Project: HBase Issue Type: Task Reporter: Ted Yu Assignee: Ted Yu Attachments: 9920.txt HBASE-9903 removed generated classes from findbugs checking. OK_FINDBUGS_WARNINGS in test-patch.properties should be lowered. According to https://builds.apache.org/job/PreCommit-HBASE-Build/7776/artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html , there were: 3 warnings for org.apache.hadoop.hbase.generated classes 19 warnings for org.apache.hadoop.hbase.tmpl classes -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (HBASE-9920) Lower OK_FINDBUGS_WARNINGS in test-patch.properties
[ https://issues.apache.org/jira/browse/HBASE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HBASE-9920: - Assignee: Ted Yu Lower OK_FINDBUGS_WARNINGS in test-patch.properties --- Key: HBASE-9920 URL: https://issues.apache.org/jira/browse/HBASE-9920 Project: HBase Issue Type: Task Reporter: Ted Yu Assignee: Ted Yu Attachments: 9920.txt HBASE-9903 removed generated classes from findbugs checking. OK_FINDBUGS_WARNINGS in test-patch.properties should be lowered. According to https://builds.apache.org/job/PreCommit-HBASE-Build/7776/artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html , there were: 3 warnings for org.apache.hadoop.hbase.generated classes 19 warnings for org.apache.hadoop.hbase.tmpl classes -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9920) Lower OK_FINDBUGS_WARNINGS in test-patch.properties
[ https://issues.apache.org/jira/browse/HBASE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-9920: -- Attachment: 9920.txt Lower OK_FINDBUGS_WARNINGS in test-patch.properties --- Key: HBASE-9920 URL: https://issues.apache.org/jira/browse/HBASE-9920 Project: HBase Issue Type: Task Reporter: Ted Yu Attachments: 9920.txt HBASE-9903 removed generated classes from findbugs checking. OK_FINDBUGS_WARNINGS in test-patch.properties should be lowered. According to https://builds.apache.org/job/PreCommit-HBASE-Build/7776/artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html , there were: 3 warnings for org.apache.hadoop.hbase.generated classes 19 warnings for org.apache.hadoop.hbase.tmpl classes -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (HBASE-9818) NPE in HFileBlock#AbstractFSReader#readAtOffset
[ https://issues.apache.org/jira/browse/HBASE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang reassigned HBASE-9818: -- Assignee: Jimmy Xiang (was: Ted Yu) NPE in HFileBlock#AbstractFSReader#readAtOffset --- Key: HBASE-9818 URL: https://issues.apache.org/jira/browse/HBASE-9818 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Attachments: 9818-trial.txt, 9818-v2.txt, 9818-v3.txt, 9818-v4.txt, 9818-v5.txt, trunk-9818.patch, trunk-9818_v1.1.patch HFileBlock#istream seems to be null. I was wondering whether we should hide FSDataInputStreamWrapper#useHBaseChecksum. By the way, this happened when online schema change is enabled (encoding) {noformat} 2013-10-22 10:58:43,321 ERROR [RpcServer.handler=28,port=36020] regionserver.HRegionServer: java.lang.NullPointerException at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1200) at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1436) at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1318) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:359) at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:503) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:553) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:245) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:166) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:361) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:336) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:293) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:258) at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:603) at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:476) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:129) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3546) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3616) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3494) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3485) at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3079) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110) at java.lang.Thread.run(Thread.java:724) 2013-10-22 10:58:43,665 ERROR [RpcServer.handler=23,port=36020] regionserver.HRegionServer: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 53438 But the 
nextCallSeq got from client: 53437; request=scanner_id: 1252577470624375060 number_of_rows: 100 close_scanner: false next_call_seq: 53437 at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3030) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110) at java.lang.Thread.run(Thread.java:724) {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-9906: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed this to trunk, 0.96 and 0.94. Thanks for the reviews. Restore snapshot fails to restore the meta edits sporadically --- Key: HBASE-9906 URL: https://issues.apache.org/jira/browse/HBASE-9906 Project: HBase Issue Type: Bug Components: snapshots Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: hbase-9906-0.94_v1.patch, hbase-9906_v1.patch After snapshot restore, we see failures to find the table in meta: {code} disable 'tablefour' restore_snapshot 'snapshot_tablefour' enable 'tablefour' ERROR: Table tablefour does not exist. {code} This is quite subtle. From the looks of it, we successfully restore the snapshot, do the meta updates, and return the status to the client. The client then tries to do an operation on the table (like enable table, or scan in the test outputs) which fails because the meta entry for the region seems to be gone (in the case of a single region, the table will be reported missing). Subsequent attempts to create the table will also fail because the table directories will be there, but not the meta entries. For restoring meta entries, we are doing a delete and then a put to the same region: {code} 2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: 76d0e2b7ec3291afcaa82e18a56ccc30 2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: fa41edf43fe3ee131db4a34b848ff432 ... 2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted [{ENCODED = fa41edf43fe3ee131db4a34b848ff432, NAME = 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY = '', ENDKEY = ''}, {ENCODED = 76d0e2b7ec3291afcaa82e18a56ccc30, NAME = 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE 2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added 1 {code} The root cause of this sporadic failure is that the delete and the subsequent put will have the same timestamp if they execute in the same ms. The delete will override the put at the same ts, even though the put came later. See: HBASE-9905, HBASE-8770 Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9917) Fix it so Default Connection Pool does not spin up max threads even when not needed
[ https://issues.apache.org/jira/browse/HBASE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816965#comment-13816965 ] Hadoop QA commented on HBASE-9917: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612744/pool.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:red}-1 core tests{color}. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7786//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7786//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7786//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7786//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7786//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7786//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7786//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7786//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7786//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7786//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7786//console This message is automatically generated. 
Fix it so Default Connection Pool does not spin up max threads even when not needed --- Key: HBASE-9917 URL: https://issues.apache.org/jira/browse/HBASE-9917 Project: HBase Issue Type: Sub-task Components: Client Reporter: stack Assignee: stack Fix For: 0.98.0, 0.96.1 Attachments: pool.txt Testing, I noticed that if we use the HConnection executor service, as opposed to the executor service created when you construct an HTable without passing in a connection (i.e. HConnectionManager.createConnection(config).getTable(tableName) vs. HTable(config, tableName)), then we spin up the max of 256 threads and they just hang around even though they are not being used. We are encouraging HConnection#getTable over new HTable, so this is worth fixing. -- This message was sent by Atlassian JIRA (v6.1#6144)
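For reference, a sketch of the two construction paths being compared, assuming the 0.96-era client API (the table name is illustrative):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.HTableInterface;

public class TwoClientPaths {
  public static void main(String[] args) throws Exception {
    Configuration config = HBaseConfiguration.create();

    // Path 1: table from a shared HConnection. The connection's internal
    // executor is what was observed to spin up to its 256-thread max.
    HConnection connection = HConnectionManager.createConnection(config);
    HTableInterface fromConnection = connection.getTable("mytable");

    // Path 2: standalone HTable, which builds its own executor and did
    // not show the same idle-thread pile-up.
    HTable standalone = new HTable(config, "mytable");

    fromConnection.close();
    standalone.close();
    connection.close();
  }
}
{code}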
[jira] [Updated] (HBASE-9818) NPE in HFileBlock#AbstractFSReader#readAtOffset
[ https://issues.apache.org/jira/browse/HBASE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-9818: --- Assignee: Ted Yu (was: Jimmy Xiang) NPE in HFileBlock#AbstractFSReader#readAtOffset --- Key: HBASE-9818 URL: https://issues.apache.org/jira/browse/HBASE-9818 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Ted Yu Attachments: 9818-trial.txt, 9818-v2.txt, 9818-v3.txt, 9818-v4.txt, 9818-v5.txt, trunk-9818.patch, trunk-9818_v1.1.patch HFileBlock#istream seems to be null. I was wondering whether we should hide FSDataInputStreamWrapper#useHBaseChecksum. By the way, this happened while online schema change (encoding) was enabled:
{noformat}
2013-10-22 10:58:43,321 ERROR [RpcServer.handler=28,port=36020] regionserver.HRegionServer: java.lang.NullPointerException
  at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1200)
  at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1436)
  at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1318)
  at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:359)
  at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254)
  at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:503)
  at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:553)
  at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:245)
  at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:166)
  at org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:361)
  at org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:336)
  at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:293)
  at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:258)
  at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:603)
  at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:476)
  at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:129)
  at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3546)
  at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3616)
  at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3494)
  at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3485)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3079)
  at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022)
  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979)
  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90)
  at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
  at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
  at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
  at java.lang.Thread.run(Thread.java:724)
2013-10-22 10:58:43,665 ERROR [RpcServer.handler=23,port=36020] regionserver.HRegionServer: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 53438 But the nextCallSeq got from client: 53437; request=scanner_id: 1252577470624375060 number_of_rows: 100 close_scanner: false next_call_seq: 53437
  at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3030)
  at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022)
  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979)
  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90)
  at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
  at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
  at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
  at java.lang.Thread.run(Thread.java:724)
{noformat}
-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9775) Client write path perf issues
[ https://issues.apache.org/jira/browse/HBASE-9775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816984#comment-13816984 ] stack commented on HBASE-9775: -- So far it seems that sharing a connection is better than a connection per thread in YCSB. With 8 clients into a 5-node cluster, sharing the connection is almost 50% faster. Will dig in. Client write path perf issues - Key: HBASE-9775 URL: https://issues.apache.org/jira/browse/HBASE-9775 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.96.0 Reporter: Elliott Clark Priority: Critical Attachments: 9775.rig.txt, 9775.rig.v2.patch, 9775.rig.v3.patch, Charts Search Cloudera Manager - ITBLL.png, Charts Search Cloudera Manager.png, hbase-9775.patch, job_run.log, short_ycsb.png, ycsb.png, ycsb_insert_94_vs_96.png Testing on larger clusters has not shown the desired throughput increases. -- This message was sent by Atlassian JIRA (v6.1#6144)
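For context, a sketch of the shared-connection configuration being measured, assuming the 0.96-era client API; the thread count and table name are illustrative:
{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTableInterface;

public class SharedConnectionClients {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // One HConnection multiplexed by all worker threads, as opposed to
    // each thread creating its own connection.
    final HConnection shared = HConnectionManager.createConnection(conf);
    ExecutorService workers = Executors.newFixedThreadPool(8);
    for (int i = 0; i < 8; i++) {
      workers.submit(new Runnable() {
        public void run() {
          try {
            HTableInterface table = shared.getTable("usertable");
            // ... issue puts/gets here, as the YCSB workload would ...
            table.close();
          } catch (Exception e) {
            e.printStackTrace();
          }
        }
      });
    }
    workers.shutdown();
  }
}
{code}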
[jira] [Commented] (HBASE-9818) NPE in HFileBlock#AbstractFSReader#readAtOffset
[ https://issues.apache.org/jira/browse/HBASE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816983#comment-13816983 ] Jimmy Xiang commented on HBASE-9818: Assigned back to Ted since he is still looking into it. NPE in HFileBlock#AbstractFSReader#readAtOffset --- Key: HBASE-9818 URL: https://issues.apache.org/jira/browse/HBASE-9818 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Ted Yu Attachments: 9818-trial.txt, 9818-v2.txt, 9818-v3.txt, 9818-v4.txt, 9818-v5.txt, trunk-9818.patch, trunk-9818_v1.1.patch HFileBlock#istream seems to be null. I was wondering whether we should hide FSDataInputStreamWrapper#useHBaseChecksum. By the way, this happened while online schema change (encoding) was enabled:
{noformat}
2013-10-22 10:58:43,321 ERROR [RpcServer.handler=28,port=36020] regionserver.HRegionServer: java.lang.NullPointerException
  at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1200)
  at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1436)
  at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1318)
  at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:359)
  at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254)
  at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:503)
  at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:553)
  at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:245)
  at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:166)
  at org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:361)
  at org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:336)
  at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:293)
  at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:258)
  at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:603)
  at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:476)
  at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:129)
  at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3546)
  at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3616)
  at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3494)
  at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3485)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3079)
  at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022)
  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979)
  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90)
  at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
  at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
  at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
  at java.lang.Thread.run(Thread.java:724)
2013-10-22 10:58:43,665 ERROR [RpcServer.handler=23,port=36020] regionserver.HRegionServer: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 53438 But the nextCallSeq got from client: 53437; request=scanner_id: 1252577470624375060 number_of_rows: 100 close_scanner: false next_call_seq: 53437
  at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3030)
  at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022)
  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979)
  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90)
  at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
  at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
  at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
  at java.lang.Thread.run(Thread.java:724)
{noformat}
[jira] [Updated] (HBASE-9917) Fix it so Default Connection Pool does not spin up max threads even when not needed
[ https://issues.apache.org/jira/browse/HBASE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-9917: - Attachment: 9917.txt Retry. Starts with zero threads; lets the client ramp up rather than keep 8 idle threads when nothing is going on. Fix it so Default Connection Pool does not spin up max threads even when not needed --- Key: HBASE-9917 URL: https://issues.apache.org/jira/browse/HBASE-9917 Project: HBase Issue Type: Sub-task Components: Client Reporter: stack Assignee: stack Fix For: 0.98.0, 0.96.1 Attachments: 9917.txt, pool.txt Testing, I noticed that if we use the HConnection executor service, as opposed to the executor service created when you construct an HTable without passing in a connection (i.e. HConnectionManager.createConnection(config).getTable(tableName) vs. HTable(config, tableName)), then we spin up the max of 256 threads and they just hang around even though they are not being used. We are encouraging HConnection#getTable over new HTable, so this is worth fixing. -- This message was sent by Atlassian JIRA (v6.1#6144)
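Conceptually, the change amounts to a pool that starts empty and grows on demand; a minimal sketch of that behavior (an assumption about the approach, not the literal contents of 9917.txt):
{code}
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RampUpPool {
  static ThreadPoolExecutor newPool() {
    // Zero core threads: nothing idles at startup. Threads are created
    // only as work arrives (the SynchronousQueue hand-off forces growth),
    // up to the old 256 ceiling, and idle threads die after 60 seconds.
    return new ThreadPoolExecutor(0, 256, 60L, TimeUnit.SECONDS,
        new SynchronousQueue<Runnable>());
  }
}
{code}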
[jira] [Created] (HBASE-9922) Need to delete a row based on the column name/value (not the row key)... please provide the delete query for the same...
ranjini created HBASE-9922: -- Summary: Need to delete a row based on the column name/value (not the row key)... please provide the delete query for the same... Key: HBASE-9922 URL: https://issues.apache.org/jira/browse/HBASE-9922 Project: HBase Issue Type: Bug Reporter: ranjini -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-9923) In HBASE we need to delete a row based on the column name/value (not using the row key)...
ranjini created HBASE-9923: -- Summary: In HBASE we need to delete a row based on the column name/value (not using the row key)... Key: HBASE-9923 URL: https://issues.apache.org/jira/browse/HBASE-9923 Project: HBase Issue Type: Task Reporter: ranjini -- This message was sent by Atlassian JIRA (v6.1#6144)
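HBASE-9922 and HBASE-9923 ask the same how-to question. HBase has no server-side "delete where column equals value"; the usual pattern is to scan with a value filter and delete each matching row by its key. A minimal sketch with the 0.96-era client API; the table, family, qualifier, and value names are illustrative:
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class DeleteByColumnValue {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");
    try {
      // Find rows whose cf:col equals the target value...
      Scan scan = new Scan();
      scan.setFilter(new SingleColumnValueFilter(
          Bytes.toBytes("cf"), Bytes.toBytes("col"),
          CompareOp.EQUAL, Bytes.toBytes("target-value")));
      ResultScanner scanner = table.getScanner(scan);
      // ...then delete each one by the row key the scan returned.
      for (Result r : scanner) {
        table.delete(new Delete(r.getRow()));
      }
      scanner.close();
    } finally {
      table.close();
    }
  }
}
{code}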
[jira] [Commented] (HBASE-9900) Fix unintended byte[].toString in AccessController
[ https://issues.apache.org/jira/browse/HBASE-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817002#comment-13817002 ] Hudson commented on HBASE-9900: --- SUCCESS: Integrated in hbase-0.96 #183 (See [https://builds.apache.org/job/hbase-0.96/183/]) HBASE-9900. Fix unintended byte[].toString in AccessController (apurtell: rev 1539883)
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/TableAuthManager.java
Fix unintended byte[].toString in AccessController -- Key: HBASE-9900 URL: https://issues.apache.org/jira/browse/HBASE-9900 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 0.96.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.98.0, 0.96.1 Attachments: 9900.patch Found while running FindBugs for another change. -- This message was sent by Atlassian JIRA (v6.1#6144)
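For context, the bug class being fixed here: calling toString() directly on a byte[] yields the array's identity string rather than its contents. A tiny illustration (the value is illustrative, not the actual TableAuthManager code):
{code}
import org.apache.hadoop.hbase.util.Bytes;

public class ByteArrayToString {
  public static void main(String[] args) {
    byte[] family = Bytes.toBytes("info");
    String wrong = family.toString();       // "[B@1b6d3586"-style identity string
    String right = Bytes.toString(family);  // "info": the decoded contents
    System.out.println(wrong + " vs " + right);
  }
}
{code}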
[jira] [Commented] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817001#comment-13817001 ] Hudson commented on HBASE-9906: --- SUCCESS: Integrated in hbase-0.96 #183 (See [https://builds.apache.org/job/hbase-0.96/183/]) HBASE-9906 Restore snapshot fails to restore the meta edits sporadically (enis: rev 1539907)
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/RestoreSnapshotHandler.java
Restore snapshot fails to restore the meta edits sporadically --- Key: HBASE-9906 URL: https://issues.apache.org/jira/browse/HBASE-9906 Project: HBase Issue Type: Bug Components: snapshots Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: hbase-9906-0.94_v1.patch, hbase-9906_v1.patch After snapshot restore, we see failures to find the table in meta:
{code}
disable 'tablefour'
restore_snapshot 'snapshot_tablefour'
enable 'tablefour'
ERROR: Table tablefour does not exist.
{code}
This is quite subtle. From the looks of it, we successfully restore the snapshot, do the meta updates, and return the status to the client. The client then tries to do an operation on the table (like enable table, or a scan in the test outputs), which fails because the meta entry for the region is gone (in the single-region case, the whole table is reported missing). Subsequent attempts to create the table also fail because the table directories are still there, but the meta entries are not. For restoring meta entries, we are doing a delete then a put to the same region:
{code}
2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: 76d0e2b7ec3291afcaa82e18a56ccc30
2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: fa41edf43fe3ee131db4a34b848ff432
...
2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted [{ENCODED => fa41edf43fe3ee131db4a34b848ff432, NAME => 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY => '', ENDKEY => ''}, {ENCODED => 76d0e2b7ec3291afcaa82e18a56ccc30, NAME => 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE
2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added 1
{code}
The root cause of this sporadic failure is that the delete and the subsequent put carry the same timestamp when they execute within the same millisecond, and at equal timestamps the delete masks the put, even though the put was written later. See: HBASE-9905, HBASE-8770. Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817028#comment-13817028 ] Hudson commented on HBASE-9906: --- SUCCESS: Integrated in HBase-TRUNK #4673 (See [https://builds.apache.org/job/HBase-TRUNK/4673/]) HBASE-9906 Restore snapshot fails to restore the meta edits sporadically (enis: rev 1539906)
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/RestoreSnapshotHandler.java
Restore snapshot fails to restore the meta edits sporadically --- Key: HBASE-9906 URL: https://issues.apache.org/jira/browse/HBASE-9906 Project: HBase Issue Type: Bug Components: snapshots Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.98.0, 0.96.1, 0.94.14 Attachments: hbase-9906-0.94_v1.patch, hbase-9906_v1.patch After snapshot restore, we see failures to find the table in meta:
{code}
disable 'tablefour'
restore_snapshot 'snapshot_tablefour'
enable 'tablefour'
ERROR: Table tablefour does not exist.
{code}
This is quite subtle. From the looks of it, we successfully restore the snapshot, do the meta updates, and return the status to the client. The client then tries to do an operation on the table (like enable table, or a scan in the test outputs), which fails because the meta entry for the region is gone (in the single-region case, the whole table is reported missing). Subsequent attempts to create the table also fail because the table directories are still there, but the meta entries are not. For restoring meta entries, we are doing a delete then a put to the same region:
{code}
2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: 76d0e2b7ec3291afcaa82e18a56ccc30
2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: fa41edf43fe3ee131db4a34b848ff432
...
2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted [{ENCODED => fa41edf43fe3ee131db4a34b848ff432, NAME => 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY => '', ENDKEY => ''}, {ENCODED => 76d0e2b7ec3291afcaa82e18a56ccc30, NAME => 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE
2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added 1
{code}
The root cause of this sporadic failure is that the delete and the subsequent put carry the same timestamp when they execute within the same millisecond, and at equal timestamps the delete masks the put, even though the put was written later. See: HBASE-9905, HBASE-8770. Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)