[jira] [Commented] (HBASE-3581) hbase rpc should send size of response
[ https://issues.apache.org/jira/browse/HBASE-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130383#comment-13130383 ]

Ted Yu commented on HBASE-3581:
-------------------------------

If I understand the patch correctly, I think the following addendum should be applied:
{code}
Index: src/main/java/org/apache/hadoop/hbase/ipc/ResponseFlag.java
===================================================================
--- src/main/java/org/apache/hadoop/hbase/ipc/ResponseFlag.java (revision 1185920)
+++ src/main/java/org/apache/hadoop/hbase/ipc/ResponseFlag.java (working copy)
@@ -42,6 +42,6 @@
   }

   static byte getErrorAndLengthSet() {
-    return LENGTH_BIT & ERROR_BIT;
+    return LENGTH_BIT | ERROR_BIT;
   }
 }
{code}

hbase rpc should send size of response
--------------------------------------

Key: HBASE-3581
URL: https://issues.apache.org/jira/browse/HBASE-3581
Project: HBase
Issue Type: Improvement
Reporter: ryan rawson
Assignee: stack
Priority: Critical
Fix For: 0.92.0
Attachments: 3581-v2.txt, 3581-v3.txt, 3581-v4.txt, HBASE-rpc-response.txt

The RPC reply from Server->Client does not include the size of the payload; it is framed like so:

i32 callId
byte errorFlag
byte[] data

The data segment would contain enough info about how big the response is so that it could be decoded by a writable reader.

This makes it difficult to write buffering clients, which might read the entire 'data' and then pass it to a decoder. While less memory efficient, if you want to easily write block-read clients (e.g. nio), it would be necessary to send the size along so that the client could snarf into a local buf.

The new proposal is:

i32 callId
i32 size
byte errorFlag
byte[] data

the size being sizeof(data) + sizeof(errorFlag).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
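To make the proposed framing in HBASE-3581 concrete, here is a minimal sketch of how a buffering client could use the size field to pull a whole response into a local buffer before decoding it. This is not the HBase client code: the class and method names (FramedResponse, encode, decode) are made up for illustration, and the sketch assumes big-endian i32 values as written by java.io.DataOutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Hypothetical illustration of the proposed frame: i32 callId, i32 size, byte errorFlag, byte[] data. */
final class FramedResponse {
  final int callId;
  final byte errorFlag;
  final byte[] data;

  FramedResponse(int callId, byte errorFlag, byte[] data) {
    this.callId = callId;
    this.errorFlag = errorFlag;
    this.data = data;
  }

  /** Writes one frame; size = sizeof(data) + sizeof(errorFlag), as in the proposal. */
  static byte[] encode(int callId, byte errorFlag, byte[] data) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    out.writeInt(callId);
    out.writeInt(data.length + 1);
    out.writeByte(errorFlag);
    out.write(data);
    out.flush();
    return bytes.toByteArray();
  }

  /** Reads one frame; knowing the size up front lets the client buffer the body in one read. */
  static FramedResponse decode(DataInputStream in) throws IOException {
    int callId = in.readInt();
    int size = in.readInt();
    if (size < 1) {
      throw new IOException("bad size: " + size);
    }
    byte[] body = new byte[size];
    in.readFully(body);
    // The first byte of the body is the error flag; the rest is the payload.
    byte[] data = new byte[size - 1];
    System.arraycopy(body, 1, data, 0, data.length);
    return new FramedResponse(callId, body[0], data);
  }
}
```

With the size known before the body is read, a non-blocking client can wait until that many bytes have arrived and only then hand the buffer to a decoder.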
[jira] [Commented] (HBASE-3581) hbase rpc should send size of response
[ https://issues.apache.org/jira/browse/HBASE-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130413#comment-13130413 ]

gaojinchao commented on HBASE-3581:
-----------------------------------

Ted's analysis makes sense. Our local test cases always fail; I will apply Ted's patch and run all test cases.
[jira] [Commented] (HBASE-3581) hbase rpc should send size of response
[ https://issues.apache.org/jira/browse/HBASE-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130416#comment-13130416 ]

Jonathan Gray commented on HBASE-3581:
--------------------------------------

should rename method to getErrorOrLengthSet()?
[jira] [Updated] (HBASE-3581) hbase rpc should send size of response
[ https://issues.apache.org/jira/browse/HBASE-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3581:
-------------------------

Comment: was deleted

(was: If I understand the patch correctly, I think the following addendum should be applied:
{code}
Index: src/main/java/org/apache/hadoop/hbase/ipc/ResponseFlag.java
===================================================================
--- src/main/java/org/apache/hadoop/hbase/ipc/ResponseFlag.java (revision 1185920)
+++ src/main/java/org/apache/hadoop/hbase/ipc/ResponseFlag.java (working copy)
@@ -42,6 +42,6 @@
   }

   static byte getErrorAndLengthSet() {
-    return LENGTH_BIT & ERROR_BIT;
+    return LENGTH_BIT | ERROR_BIT;
   }
 }
{code})
[jira] [Commented] (HBASE-3581) hbase rpc should send size of response
[ https://issues.apache.org/jira/browse/HBASE-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130639#comment-13130639 ]

ramkrishna.s.vasudevan commented on HBASE-3581:
-----------------------------------------------

{code}
+  static byte getErrorAndLengthSet() {
+    return LENGTH_BIT & ERROR_BIT;
+  }
{code}
This will always make the flag = 0.

{code}
+  static boolean isError(final byte flag) {
+    return (flag & ERROR_BIT) != 0;
+  }
{code}

{code}
+    // Read the flag byte
+    byte flag = in.readByte();
+    boolean isError = ResponseFlag.isError(flag);
+    if (ResponseFlag.isLength(flag)) {
+      // Currently length if present is unused.
{code}
If the flag that we are reading here is the same flag that we have set while writing then the isError() method will return true only because
flag = 0
ERROR_BIT = 0x1

So even if we make it
{code}
+    return LENGTH_BIT | ERROR_BIT;
   }
{code}
will we have any change?

If the flag that we try to read is not the one that we have set then the
{code}
+    return LENGTH_BIT | ERROR_BIT;
   }
{code}
is valid.

Correct me if I am wrong.
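The flag arithmetic discussed in this thread can be checked in isolation. The sketch below is a hypothetical stand-in for the helpers quoted above, not the committed ResponseFlag class. It takes ERROR_BIT = 0x1 from the comment and assumes LENGTH_BIT is some other single bit (0x2 here). It also assumes the committed method combined the two bits with a bitwise AND, which is consistent with the observation that the flag is always 0.

```java
/** Hypothetical stand-in for the flag helpers quoted in the thread. */
final class ResponseFlagSketch {
  static final byte ERROR_BIT = 0x1;
  static final byte LENGTH_BIT = 0x2; // assumed value; it only needs to be a different bit

  static boolean isError(byte flag) {
    return (flag & ERROR_BIT) != 0;
  }

  static boolean isLength(byte flag) {
    return (flag & LENGTH_BIT) != 0;
  }

  /** AND of two distinct single-bit masks is always 0, so neither bit ends up set. */
  static byte andOfBits() {
    return (byte) (LENGTH_BIT & ERROR_BIT);
  }

  /** OR sets both bits, which is what a method named getErrorAndLengthSet() needs. */
  static byte orOfBits() {
    return (byte) (LENGTH_BIT | ERROR_BIT);
  }
}
```

Under these assumptions a reader that receives the AND-ed flag sees neither an error nor a length, while the OR-ed flag reports both.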
[jira] [Updated] (HBASE-4585) Avoid next operations (and instead reseek) when current kv is deleted
[ https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kannan Muthukkaruppan updated HBASE-4585:
-----------------------------------------

Description:
When the current kv is deleted during the matching in the ScanQueryMatcher, currently the matcher will return skip and continue to do next() instead of reseeking to the next column of interest.
Actually, if the current kv is deleted because of family deleted or column deleted, the matcher should seek to next col.
If the current kv is deleted because of version deleted, the matcher should just return skip.

was:
When the current kv is deleted during the matching in the ScanQueryMatcher, currently the matcher will return skip and continue to seek.
Actually, if the current kv is deleted because of family deleted or column deleted, the matcher should seek to next col.
If the current kv is deleted because of version deleted, the matcher should just return skip.

Summary: Avoid next operations (and instead reseek) when current kv is deleted (was: Avoid seek operation when current kv is deleted)

updated title and description of bug a little bit.

Avoid next operations (and instead reseek) when current kv is deleted
---------------------------------------------------------------------

Key: HBASE-4585
URL: https://issues.apache.org/jira/browse/HBASE-4585
Project: HBase
Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
Fix For: 0.94.0
Attachments: hbase-4585-89.patch, hbase-4585-apache-trunk.patch
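The behaviour requested in HBASE-4585 amounts to a small decision table: which kind of delete masks the current kv determines whether the matcher skips one kv or seeks past the column. The sketch below is an illustration only. The enum and method names are made up for this example and are not taken from the attached patches.

```java
/** Hypothetical sketch of the decision described in HBASE-4585; not the ScanQueryMatcher code. */
final class DeleteMatchSketch {
  enum DeleteKind { NOT_DELETED, FAMILY_DELETED, COLUMN_DELETED, VERSION_DELETED }

  enum MatchCode { INCLUDE, SKIP, SEEK_NEXT_COL }

  static MatchCode match(DeleteKind kind) {
    switch (kind) {
      case FAMILY_DELETED:
      case COLUMN_DELETED:
        // The remaining versions of this column are covered by the same delete,
        // so calling next() on each of them is wasted work: reseek instead.
        return MatchCode.SEEK_NEXT_COL;
      case VERSION_DELETED:
        // Only this one version is masked; other versions may still be visible.
        return MatchCode.SKIP;
      default:
        return MatchCode.INCLUDE;
    }
  }
}
```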
[jira] [Created] (HBASE-4618) HBase backups
HBase backups
-------------

Key: HBASE-4618
URL: https://issues.apache.org/jira/browse/HBASE-4618
Project: HBase
Issue Type: Umbrella
Components: documentation, regionserver
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

We have been working on the ability to do backups in HBase with different levels of protection. This is an umbrella task for all the backup related changes.

Roughly here are a few flavors of backups giving increasing levels of guarantees:
1. Per cf backups
2. Multi-cf backups with row atomicity preserved
3. Multi-cf backups with row atomicity and point in time recovery.

On the perf dimension, here is a list of improvements:
1. Copy the files - regular hadoop cp
2. Use fast copy - copy blocks and stitch them together, saves top of rack bandwidth
3. Use fast copy with hard links - no file copy, it does only ext3 level linking.
[jira] [Updated] (HBASE-4618) HBase backups
[ https://issues.apache.org/jira/browse/HBASE-4618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Ranganathan updated HBASE-4618:
---------------------------------------

Description:
We have been working on the ability to do backups in HBase with different levels of protection. This is an umbrella task for all the backup related changes. Here are some kinds of changes - will create separate issues for them:

Roughly here are a few flavors of backups giving increasing levels of guarantees:
1. Per cf backups
2. Multi-cf backups with row atomicity preserved
3. Multi-cf backups with row atomicity and point in time recovery.

On the perf dimension, here is a list of improvements:
1. Copy the files - regular hadoop cp
2. Use fast copy - copy blocks and stitch them together, saves top of rack bandwidth
3. Use fast copy with hard links - no file copy, it does only ext3 level linking.

On the durability of data side:
1. Ability to backup data onto the same racks as those running HBase
2. Intra-datacenter backup
3. Inter datacenter backup

Restores:
1. Restore with a table name different from the backed up table name
2. Restore a backed up table when HBase cluster is not running at restore time
3. Restore into a live and running cluster

Operationally:
1. How to setup backups in live cluster
2. Setting up intra-DC
3. cross-DC backups
4. Verifying a backup is good
[jira] [Commented] (HBASE-3581) hbase rpc should send size of response
[ https://issues.apache.org/jira/browse/HBASE-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130762#comment-13130762 ]

stack commented on HBASE-3581:
------------------------------

Should I open a new issue?
[jira] [Commented] (HBASE-3581) hbase rpc should send size of response
[ https://issues.apache.org/jira/browse/HBASE-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130761#comment-13130761 ]

stack commented on HBASE-3581:
------------------------------

Thanks lads for digging in on this. Stuff seemed to pass for me (I might have messed up tests though). Ram, looks like I made a mistake in the above if that's the test that I'm doing.
[jira] [Commented] (HBASE-3581) hbase rpc should send size of response
[ https://issues.apache.org/jira/browse/HBASE-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130765#comment-13130765 ]

Ted Yu commented on HBASE-3581:
-------------------------------

New issue would be nice.
[jira] [Created] (HBASE-4619) TableOutputFormat.close() interferes with HBase clients in same JVM
TableOutputFormat.close() interferes with HBase clients in same JVM
-------------------------------------------------------------------

Key: HBASE-4619
URL: https://issues.apache.org/jira/browse/HBASE-4619
Project: HBase
Issue Type: Bug
Reporter: Dave Revell

This appears in TableOutputFormat.java:
{code}
@Override
public void close(TaskAttemptContext context) throws IOException {
  table.flushCommits();
  // The following call will shutdown all connections to the cluster from
  // this JVM. It will close out our zk session otherwise zk wil log
  // expired sessions rather than closed ones. If any other HTable instance
  // running in this JVM, this next call will cause it damage. Presumption
  // is that the above this.table is only instance.
  HConnectionManager.deleteAllConnections(true);
}
{code}

It's not a safe assumption that a single TableOutputFormat is the only HBase client in a JVM.
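One common way to avoid tearing down connections that other clients in the JVM still use is reference counting: each user takes a reference, and the shared resource is closed only when the last one releases it. The sketch below shows that idea in isolation. It is a hypothetical illustration with made-up names, not the HConnectionManager API and not necessarily the approach taken in any HBase fix.

```java
/** Hypothetical reference-counted handle; illustrates the idea only. */
final class SharedConnectionSketch {
  private int users = 0;
  private boolean open = false;

  /** Each client that starts using the shared connection takes a reference. */
  synchronized void acquire() {
    users++;
    open = true;
  }

  /** The underlying connection is closed only when the last user releases it. */
  synchronized void release() {
    if (users == 0) {
      throw new IllegalStateException("release without acquire");
    }
    users--;
    if (users == 0) {
      open = false;
    }
  }

  synchronized boolean isOpen() {
    return open;
  }
}
```

With this shape, a close() in one output format releases only its own reference, so a second client in the same JVM keeps a working connection.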
[jira] [Commented] (HBASE-4619) TableOutputFormat.close() interferes with HBase clients in same JVM
[ https://issues.apache.org/jira/browse/HBASE-4619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130767#comment-13130767 ]

Ted Yu commented on HBASE-4619:
-------------------------------

This should be fixed by HBASE-4508
[jira] [Created] (HBASE-4620) I broke the build when I submitted HBASE-3581 (Send length of the rpc response)
I broke the build when I submitted HBASE-3581 (Send length of the rpc response)
-------------------------------------------------------------------------------

Key: HBASE-4620
URL: https://issues.apache.org/jira/browse/HBASE-4620
Project: HBase
Issue Type: Bug
Reporter: stack
Priority: Blocker
Fix For: 0.92.0

Thanks to Ted, Ram and Gao for figuring my messup.
[jira] [Updated] (HBASE-4620) I broke the build when I submitted HBASE-3581 (Send length of the rpc response)
[ https://issues.apache.org/jira/browse/HBASE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4620:
------------------------

Attachment: 4620.txt

Small patch
[jira] [Resolved] (HBASE-4620) I broke the build when I submitted HBASE-3581 (Send length of the rpc response)
[ https://issues.apache.org/jira/browse/HBASE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-4620.
--------------------------

Resolution: Fixed
Assignee: stack

Committed branch and trunk. Thanks Ted, Ram, and Gao for digging in on this one.
[jira] [Updated] (HBASE-4462) Properly treating SocketTimeoutException
[ https://issues.apache.org/jira/browse/HBASE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-4462:
------------------------------------------

Attachment: HBASE-4462_0.90.x.patch

Properly treating SocketTimeoutException
----------------------------------------

Key: HBASE-4462
URL: https://issues.apache.org/jira/browse/HBASE-4462
Project: HBase
Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Jean-Daniel Cryans
Assignee: ramkrishna.s.vasudevan
Fix For: 0.90.5
Attachments: HBASE-4462_0.90.x.patch

SocketTimeoutException is currently treated like any IOE inside of HCM.getRegionServerWithRetries and I think this is a problem. This method should only do retries in cases where we are pretty sure the operation will complete, but with STE we already waited for (by default) 60 seconds and nothing happened.

I found this while debugging Douglas Campbell's problem on the mailing list where it seemed like he was using the same scanner from multiple threads, but actually it was just the same client doing retries while the first run didn't even finish yet (that's another problem). You could see the first scanner, then up to two other handlers waiting for it to finish in order to run (because of the synchronization on RegionScanner).

So what should we do? We could treat STE as a DoNotRetryException and let the client deal with it, or we could retry only once.

There's also the option of having a different behavior for get/put/icv/scan, the issue with operations that modify a cell is that you don't know if the operation completed or not (same when a RS dies hard after completing let's say a Put but just before returning to the client).
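The first option raised in HBASE-4462, treating a socket timeout as non-retryable while still retrying other I/O errors, can be sketched as a small retry loop. This is only an illustration of that option under stated assumptions: RetrySketch and callWithRetries are hypothetical names, the loop has no backoff, and it is not the code of HCM.getRegionServerWithRetries or of the attached patch.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

/** Hypothetical retry helper that rethrows socket timeouts instead of retrying them. */
final class RetrySketch {
  static <T> T callWithRetries(Callable<T> call, int maxAttempts) throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return call.call();
      } catch (SocketTimeoutException ste) {
        // The first call may still be running on the server, so a retry
        // could simply queue up behind it. Let the caller decide instead.
        throw ste;
      } catch (IOException ioe) {
        // Other I/O failures are assumed transient and are retried.
        last = ioe;
      }
    }
    throw last;
  }
}
```

A per-operation policy (different handling for get, put, icv and scan, as the issue suggests) would replace the single catch clause with a check on the kind of call being made.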
[jira] [Commented] (HBASE-4462) Properly treating SocketTimeoutException
[ https://issues.apache.org/jira/browse/HBASE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130791#comment-13130791 ]

ramkrishna.s.vasudevan commented on HBASE-4462:
-----------------------------------------------

Backported HBASE-2937. PoolMap is not used. If I try to use it, the changes will be larger.
Submitting to get some reviews so that they can be incorporated.
[jira] [Updated] (HBASE-4462) Properly treating SocketTimeoutException
[ https://issues.apache.org/jira/browse/HBASE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-4462:
------------------------------------------

Status: Patch Available (was: Open)
[jira] [Resolved] (HBASE-4203) While master restarts and if the META region's state is OPENING then master cannot assign META until timeout monitor deducts
[ https://issues.apache.org/jira/browse/HBASE-4203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-4203. --- Resolution: Fixed This was covered in HBASE-4015
While master restarts and if the META region's state is OPENING then master cannot assign META until timeout monitor deducts Key: HBASE-4203 URL: https://issues.apache.org/jira/browse/HBASE-4203 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Priority: Minor
1. Start Master and 2 RS.
2. If any exception happens while opening the META region, the state in the znode will be OPENING.
3. If the master restarts at this point, it will start processing the regions in RIT.
4. If the znode is found to be in OPENING, the master waits for the timeout monitor to detect it and then calls opening.
5. If the default timeout monitor is configured (180 sec/30 min), it will take 30 mins to open the META region itself.
Soln: Better not to wait for the timeout monitor period to open catalog tables on master restart.
[jira] [Commented] (HBASE-4554) Allow set/unset coprocessor table attributes from shell.
[ https://issues.apache.org/jira/browse/HBASE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13130820#comment-13130820 ] Mingjie Lai commented on HBASE-4554: @Ted. As we originally proposed, we can add arbitrary content after coprocessor$. You suggested using only an (auto-generated) number after $. Either way should work.
Allow set/unset coprocessor table attributes from shell. Key: HBASE-4554 URL: https://issues.apache.org/jira/browse/HBASE-4554 Project: HBase Issue Type: Improvement Components: coprocessors Reporter: Mingjie Lai Assignee: Mingjie Lai Fix For: 0.92.0
Table/region level coprocessor -- RegionObserver -- can be configured by setting an HTD attribute which matches Coprocessor$*. The current shell command -- alter -- cannot set/unset an arbitrary table attribute. We need it in order to configure region level coprocessors on a table. Proposed new shell:
{code}
hbase shell> alter 't1', METHOD => 'table_att', COPROCESSOR$1 => 'hdfs://cp/foo.jar|org.apache.hadoop.hbase.sample|1|'
hbase shell> describe 't1'
{NAME => 't1', COPROCESSOR$1 => 'hdfs://cp/foo.jar|org.apache.hadoop.hbase.sample|1|', MAX_FILESIZE => '134217728', ...}
hbase shell> alter 't1', METHOD => 'table_att_unset', COPROCESSOR$1
hbase shell> describe 't1'
{NAME => 't1', MAX_FILESIZE => '134217728', ...}
{code}
[jira] [Commented] (HBASE-4620) I broke the build when I submitted HBASE-3581 (Send length of the rpc response)
[ https://issues.apache.org/jira/browse/HBASE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13130832#comment-13130832 ] Jonathan Gray commented on HBASE-4620: -- stack, doesn't the method name imply the existing behavior? should change the method name? I broke the build when I submitted HBASE-3581 (Send length of the rpc response) --- Key: HBASE-4620 URL: https://issues.apache.org/jira/browse/HBASE-4620 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.92.0 Attachments: 4620.txt Thanks to Ted, Ram and Gao for figuring my messup. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4620) I broke the build when I submitted HBASE-3581 (Send length of the rpc response)
[ https://issues.apache.org/jira/browse/HBASE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13130833#comment-13130833 ] Jonathan Gray commented on HBASE-4620: -- or this is meant to combine the two so the | is actually the right behavior for 'and'? hmm I broke the build when I submitted HBASE-3581 (Send length of the rpc response) --- Key: HBASE-4620 URL: https://issues.apache.org/jira/browse/HBASE-4620 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.92.0 Attachments: 4620.txt Thanks to Ted, Ram and Gao for figuring my messup. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4620) I broke the build when I submitted HBASE-3581 (Send length of the rpc response)
[ https://issues.apache.org/jira/browse/HBASE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13130839#comment-13130839 ] Ted Yu commented on HBASE-4620: --- Bitwise or matches the method name. I broke the build when I submitted HBASE-3581 (Send length of the rpc response) --- Key: HBASE-4620 URL: https://issues.apache.org/jira/browse/HBASE-4620 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.92.0 Attachments: 4620.txt Thanks to Ted, Ram and Gao for figuring my messup. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4620) I broke the build when I submitted HBASE-3581 (Send length of the rpc response)
[ https://issues.apache.org/jira/browse/HBASE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13130856#comment-13130856 ] stack commented on HBASE-4620: -- @Jon Name seems alright... Method is returning a byte where the two bits are set (Shouldn't have bothered w/ method in first place) I broke the build when I submitted HBASE-3581 (Send length of the rpc response) --- Key: HBASE-4620 URL: https://issues.apache.org/jira/browse/HBASE-4620 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.92.0 Attachments: 4620.txt Thanks to Ted, Ram and Gao for figuring my messup. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
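To make the point of this thread concrete, here is a minimal sketch of a flag byte in which two single-bit masks are combined with bitwise OR. The class name and the bit values are assumptions chosen for illustration; the real values in ResponseFlag.java are not shown in this thread.

```java
/** Hypothetical sketch of a response flag byte; bit values are assumed, not taken from the patch. */
final class ResponseFlagSketch {
    static final byte ERROR_BIT = 0x1;   // assumed value
    static final byte LENGTH_BIT = 0x2;  // assumed value

    static boolean isError(byte flag) {
        return (flag & ERROR_BIT) != 0;
    }

    static boolean isLength(byte flag) {
        return (flag & LENGTH_BIT) != 0;
    }

    /** Returns a byte with both bits set: OR combines the masks. */
    static byte getErrorAndLengthSet() {
        // Compile-time constant that fits in a byte, so no cast is needed.
        return LENGTH_BIT | ERROR_BIT;
    }
}
```

The method name says "error and length set", and OR is what sets both bits; a bitwise AND of two distinct single-bit masks would yield 0, i.e. neither bit set.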
[jira] [Commented] (HBASE-4532) Avoid top row seek by dedicated bloom filter for delete family bloom filter
[ https://issues.apache.org/jira/browse/HBASE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13130883#comment-13130883 ] jirapos...@reviews.apache.org commented on HBASE-4532: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2393/#review2615 ---
Hi Liyin-- find the first pass review comments inlined. Haven't reviewed the test changes yet. Looking fwd to this optimization landing.

src/main/java/org/apache/hadoop/hbase/KeyValue.java https://reviews.apache.org/r/2393/#comment5908
remove qualifier from the comment, since all we are passing here is row and family (no column name).

src/main/java/org/apache/hadoop/hbase/KeyValue.java https://reviews.apache.org/r/2393/#comment5909
* remove qualifier from comment here too.
* 80 char issue

src/main/java/org/apache/hadoop/hbase/KeyValue.java https://reviews.apache.org/r/2393/#comment5910
remove qualifier param.

src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java https://reviews.apache.org/r/2393/#comment5913
80 char issues.

src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java https://reviews.apache.org/r/2393/#comment5914
can we enhance HFilePrettyPrinter to report info about the DeleteBloomFilter as well (provided HFile is V2)?

src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java https://reviews.apache.org/r/2393/#comment5985
"even though" -> "even if"

src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java https://reviews.apache.org/r/2393/#comment5987
what's the difference between getPath() (in line 1027) and this.writer.getPath()? Did you mean to log the general delete Bloom filter instead?

src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java https://reviews.apache.org/r/2393/#comment6015
not clear where you are using this -1 state

src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java https://reviews.apache.org/r/2393/#comment6017
"or this is no" -> "or there is no"

src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java https://reviews.apache.org/r/2393/#comment6016
To make sure I understand this... for the HFileV1 case, or for HFileV2 without this fix, I am guessing deleteFamilyCnt will be equal to -1, and the fact that it doesn't have a bloomFilter will cause it to return true. That looks fine. Just not obvious.

src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java https://reviews.apache.org/r/2393/#comment6014
space between cnt !=

src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java https://reviews.apache.org/r/2393/#comment6020
did you intend to initialize bloomTypeLog here as well?

src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java https://reviews.apache.org/r/2393/#comment6021
bloomTypeLog is only initialized for the GeneralBloomFilter case. If that's the intent, why not move the logging near line 1382?

src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java https://reviews.apache.org/r/2393/#comment6027
In case there is a deleteFamily kv, there are two sub-cases here... a) we have a ROWCOL bloom (in which case there is no DeleteFamilyBloomFilter) and we want to use the ROWCOL bloom filter itself. b) we have a DeleteFamilyBloomFilter. I don't see us taking advantage of (a) like we used to earlier. Isn't this a regression for the ROWCOL bloom case? And if so, TestBlocksRead should have caught it, no?

src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java https://reviews.apache.org/r/2393/#comment6023
isSeekToEmptyColumn and useBloom should be separate flags I think.
For example, if the CF had ROWCOL bloom, and the query for looking for row/0-length column, then with this change, we won't use the ROWCOL bloom filter even when it exists. Isn't it the case that we want to avoid using only the deleteFamilyBloom filter when isSeekToEmptyColumn is true? - Kannan On 2011-10-18 20:38:41, Liyin Tang wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2393/ bq. --- bq. bq. (Updated 2011-10-18 20:38:41) bq. bq. bq. Review request for hbase, Dhruba Borthakur, Michael Stack, Mikhail Bautin, Pritam Damania, Prakash Khemani, Amitanand Aiyer, Kannan Muthukkaruppan, Jerry Chen, Liyin Tang, Karthik Ranganathan, and Nicolas Spiegelberg. bq. bq. bq. Summary bq. --- bq. bq. HBASE-4469 avoids the top row seek operation if row-col bloom filter is enabled. bq. This jira tries to
[jira] [Commented] (HBASE-3581) hbase rpc should send size of response
[ https://issues.apache.org/jira/browse/HBASE-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13130887#comment-13130887 ] Hudson commented on HBASE-3581: --- Integrated in HBase-TRUNK #2340 (See [https://builds.apache.org/job/HBase-TRUNK/2340/]) HBASE-4620 I broke the build when I submitted HBASE-3581 (Send length of the rpc response) stack : Files : * /hbase/trunk/CHANGES.txt * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/ResponseFlag.java hbase rpc should send size of response -- Key: HBASE-3581 URL: https://issues.apache.org/jira/browse/HBASE-3581 Project: HBase Issue Type: Improvement Reporter: ryan rawson Assignee: stack Priority: Critical Fix For: 0.92.0 Attachments: 3581-v2.txt, 3581-v3.txt, 3581-v4.txt, HBASE-rpc-response.txt The RPC reply from Server-Client does not include the size of the payload, it is framed like so: i32 callId byte errorFlag byte[] data The data segment would contain enough info about how big the response is so that it could be decoded by a writable reader. This makes it difficult to write buffering clients, who might read the entire 'data' then pass it to a decoder. While less memory efficient, if you want to easily write block read clients (eg: nio) it would be necessary to send the size along so that the client could snarf into a local buf. The new proposal is: i32 callId i32 size byte errorFlag byte[] data the size being sizeof(data) + sizeof(errorFlag). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
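The proposed framing in the description (i32 callId, i32 size, byte errorFlag, byte[] data, with size = sizeof(data) + sizeof(errorFlag)) lets a client read one whole response into a local buffer before decoding it. A minimal sketch of that idea follows; FramedResponseSketch and its methods are hypothetical names, and this is not the HBase RPC code as committed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

/** Hypothetical sketch of the proposed frame: i32 callId, i32 size, byte errorFlag, byte[] data. */
final class FramedResponseSketch {
    final int callId;
    final byte errorFlag;
    final byte[] data;

    FramedResponseSketch(int callId, byte errorFlag, byte[] data) {
        this.callId = callId;
        this.errorFlag = errorFlag;
        this.data = data;
    }

    /** Writes one frame; size covers the error flag plus the payload. */
    static byte[] encode(int callId, byte errorFlag, byte[] data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(callId);
        out.writeInt(data.length + 1); // sizeof(data) + sizeof(errorFlag)
        out.writeByte(errorFlag);
        out.write(data);
        out.flush();
        return bytes.toByteArray();
    }

    /** Reads one frame: the size prefix tells us exactly how many bytes to buffer. */
    static FramedResponseSketch decode(DataInputStream in) throws IOException {
        int callId = in.readInt();
        int size = in.readInt();
        if (size < 1) {
            throw new IOException("invalid frame size: " + size);
        }
        byte[] frame = new byte[size];
        in.readFully(frame); // buffer the whole response before decoding
        byte[] data = Arrays.copyOfRange(frame, 1, size);
        return new FramedResponseSketch(callId, frame[0], data);
    }
}
```

Without the size field, the reader has to parse the payload itself to find out where the response ends, which is what makes block-read (e.g. nio) clients awkward to write.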
[jira] [Assigned] (HBASE-4621) TestAvroServer fails quite often intermittently
[ https://issues.apache.org/jira/browse/HBASE-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash Ashok reassigned HBASE-4621: -- Assignee: Akash Ashok TestAvroServer fails quite often intermittently --- Key: HBASE-4621 URL: https://issues.apache.org/jira/browse/HBASE-4621 Project: HBase Issue Type: Bug Reporter: Akash Ashok Assignee: Akash Ashok -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4436) Remove methods deprecated in 0.90 from TRUNK and 0.92
[ https://issues.apache.org/jira/browse/HBASE-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13130954#comment-13130954 ] Jonathan Hsieh commented on HBASE-4436: ---
Already gone:
* HBaseClusterTestCase (HBASE-4503)
* HServerLoad.addRegionInfo (HBASE-1502)
Trivial removals:
* MultiPut*
* KeyValue.createFirstOnRow
* Get.addColumns(byte[][] columns)
* Put.add(byte[] column, long ts, byte[] value)
* Delete.deleteColumns(byte[] column)
* Delete.deleteColumns(byte[] column, long ts)
* HBaseAdmin.modifyColumn(.., columnName, ..)
* HColumnDescriptor.CompressionType enum
* HConnectionManager.processBatchOfPuts / HConnection.processBatchOfPuts
* Result.sorted()
Things that require a little work (touches many places or requires some code; will make separate sub-jiras):
* RemoteExceptionHandler class (15 refs)
* Scan methods (4 refs - might have a bug)
* HBaseTestCase class (47 references)
I didn't encounter any thrift related problems.
Remove methods deprecated in 0.90 from TRUNK and 0.92 - Key: HBASE-4436 URL: https://issues.apache.org/jira/browse/HBASE-4436 Project: HBase Issue Type: Task Reporter: stack Assignee: Jonathan Hsieh Priority: Critical Labels: noob Fix For: 0.92.0
Remove methods deprecated in 0.90 from codebase. I took a quick look. The messy bit is thrift referring to old stuff; it will take a little work to do the conversion over to the new methods.
[jira] [Created] (HBASE-4621) TestAvroServer fails quite often intermittently
TestAvroServer fails quite often intermittently --- Key: HBASE-4621 URL: https://issues.apache.org/jira/browse/HBASE-4621 Project: HBase Issue Type: Bug Reporter: Akash Ashok -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4622) Remove trivial 0.90 deprecated code from 0.92 and trunk.
Remove trivial 0.90 deprecated code from 0.92 and trunk. Key: HBASE-4622 URL: https://issues.apache.org/jira/browse/HBASE-4622 Project: HBase Issue Type: Sub-task Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4553) The update of .tableinfo is not atomic; we remove then rename
[ https://issues.apache.org/jira/browse/HBASE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13130956#comment-13130956 ] Akash Ashok commented on HBASE-4553: https://issues.apache.org/jira/browse/HBASE-4621 opened for the test case fix. I shall modify and submit the patch there. Thanks
The update of .tableinfo is not atomic; we remove then rename - Key: HBASE-4553 URL: https://issues.apache.org/jira/browse/HBASE-4553 Project: HBase Issue Type: Task Reporter: stack Priority: Critical Fix For: 0.92.0 Attachments: 4553-v5.txt, HBase-4553-TestAvroServer.patch
This comes out of HBASE-4547. The rename in 0.20 HDFS fails if the file already exists. In 0.20+ it's better, but there are still 'some' issues if there is an existing reader when the file is renamed. This issue is about fixing this (though we depend on the fix first being in HDFS).
[jira] [Created] (HBASE-4623) Remove @deprecated Scan methods in 0.90 from TRUNK and 0.92
Remove @deprecated Scan methods in 0.90 from TRUNK and 0.92 --- Key: HBASE-4623 URL: https://issues.apache.org/jira/browse/HBASE-4623 Project: HBase Issue Type: Sub-task Reporter: Jonathan Hsieh -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4624) Remove and convert @deprecated RemoteExceptionHandler.decodeRemoteException calls
Remove and convert @deprecated RemoteExceptionHandler.decodeRemoteException calls - Key: HBASE-4624 URL: https://issues.apache.org/jira/browse/HBASE-4624 Project: HBase Issue Type: Sub-task Reporter: Jonathan Hsieh -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4625) Convert @deprecated HBaseTestCase tests in 0.90 into JUnit4 style tests in 0.92 and TRUNK
Convert @deprecated HBaseTestCase tests in 0.90 into JUnit4 style tests in 0.92 and TRUNK - Key: HBASE-4625 URL: https://issues.apache.org/jira/browse/HBASE-4625 Project: HBase Issue Type: Sub-task Reporter: Jonathan Hsieh This class has 47 references, so it is being separated out into its own subtask.
[jira] [Updated] (HBASE-4536) Allow CF to retain deleted rows
[ https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-4536: - Attachment: 4536-v15.txt Accommodate HBASE-4585. (3 line change) This is the patch I would like to commit to trunk.
Allow CF to retain deleted rows --- Key: HBASE-4536 URL: https://issues.apache.org/jira/browse/HBASE-4536 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 0.92.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0 Attachments: 4536-v15.txt
Parent allows for a cluster to retain rows for a TTL or keep a minimum number of versions. However, if a client deletes a row, all versions older than the delete tombstone will be removed at the next major compaction (and even at memstore flush - see HBASE-4241). There should be a way to retain those versions to guard against software error. I see two options here:
1. Add a new flag to HColumnDescriptor. Something like RETAIN_DELETED.
2. Fold this into the parent change. I.e. keep minimum-number-of-versions of versions even past the delete marker.
#1 would allow for more flexibility. #2 comes somewhat naturally with parent (from a user viewpoint).
Comments? Any other options?
[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows
[ https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131004#comment-13131004 ] jirapos...@reviews.apache.org commented on HBASE-4536: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review2677 ---

http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java https://reviews.apache.org/r/2178/#comment6040
If KEEP_DELETED_CELLS is true then what will be the behavior of GETs? Will gets be able to reach the deleted cells as well? (If both gets and deletes are able to fetch the deleted cells then why keep delete markers?) Can the value of KEEP_DELETED_CELLS for a column family be dynamically altered in hbase-shell?

http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java https://reviews.apache.org/r/2178/#comment6038
Why only ExplicitColumnTracker? A delete family marker that doesn't have any column qualifier shouldn't be passed to any kind of column tracker, right?

http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java https://reviews.apache.org/r/2178/#comment6037
Just as in checkColumns(), should there be an assert for this method also that a delete marker should never call it? A delete family marker doesn't have the column qualifier.

http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java https://reviews.apache.org/r/2178/#comment6039
If keepDeletedCells is true and retainDeletesInOutput is false then a delete-marker-kv can reach here and fail the assert in checkColumn()?

- Prakash
On 2011-10-18 21:43:38, Lars Hofhansl wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2178/ bq. --- bq. bq. (Updated 2011-10-18 21:43:38) bq. bq. bq.
Review request for hbase, Ted Yu and Jonathan Gray. bq. bq. bq. Summary bq. --- bq. bq. HBase timerange Gets and Scans allow to do timetravel in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. bq. This did not work for deletes, however. Deletes would always mask all puts in the past. bq. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. bq. These rows are still subject to TTL and/or VERSIONS. bq. bq. This changes the following: bq. 1. There is a new flag on HColumnDescriptor enabling that behavior. bq. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. bq. 3. Do not unconditionally collect all deleted rows during a compaction. bq. 4. Allow a raw Scan, which retrieves all delete markers and deleted rows. bq. bq. The change is small'ish, but the logic is intricate, so please review carefully. bq. bq. bq. This addresses bug HBASE-4536. bq. https://issues.apache.org/jira/browse/HBASE-4536 bq. bq. bq. Diffs bq. - bq. bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362 bq. 
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362 bq.
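Point 2 of the review summary above (a time-range get/scan may see a put hidden by a delete marker if the time range does not include the marker) can be illustrated with a deliberately simplified rule. The class and method names are hypothetical, the half-open [minStamp, maxStamp) range is an assumption, and the sketch ignores TTL, VERSIONS and the different delete marker types; it is not the ScanQueryMatcher logic from the patch.

```java
/** Hypothetical, simplified visibility rule for retained deleted cells. */
final class DeletedCellVisibilitySketch {
    /** Assumes a half-open time range: minStamp inclusive, maxStamp exclusive. */
    static boolean inRange(long ts, long minStamp, long maxStamp) {
        return ts >= minStamp && ts < maxStamp;
    }

    /**
     * A put at putTs is visible if it falls in the range and is not masked by a
     * delete marker that the range also includes. A marker masks puts at or
     * before its own timestamp.
     */
    static boolean isVisible(long putTs, long deleteTs, long minStamp, long maxStamp) {
        if (!inRange(putTs, minStamp, maxStamp)) {
            return false;
        }
        boolean coveredByDelete = deleteTs >= putTs;
        boolean markerInRange = inRange(deleteTs, minStamp, maxStamp);
        return !(coveredByDelete && markerInRange);
    }
}
```

For example, with a put at timestamp 10 and a delete marker at 20, a range of [0, 15) sees the put (the delete had not happened yet from that viewpoint), while [0, 30) does not.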
[jira] [Created] (HBASE-4626) Filters unnecessarily copy byte arrays...
Filters unnecessarily copy byte arrays... - Key: HBASE-4626 URL: https://issues.apache.org/jira/browse/HBASE-4626 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Just looked at SingleCol and ValueFilter... And on every column compared they create a copy of the column and/or value portion of the KV. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4523) dfs.support.append config should be present in the hadoop configs, we should remove them from hbase so the user is not confused when they see the config in 2 places
[ https://issues.apache.org/jira/browse/HBASE-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131018#comment-13131018 ] Ted Yu commented on HBASE-4523: --- +1 on patch dfs.support.append config should be present in the hadoop configs, we should remove them from hbase so the user is not confused when they see the config in 2 places Key: HBASE-4523 URL: https://issues.apache.org/jira/browse/HBASE-4523 Project: HBase Issue Type: Bug Affects Versions: 0.90.4, 0.92.0 Reporter: Arpit Gupta Assignee: Eric Yang Fix For: 0.90.4, 0.92.0 Attachments: HBASE-4523.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4070) [Coprocessors] Improve region server metrics to report loaded coprocessors to master
[ https://issues.apache.org/jira/browse/HBASE-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131025#comment-13131025 ] jirapos...@reviews.apache.org commented on HBASE-4070: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2029/#review2678 --- +1 on patch. Just an ontological question. src/main/java/org/apache/hadoop/hbase/HServerLoad.java https://reviews.apache.org/r/2029/#comment6041 Is this a 'load'? Or is this an attribute of the server? We used to have an HServerInfo. This looks like something that would belong in there rather than in HSL? src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java https://reviews.apache.org/r/2029/#comment6043 Should we be able to ask a regionserver what CPs it has loaded from HBaseAdmin (Maybe its already in here)? src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java https://reviews.apache.org/r/2029/#comment6044 Can we get here from HBaseAdmin - Michael On 2011-10-16 16:22:16, Eugene Koontz wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2029/ bq. --- bq. bq. (Updated 2011-10-16 16:22:16) bq. bq. bq. Review request for hbase and Mingjie Lai. bq. bq. bq. Summary bq. --- bq. bq. Proposed fix for HBASE-4070. bq. bq. bq. This addresses bug HBASE-4070. bq. https://issues.apache.org/jira/browse/HBASE-4070 bq. bq. bq. Diffs bq. - bq. 
bq.src/main/jamon/org/apache/hbase/tmpl/master/MasterStatusTmpl.jamon abeb850 bq.src/main/jamon/org/apache/hbase/tmpl/regionserver/RSStatusTmpl.jamon be6fceb bq.src/main/java/org/apache/hadoop/hbase/ClusterStatus.java 01bc1dd bq.src/main/java/org/apache/hadoop/hbase/HServerLoad.java 0c680e4 bq.src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 92c959c bq.src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 7d2f82e bq.src/main/java/org/apache/hadoop/hbase/master/HMaster.java 50b49a6 bq.src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java e2e694a bq.src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java eda5a9b bq. bq. Diff: https://reviews.apache.org/r/2029/diff bq. bq. bq. Testing bq. --- bq. bq. Two new tests : testRegionServerCoprocessorReported() and testMasterServerCoprocessorsReported() added to (existing) src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java. bq. bq. bq. Thanks, bq. bq. Eugene bq. bq. [Coprocessors] Improve region server metrics to report loaded coprocessors to master Key: HBASE-4070 URL: https://issues.apache.org/jira/browse/HBASE-4070 Project: HBase Issue Type: Improvement Affects Versions: 0.90.3 Reporter: Mingjie Lai Assignee: Eugene Koontz Attachments: HBASE-4070.patch, HBASE-4070.patch, HBASE-4070.patch, HBASE-4070.patch, master-web-ui.jpg, rs-status-web-ui.jpg HBASE-3512 is about listing loaded cp classes at shell. To make it more generic, we need a way to report this piece of information from region to master (or just at region server level). So later on, we can display the loaded class names at shell as well as web console. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4626) Filters unnecessarily copy byte arrays...
[ https://issues.apache.org/jira/browse/HBASE-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131047#comment-13131047 ] Ted Yu commented on HBASE-4626: --- Interesting finding. WritableByteArrayComparable should be enhanced with: {code} public int compareTo(byte[] buffer, int offset, int length) {code} Filters unnecessarily copy byte arrays... - Key: HBASE-4626 URL: https://issues.apache.org/jira/browse/HBASE-4626 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Just looked at SingleCol and ValueFilter... And on every column compared they create a copy of the column and/or value portion of the KV. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4627) Ability to specify a custom start/end to RegionSplitter
Ability to specify a custom start/end to RegionSplitter --- Key: HBASE-4627 URL: https://issues.apache.org/jira/browse/HBASE-4627 Project: HBase Issue Type: Improvement Affects Versions: 0.94.0 Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg HBASE-4489 changed the default endKey on HexStringSplit from 7FFF... to ... While this is correct, existing users of 0.90 RegionSplitter have 7FFF as the end key in their schema and the last region will not split properly under this new code. We need to let the user specify a custom start/end key range for when situations like this arise. Optimally, we should also write the start/end key in META so we could figure this out implicitly instead of requiring the user to explicitly specify it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
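A minimal sketch of what a user-supplied start/end range could mean for hex-string split points. The class and method names below are hypothetical and are not RegionSplitter's actual API; it assumes keys are hex strings of equal width and that split points are spaced evenly between the two bounds.

```java
import java.math.BigInteger;

// Hypothetical helper for illustration only; not part of HBase.
class HexRangeSplit {
    // Returns the (numRegions - 1) boundary keys that divide [firstKey, lastKey]
    // into numRegions roughly equal parts. Results are zero-padded, upper-case hex.
    static String[] split(String firstKey, String lastKey, int numRegions) {
        if (numRegions < 2) {
            throw new IllegalArgumentException("need at least 2 regions");
        }
        BigInteger first = new BigInteger(firstKey, 16);
        BigInteger last = new BigInteger(lastKey, 16);
        if (first.compareTo(last) >= 0) {
            throw new IllegalArgumentException("firstKey must sort before lastKey");
        }
        BigInteger range = last.subtract(first);
        int width = lastKey.length();
        String[] splits = new String[numRegions - 1];
        for (int i = 1; i < numRegions; i++) {
            // first + range * i / numRegions, using integer division
            BigInteger point = first.add(
                range.multiply(BigInteger.valueOf(i)).divide(BigInteger.valueOf(numRegions)));
            String hex = point.toString(16).toUpperCase();
            // Left-pad with zeros to the key width.
            splits[i - 1] = String.format("%" + width + "s", hex).replace(' ', '0');
        }
        return splits;
    }
}
```

With an explicit range, a table whose schema ends at 7FFFFFFF can be split as HexRangeSplit.split("00000000", "7FFFFFFF", 2) rather than against whatever default end key the running version uses.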
[jira] [Issue Comment Edited] (HBASE-4626) Filters unnecessarily copy byte arrays...
[ https://issues.apache.org/jira/browse/HBASE-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131047#comment-13131047 ] Ted Yu edited comment on HBASE-4626 at 10/19/11 9:56 PM: - Interesting finding. WritableByteArrayComparable should be enhanced with: {code} public int compareTo(byte[] buffer, int offset, int length) {code} which would delegate to this method in Bytes: {code} public static int compareTo(byte[] buffer1, int offset1, int length1, byte[] buffer2, int offset2, int length2) { {code} was (Author: yuzhih...@gmail.com): Interesting finding. WritableByteArrayComparable should be enhanced with: {code} public int compareTo(byte[] buffer, int offset, int length) {code} Filters unnecessarily copy byte arrays... - Key: HBASE-4626 URL: https://issues.apache.org/jira/browse/HBASE-4626 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Just looked at SingleCol and ValueFilter... And on every column compared they create a copy of the column and/or value portion of the KV. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
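The delegation suggested in the comment above could look roughly like the following stand-alone sketch. RangeComparator is a hypothetical stand-in for a WritableByteArrayComparable subclass, and compareRanges is a local reimplementation with the same parameter order as the Bytes method quoted above (lexicographic, bytes treated as unsigned), so that the example runs without HBase on the classpath.

```java
// Hypothetical stand-in for a comparator holding a fixed value to compare against.
class RangeComparator {
    private final byte[] value;

    RangeComparator(byte[] value) {
        this.value = value;
    }

    // Compares the stored value with buffer[offset, offset + length) without copying.
    int compareTo(byte[] buffer, int offset, int length) {
        return compareRanges(value, 0, value.length, buffer, offset, length);
    }

    // Lexicographic comparison of two byte ranges, treating each byte as unsigned.
    static int compareRanges(byte[] b1, int o1, int l1, byte[] b2, int o2, int l2) {
        int end1 = o1 + l1;
        int end2 = o2 + l2;
        for (int i = o1, j = o2; i < end1 && j < end2; i++, j++) {
            int a = b1[i] & 0xff;
            int b = b2[j] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        // All shared bytes are equal: the shorter range sorts first.
        return l1 - l2;
    }
}
```

A filter can then pass the KeyValue's backing array together with the value offset and length, instead of first copying the value into a fresh byte[].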
[jira] [Created] (HBASE-4628) Enhance Table Create Presplit Functionality within the HBase Shell
Enhance Table Create Presplit Functionality within the HBase Shell -- Key: HBASE-4628 URL: https://issues.apache.org/jira/browse/HBASE-4628 Project: HBase Issue Type: Improvement Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Currently, we allow the user to presplit in the HBase shell by explicitly listing the startkey of all the region shards that they want. Instead, we should provide the RegionSplitter functionality of choosing a split algorithm, followed by the number of splits that they want. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4508) Backport HBASE-3777 to 0.90 branch
[ https://issues.apache.org/jira/browse/HBASE-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131052#comment-13131052 ] jirapos...@reviews.apache.org commented on HBASE-4508: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2448/ --- Review request for hbase, Ted Yu and Bright Fulton. Summary --- A backport of HBASE-3777 to the 0.90 branch, preserving 0.90 connection-per-config behavior by introducing the (default true) hbase.connection.per.config config property. This addresses bug HBASE-4508. https://issues.apache.org/jira/browse/HBASE-4508 Diffs - src/main/java/org/apache/hadoop/hbase/HConstants.java 611b149 src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java 64cabdf src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java aabf136 src/main/java/org/apache/hadoop/hbase/client/HConnection.java ed2f554 src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java e6d1583 src/main/java/org/apache/hadoop/hbase/client/HTable.java cd5f167 src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 78cb3d9 src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java cfd86c9 src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java 80284bb src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java 686813e src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa src/main/java/org/apache/hadoop/hbase/master/HMaster.java 88a5c20 src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 923f891 src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 72ba8dc src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java 133da33 src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 85cbb9e src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 
b04aab6 src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java 06ac90b src/main/java/org/apache/hadoop/hbase/util/HMerge.java d727aa5 src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java ec9f029 src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java f277e6a src/main/resources/hbase-default.xml de2e91f src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java 910d68f src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 src/test/java/org/apache/hadoop/hbase/client/TestHCM.java 5d71d75 src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java 915cdf6 src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb Diff: https://reviews.apache.org/r/2448/diff Testing --- Thanks, Bright Backport HBASE-3777 to 0.90 branch -- Key: HBASE-4508 URL: https://issues.apache.org/jira/browse/HBASE-4508 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Bright Fulton Attachments: HBASE-4508.v1.patch, HBASE-4508.v2.patch, HBASE-4508.v3.patch, HBASE-4508.v4.patch See discussion here: http://search-hadoop.com/m/MJBId1aazTR1/backporting+HBASE-3777+to+0.90subj=backporting+HBASE+3777+to+0+90 Rocketfuel has been running 0.90.3 with HBASE-3777 since its resolution. They have 10 RS nodes , 1 Master and 1 Zookeeper Live writes and reads but super heavy on reads. Cache hit is pretty high. The qps on one of their data centers is 50K. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows
[ https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131082#comment-13131082 ] jirapos...@reviews.apache.org commented on HBASE-4536: -- bq. On 2011-10-19 21:16:42, Prakash Khemani wrote: bq. Thanks for the review Prakash. bq. On 2011-10-19 21:16:42, Prakash Khemani wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java, line 160 bq. https://reviews.apache.org/r/2178/diff/12/?file=50948#file50948line160 bq. bq. If KEEP_DELETED_CELLS is true then what will be the behavior of GETs? Will gets be able to reach the deleted cells as well? bq. bq. (If both gets and deletes are able to fetch the deleted cells then why keep delete markers?) bq. bq. Can the value of KEEP_DELETED_CELLS for a column family be dynamically altered in hbase-shell? Gets and Scans only see deleted rows when they use a time range that ends before the resp. delete marker. The idea is that by using timerange [0,T+1), you can query the system for the state it was in at time T. If that range includes the delete marker you won't see the deleted rows, if it does not you will. bq. On 2011-10-19 21:16:42, Prakash Khemani wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java, line 114 bq. https://reviews.apache.org/r/2178/diff/12/?file=50953#file50953line114 bq. bq. Why only ExplicitColumnTracker? A delete family marker that doesn't have any column qualifier shouldn't be passed to any kind of column tracker, right? During compactions the ScanWildcardColumnTracker is used, always. For the new raw scans I simply do not want to support ExplicitColumnTracker. There's a check that prevents a raw scan from adding any columns. Note that a raw scan is not the same as a time range scan/get. Jon G suggested that'd be a good thing to have, and it was easy to add. A raw scan sees everything. 
Including the delete markers, and could be used (for example) to do custom replication which includes delete markers and deleted rows. ScanWildcardColumnTracker has all the new logic to deal with delete markers and deleted rows. bq. On 2011-10-19 21:16:42, Prakash Khemani wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java, line 274 bq. https://reviews.apache.org/r/2178/diff/12/?file=50954#file50954line274 bq. bq. Just as in checkColumns(), should there be an assert for this method also that a delete marker should never call it? A delete family marker doesn't have the column qualifier. Delete marker should still be subject to this, so this is ok. bq. On 2011-10-19 21:16:42, Prakash Khemani wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java, line 297 bq. https://reviews.apache.org/r/2178/diff/12/?file=50954#file50954line297 bq. bq. If keepDeletedCells is true and retainDeletesInOutput is false then a delete-marker-kv can reach here and fail the assert in checkColumn()? No, because retainDeletesInOutput is only ever true for: a minor compaction, a memstore flush, and a raw scan. In all cases no columns can be specified. - Lars --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review2677 --- On 2011-10-18 21:43:38, Lars Hofhansl wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2178/ bq. --- bq. bq. (Updated 2011-10-18 21:43:38) bq. bq. bq. Review request for hbase, Ted Yu and Jonathan Gray. bq. bq. bq. Summary bq. --- bq. bq. HBase timerange Gets and Scans allow to do timetravel in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. bq. This did not work for deletes, however. Deletes would always mask all puts in the past. bq. 
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. bq. These rows are still subject to TTL and/or VERSIONS. bq. bq. This changes the following: bq. 1. There is a new flag on HColumnDescriptor enabling that behavior. bq. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. bq. 3. Do not unconditionally collect all deleted rows during a compaction. bq. 4. Allow a raw Scan, which retrieves all delete markers and deleted rows. bq. bq. The change is
[jira] [Commented] (HBASE-4626) Filters unnecessarily copy byte arrays...
[ https://issues.apache.org/jira/browse/HBASE-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131086#comment-13131086 ] Lars Hofhansl commented on HBASE-4626: -- Exactly! There are a bunch of subclasses of WritableByteArrayComparable to deal with, so I refactored some code slightly. The only funky thing is the NullComparator. All other comparators would fail with an NPE if a null byte[] is passed (even before my change); so for this one comparator I have the compareTo(byte[],int,int) method throw an UnsupportedOperationException. I have a patch. Running tests now. Filters unnecessarily copy byte arrays... - Key: HBASE-4626 URL: https://issues.apache.org/jira/browse/HBASE-4626 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Just looked at SingleCol and ValueFilter... And on every column compared they create a copy of the column and/or value portion of the KV. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4508) Backport HBASE-3777 to 0.90 branch
[ https://issues.apache.org/jira/browse/HBASE-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4508: - Fix Version/s: 0.90.5 Backport HBASE-3777 to 0.90 branch -- Key: HBASE-4508 URL: https://issues.apache.org/jira/browse/HBASE-4508 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Bright Fulton Fix For: 0.90.5 Attachments: HBASE-4508.v1.patch, HBASE-4508.v2.patch, HBASE-4508.v3.patch, HBASE-4508.v4.patch See discussion here: http://search-hadoop.com/m/MJBId1aazTR1/backporting+HBASE-3777+to+0.90subj=backporting+HBASE+3777+to+0+90 Rocketfuel has been running 0.90.3 with HBASE-3777 since its resolution. They have 10 RS nodes , 1 Master and 1 Zookeeper Live writes and reads but super heavy on reads. Cache hit is pretty high. The qps on one of their data centers is 50K. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows
[ https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131121#comment-13131121 ] jirapos...@reviews.apache.org commented on HBASE-4536: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review2684 --- Very awesome stuff, Lars. I like the new scan info/enum stuff and lots of good comments. The code in SQM.match() is a bit scary but you've done a nice job at keeping it clean and documenting the heck out of it. I was able to grok it, mostly, in not much time. It would be good to have some documentation somewhere about how exactly this is used. The raw scanner is pretty sweet but it's not exactly clear to me how i set those params without reading code (and converting from the logic in code back to the configs available is tricky). I'm +1 (trunk). But maybe wait for someone else who has reviewed this more for the final go ahead. - Jonathan On 2011-10-18 21:43:38, Lars Hofhansl wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2178/ bq. --- bq. bq. (Updated 2011-10-18 21:43:38) bq. bq. bq. Review request for hbase, Ted Yu and Jonathan Gray. bq. bq. bq. Summary bq. --- bq. bq. HBase timerange Gets and Scans allow to do timetravel in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. bq. This did not work for deletes, however. Deletes would always mask all puts in the past. bq. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. bq. These rows are still subject to TTL and/or VERSIONS. bq. bq. This changes the following: bq. 1. There is a new flag on HColumnDescriptor enabling that behavior. bq. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. bq. 3. 
Do not unconditionally collect all deleted rows during a compaction. bq. 4. Allow a raw Scan, which retrieves all delete markers and deleted rows. bq. bq. The change is small'ish, but the logic is intricate, so please review carefully. bq. bq. bq. This addresses bug HBASE-4536. bq. https://issues.apache.org/jira/browse/HBASE-4536 bq. bq. bq. Diffs bq. - bq. bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362 bq.http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362 bq. 
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362 bq.
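Point 2 of the quoted summary (a timerange scan sees rows hidden by a delete marker only if the range does not include the marker) can be modelled in a few lines. This is a toy model, not the ScanQueryMatcher logic from the patch: the class and method names are hypothetical, there is a single put and a single delete marker, and it assumes a marker masks puts whose timestamp is less than or equal to its own.

```java
// Toy model for illustration only; not code from the HBASE-4536 patch.
class KeepDeletedSketch {
    // Returns true if a put written at putTs is visible to a scan over the
    // half-open time range [minTs, maxTs), given one delete marker at deleteTs
    // and a column family that retains deleted cells.
    static boolean visible(long putTs, long deleteTs, long minTs, long maxTs) {
        boolean putInRange = putTs >= minTs && putTs < maxTs;
        // The marker only takes effect when the scan's range includes it.
        boolean markerInRange = deleteTs >= minTs && deleteTs < maxTs;
        // Assumption: a marker masks puts at or before its own timestamp.
        boolean masked = markerInRange && putTs <= deleteTs;
        return putInRange && !masked;
    }
}
```

For a put at time 10 and a delete at time 20, a scan over [0, 16) asks for the state at time 15 and still sees the put, while a scan over [0, 21) includes the marker and does not.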
[jira] [Commented] (HBASE-4070) [Coprocessors] Improve region server metrics to report loaded coprocessors to master
[ https://issues.apache.org/jira/browse/HBASE-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131128#comment-13131128 ] jirapos...@reviews.apache.org commented on HBASE-4070: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2029/#review2683 --- src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java https://reviews.apache.org/r/2029/#comment6060 It's a bit complicated; to get the coprocessors for a particular regionserver R, you do: 1. HBaseAdmin object.getClusterStatus() = ClusterStatus object 2. ClusterStatus object.getServerInfo() = Set<HServerLoad> 3. Find R amongst the set returned in step 2 (actually ClusterStatus stores the servers as a map keyed on the server name, so if you know the server's name, you don't need to iterate through the set). 4. HServerLoad object.getCoprocessors() = String[]. src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java https://reviews.apache.org/r/2029/#comment6061 We could, but RSStatusTmpl doesn't use HBaseAdmin, but rather HRegionServer as the source of all of the information that it displays. - Eugene On 2011-10-16 16:22:16, Eugene Koontz wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2029/ bq. --- bq. bq. (Updated 2011-10-16 16:22:16) bq. bq. bq. Review request for hbase and Mingjie Lai. bq. bq. bq. Summary bq. --- bq. bq. Proposed fix for HBASE-4070. bq. bq. bq. This addresses bug HBASE-4070. bq. https://issues.apache.org/jira/browse/HBASE-4070 bq. bq. bq. Diffs bq. - bq. 
bq.src/main/jamon/org/apache/hbase/tmpl/master/MasterStatusTmpl.jamon abeb850 bq.src/main/jamon/org/apache/hbase/tmpl/regionserver/RSStatusTmpl.jamon be6fceb bq.src/main/java/org/apache/hadoop/hbase/ClusterStatus.java 01bc1dd bq.src/main/java/org/apache/hadoop/hbase/HServerLoad.java 0c680e4 bq.src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 92c959c bq.src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 7d2f82e bq.src/main/java/org/apache/hadoop/hbase/master/HMaster.java 50b49a6 bq.src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java e2e694a bq.src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java eda5a9b bq. bq. Diff: https://reviews.apache.org/r/2029/diff bq. bq. bq. Testing bq. --- bq. bq. Two new tests : testRegionServerCoprocessorReported() and testMasterServerCoprocessorsReported() added to (existing) src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java. bq. bq. bq. Thanks, bq. bq. Eugene bq. bq. [Coprocessors] Improve region server metrics to report loaded coprocessors to master Key: HBASE-4070 URL: https://issues.apache.org/jira/browse/HBASE-4070 Project: HBase Issue Type: Improvement Affects Versions: 0.90.3 Reporter: Mingjie Lai Assignee: Eugene Koontz Attachments: HBASE-4070.patch, HBASE-4070.patch, HBASE-4070.patch, HBASE-4070.patch, master-web-ui.jpg, rs-status-web-ui.jpg HBASE-3512 is about listing loaded cp classes at shell. To make it more generic, we need a way to report this piece of information from region to master (or just at region server level). So later on, we can display the loaded class names at shell as well as web console. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
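The lookup described in the review reply above (cluster status, then the load record for one region server, then its coprocessor names) can be sketched with stand-in types. ClusterStatusSketch and ServerLoadSketch are hypothetical and only mirror the shape of the steps; they are not the real ClusterStatus and HServerLoad classes, whose exact signatures should be taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the per-server load record; NOT the real HServerLoad.
class ServerLoadSketch {
    private final String[] coprocessors;

    ServerLoadSketch(String... coprocessors) {
        this.coprocessors = coprocessors;
    }

    String[] getCoprocessors() {
        return coprocessors;
    }
}

// Stand-in for the cluster-wide status; NOT the real ClusterStatus.
class ClusterStatusSketch {
    // Keyed on server name, so a known name needs no iteration over all servers.
    private final Map<String, ServerLoadSketch> loadByServer = new HashMap<>();

    void report(String serverName, ServerLoadSketch load) {
        loadByServer.put(serverName, load);
    }

    ServerLoadSketch getLoad(String serverName) {
        return loadByServer.get(serverName);
    }
}

class CoprocessorLookup {
    // Returns the coprocessor class names reported by one region server,
    // or an empty array if that server is unknown to the status object.
    static String[] coprocessorsFor(ClusterStatusSketch status, String serverName) {
        ServerLoadSketch load = status.getLoad(serverName);
        return load == null ? new String[0] : load.getCoprocessors();
    }
}
```

Keying the map on the server name is what makes the "find R" step a single lookup rather than a scan of the whole set.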
[jira] [Updated] (HBASE-4486) Improve Javadoc for HTableDescriptor
[ https://issues.apache.org/jira/browse/HBASE-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4486: - Resolution: Fixed Fix Version/s: 0.92.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Nice. Committed to 0.92 branch and trunk. Thanks for the patch Akash. Improve Javadoc for HTableDescriptor Key: HBASE-4486 URL: https://issues.apache.org/jira/browse/HBASE-4486 Project: HBase Issue Type: Improvement Components: client, documentation Reporter: Akash Ashok Assignee: Akash Ashok Priority: Minor Fix For: 0.92.0 Attachments: HBase-4486-v2.patch, HBase-4486-v3.patch, HBase-4486.patch, HTableDescriptor-v2.html, HTableDescriptor.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4629) enable automated patch testing for hbase
enable automated patch testing for hbase Key: HBASE-4629 URL: https://issues.apache.org/jira/browse/HBASE-4629 Project: HBase Issue Type: New Feature Reporter: Giridharan Kesavan enable jenkins automated patch testing for hbase project -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4629) enable automated patch testing for hbase
[ https://issues.apache.org/jira/browse/HBASE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131142#comment-13131142 ] Giridharan Kesavan commented on HBASE-4629: --- Can someone with jira Admin privilege please assign this jira to me? enable automated patch testing for hbase Key: HBASE-4629 URL: https://issues.apache.org/jira/browse/HBASE-4629 Project: HBase Issue Type: New Feature Reporter: Giridharan Kesavan enable jenkins automated patch testing for hbase project -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-4629) enable automated patch testing for hbase
[ https://issues.apache.org/jira/browse/HBASE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon reassigned HBASE-4629: -- Assignee: Giridharan Kesavan enable automated patch testing for hbase Key: HBASE-4629 URL: https://issues.apache.org/jira/browse/HBASE-4629 Project: HBase Issue Type: New Feature Reporter: Giridharan Kesavan Assignee: Giridharan Kesavan enable jenkins automated patch testing for hbase project -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4430) Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass
[ https://issues.apache.org/jira/browse/HBASE-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131162#comment-13131162 ] Li Pi commented on HBASE-4430: -- This new patch synchronizes things a bit at the SingleSizeCache level. I do leave the copy to Slab, along with reads, unsynchronized, so it should give a performance advantage over Todd's super simplified patch. If anyone was wondering, the race that occurred with the last patch worked like this. Steady State: Item A is cached. Thread B: Starts to cache something, starts eviction of Item A; the eviction listener is called, but has not completed yet. Thread A: begins to recache item A, notices an entry for A already. Thread B: gets around to evicting the entry for A; as it finishes eviction, it wakes all the sleeping threads up so they have a chance to recache. Thread A: continues onwards, sleeps the thread, and waits for something to wake it up. But B has already finished, and called notifier.notifyAll(); End result: A ends up sleeping indefinitely. Nothing notifies it. I got rid of the jumble in the end. This should make maintenance much easier. Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass --- Key: HBASE-4430 URL: https://issues.apache.org/jira/browse/HBASE-4430 Project: HBase Issue Type: Task Components: test Reporter: stack Assignee: Li Pi Fix For: 0.92.0 Attachments: HBase-4430.txt, TestSlabCache.trace -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
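The race described above is a lost wakeup: the check ("is A still being evicted?") and the wait happen in separate steps, so notifyAll() can fire in between. One common way to rule that out, shown here as a sketch and not as the code in the attached patch, is to do the check and the wait under the same monitor and to re-check the condition in a loop. EvictionGate and its methods are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; not the SingleSizeCache code from the patch.
class EvictionGate {
    // Keys whose eviction has started but not yet finished.
    private final Set<String> evicting = new HashSet<>();

    synchronized void beginEviction(String key) {
        evicting.add(key);
    }

    synchronized void finishEviction(String key) {
        evicting.remove(key);
        // Wake every waiter; each one re-checks its own key.
        notifyAll();
    }

    synchronized boolean isEvicting(String key) {
        return evicting.contains(key);
    }

    // Blocks until no eviction of key is in flight. Because the condition is
    // tested while holding the monitor, a finishEviction() that ran earlier is
    // observed as "nothing to wait for" instead of being missed.
    synchronized void awaitEvicted(String key) {
        while (evicting.contains(key)) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for " + key, e);
            }
        }
    }
}
```

If thread B finishes evicting before thread A reaches awaitEvicted, A returns immediately; if B finishes afterwards, A is woken by notifyAll() and the loop exits on the re-check.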
[jira] [Updated] (HBASE-4430) Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass
[ https://issues.apache.org/jira/browse/HBASE-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Pi updated HBASE-4430: - Status: Patch Available (was: Reopened) Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass --- Key: HBASE-4430 URL: https://issues.apache.org/jira/browse/HBASE-4430 Project: HBase Issue Type: Task Components: test Reporter: stack Assignee: Li Pi Fix For: 0.92.0 Attachments: HBase-4430.txt, TestSlabCache.trace -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4630) If you shutdown all RS an active master is never able to recover when RS come back online
If you shutdown all RS an active master is never able to recover when RS come back online - Key: HBASE-4630 URL: https://issues.apache.org/jira/browse/HBASE-4630 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Fix For: 0.92.1 I've been doing some isolated benchmarking of a single RS and can repeatedly trigger some craziness in the master if I shutdown the RS. It is never able to recover after bringing RSs back online. I seem to see different behavior across different branches / revisions of the 92 branch, but there does seem to be an issue in several of them. Putting against 0.92.1 so we don't hold up the release of 0.92. Should not be a blocker. Working on a unit test now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows
[ https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131168#comment-13131168 ] Lars Hofhansl commented on HBASE-4536: -- So we have three +1's now. Ted, Jonathan... and me :) Allow CF to retain deleted rows --- Key: HBASE-4536 URL: https://issues.apache.org/jira/browse/HBASE-4536 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 0.92.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0 Attachments: 4536-v15.txt Parent allows for a cluster to retain rows for a TTL or keep a minimum number of versions. However, if a client deletes a row, all versions older than the delete tombstone will be removed at the next major compaction (and even at memstore flush - see HBASE-4241). There should be a way to retain those versions to guard against software error. I see two options here: 1. Add a new flag to HColumnDescriptor. Something like RETAIN_DELETED. 2. Fold this into the parent change. I.e. keep minimum-number-of-versions of versions even past the delete marker. #1 would allow for more flexibility. #2 comes somewhat naturally with parent (from a user viewpoint) Comments? Any other options? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4626) Filters unnecessarily copy byte arrays...
[ https://issues.apache.org/jira/browse/HBASE-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-4626: - Attachment: 4626.txt Patch against trunk. Tests still running... On scans touching many rows with a filter, this should significantly cut down garbage. Filters unnecessarily copy byte arrays... - Key: HBASE-4626 URL: https://issues.apache.org/jira/browse/HBASE-4626 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 4626.txt Just looked at SingleCol and ValueFilter... And on every column compared they create a copy of the column and/or value portion of the KV. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows
[ https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131172#comment-13131172 ] Ted Yu commented on HBASE-4536: --- Reminder: document the (minVersions = maxVersions) check of HColumnDescriptor in the book. Allow CF to retain deleted rows --- Key: HBASE-4536 URL: https://issues.apache.org/jira/browse/HBASE-4536 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 0.92.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0 Attachments: 4536-v15.txt Parent allows for a cluster to retain rows for a TTL or keep a minimum number of versions. However, if a client deletes a row, all versions older than the delete tombstone will be removed at the next major compaction (and even at memstore flush - see HBASE-4241). There should be a way to retain those versions to guard against software error. I see two options here: 1. Add a new flag to HColumnDescriptor. Something like RETAIN_DELETED. 2. Fold this into the parent change. I.e. keep minimum-number-of-versions versions even past the delete marker. #1 would allow for more flexibility. #2 comes somewhat naturally with parent (from a user viewpoint). Comments? Any other options? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4430) Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass
[ https://issues.apache.org/jira/browse/HBASE-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131176#comment-13131176 ] stack commented on HBASE-4430: -- You have to rename the file SlabItemEvictionWatcher to SlabItemActionWatcher since you changed the name of the interface it contains (and it's strange to have a file named other than what it contains). How come we can get rid of the block of code in SlabCache? Similarly in onEviction? Good stuff, Li. Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass --- Key: HBASE-4430 URL: https://issues.apache.org/jira/browse/HBASE-4430 Project: HBase Issue Type: Task Components: test Reporter: stack Assignee: Li Pi Fix For: 0.92.0 Attachments: HBase-4430.txt, TestSlabCache.trace -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4430) Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass
[ https://issues.apache.org/jira/browse/HBASE-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131182#comment-13131182 ] Li Pi commented on HBASE-4430: -- Right, forgot about that bit. We can get rid of the block of code in SlabCache since I simply decided to synchronize things - as todd suggested. This felt easier than porting heapsize/metrics over to Todd's proposed patch. I got tired of one race condition fix adding in another race condition. This was added in by 4482, which fixed one. Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass --- Key: HBASE-4430 URL: https://issues.apache.org/jira/browse/HBASE-4430 Project: HBase Issue Type: Task Components: test Reporter: stack Assignee: Li Pi Fix For: 0.92.0 Attachments: HBase-4430.txt, TestSlabCache.trace -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4489) Better key splitting in RegionSplitter
[ https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131183#comment-13131183 ] Nicolas Spiegelberg commented on HBASE-4489: Code-wise, I think everything looks fine. I'm going to make a minor adjustment to the unit tests prior to commit. Namely: we should not add unit tests and then mark them @Ignore. Instead, remove these unit tests and add them back in with HBASE-4567. Better key splitting in RegionSplitter -- Key: HBASE-4489 URL: https://issues.apache.org/jira/browse/HBASE-4489 Project: HBase Issue Type: Improvement Affects Versions: 0.90.4 Reporter: Dave Revell Assignee: Dave Revell Fix For: 0.90.5 Attachments: HBASE-4489-branch0.90-v1.patch, HBASE-4489-branch0.90-v2.patch, HBASE-4489-branch0.90-v3.patch, HBASE-4489-trunk-v1.patch, HBASE-4489-trunk-v2.patch, HBASE-4489-trunk-v3.patch, HBASE-4489-trunk-v4.patch, HBASE-4489-trunk-v5.patch The RegionSplitter utility allows users to create a pre-split table from the command line or do a rolling split on an existing table. It supports pluggable split algorithms that implement the SplitAlgorithm interface. The only/default SplitAlgorithm is one that assumes keys fall in the range from ASCII string to ASCII string 7FFF. This is not a sane default, and seems useless to most users. Users are likely to be surprised by the fact that all the region splits occur in the byte range of ASCII characters. A better default split algorithm would be one that evenly divides the space of all bytes, which is what this patch does. Making a table with five regions would split at \x33\x33..., \x66\x66, \x99\x99..., \xCC\xCC..., and \xFF\xFF. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
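The idea of evenly dividing the space of all bytes can be sketched as follows. This is only an illustration, not the code from the HBASE-4489 patch: the class name `UniformSplitSketch`, the fixed `keyWidth` parameter, and the choice to scale against the largest `keyWidth`-byte value are all assumptions made for the example.

```java
import java.math.BigInteger;

// Hypothetical sketch: evenly spaced split keys over the space of fixed-width byte keys.
// Not taken from the HBASE-4489 patch; names and the rounding rule are assumptions.
final class UniformSplitSketch {

    /** Returns numRegions - 1 split keys of keyWidth bytes, evenly spaced over all byte values. */
    static byte[][] splitPoints(int numRegions, int keyWidth) {
        if (numRegions < 2 || keyWidth < 1) {
            throw new IllegalArgumentException("need at least 2 regions and 1 key byte");
        }
        // Largest unsigned value representable in keyWidth bytes, e.g. 0xFFFF for 2 bytes.
        BigInteger max = BigInteger.ONE.shiftLeft(8 * keyWidth).subtract(BigInteger.ONE);
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            BigInteger point = max.multiply(BigInteger.valueOf(i))
                                  .divide(BigInteger.valueOf(numRegions));
            splits[i - 1] = toFixedWidth(point, keyWidth);
        }
        return splits;
    }

    /** Converts a non-negative value to exactly width big-endian bytes. */
    static byte[] toFixedWidth(BigInteger value, int width) {
        byte[] raw = value.toByteArray(); // may carry a leading sign byte or be shorter than width
        byte[] out = new byte[width];
        int copy = Math.min(raw.length, width);
        System.arraycopy(raw, raw.length - copy, out, width - copy, copy);
        return out;
    }

    /** Renders a key in the \xNN notation used in the issue description. */
    static String toHex(byte[] key) {
        StringBuilder sb = new StringBuilder();
        for (byte b : key) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (byte[] split : splitPoints(5, 2)) {
            System.out.println(toHex(split));
        }
    }
}
```

With five regions and two-byte keys this sketch yields the four boundaries \x33\x33, \x66\x66, \x99\x99 and \xCC\xCC, which matches the pattern quoted in the description; how the actual patch treats the final \xFF\xFF boundary is not shown here.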
[jira] [Commented] (HBASE-4459) HbaseObjectWritable code is a byte, we will eventually run out of codes
[ https://issues.apache.org/jira/browse/HBASE-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131185#comment-13131185 ] stack commented on HBASE-4459: -- +1 HbaseObjectWritable code is a byte, we will eventually run out of codes --- Key: HBASE-4459 URL: https://issues.apache.org/jira/browse/HBASE-4459 Project: HBase Issue Type: Bug Components: io Reporter: Jonathan Gray Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.92.0 Attachments: 4459-v2.patch.txt, 4459.txt There are about 90 classes/codes in HbaseObjectWritable currently and Byte.MAX_VALUE is 127. In addition, anyone wanting to add custom classes but not break compatibility might want to leave a gap before using codes and that's difficult in such limited space. Eventually we should get rid of this pattern that makes compatibility difficult (better client/server protocol handshake) but we should probably at least bump this to a short for 0.94. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
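To make the size constraint in the description concrete, here is a small sketch of what a one-byte class code can hold compared with a two-byte (short) code. It does not reproduce HbaseObjectWritable; `ClassCodeWidthSketch` and its methods are hypothetical names used only for this illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: compares a one-byte class code with a two-byte (short) class code.
// This is not HbaseObjectWritable; it only illustrates the available code space.
final class ClassCodeWidthSketch {

    /** Encodes a class code in a single signed byte; codes above 127 do not fit. */
    static byte[] encodeAsByte(int code) {
        if (code < 0 || code > Byte.MAX_VALUE) {
            throw new IllegalArgumentException("code does not fit in a byte: " + code);
        }
        return new byte[] { (byte) code };
    }

    /** Encodes a class code in two bytes (big-endian short); codes up to 32767 fit. */
    static byte[] encodeAsShort(int code) {
        if (code < 0 || code > Short.MAX_VALUE) {
            throw new IllegalArgumentException("code does not fit in a short: " + code);
        }
        return ByteBuffer.allocate(Short.BYTES).putShort((short) code).array();
    }

    /** Decodes a class code written by encodeAsShort. */
    static int decodeShort(byte[] encoded) {
        return ByteBuffer.wrap(encoded).getShort();
    }

    public static void main(String[] args) {
        System.out.println("byte codes available:  0.." + Byte.MAX_VALUE);
        System.out.println("short codes available: 0.." + Short.MAX_VALUE);
        System.out.println("round trip of 300: " + decodeShort(encodeAsShort(300)));
    }
}
```

The trade-off in this sketch is one extra byte per encoded object in exchange for room to leave gaps between code ranges, which is the compatibility concern raised in the description.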
[jira] [Updated] (HBASE-4430) Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass
[ https://issues.apache.org/jira/browse/HBASE-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Pi updated HBASE-4430: - Attachment: HBase-4430v2.txt Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass --- Key: HBASE-4430 URL: https://issues.apache.org/jira/browse/HBASE-4430 Project: HBase Issue Type: Task Components: test Reporter: stack Assignee: Li Pi Fix For: 0.92.0 Attachments: HBase-4430.txt, HBase-4430v2.txt, TestSlabCache.trace -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows
[ https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131188#comment-13131188 ] Jonathan Gray commented on HBASE-4536: -- I'm at +0.5. Add just a bit more high-level, config-level doc somewhere and I'm a strong +1... :) Allow CF to retain deleted rows --- Key: HBASE-4536 URL: https://issues.apache.org/jira/browse/HBASE-4536 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 0.92.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0 Attachments: 4536-v15.txt Parent allows for a cluster to retain rows for a TTL or keep a minimum number of versions. However, if a client deletes a row, all versions older than the delete tombstone will be removed at the next major compaction (and even at memstore flush - see HBASE-4241). There should be a way to retain those versions to guard against software error. I see two options here: 1. Add a new flag to HColumnDescriptor. Something like RETAIN_DELETED. 2. Fold this into the parent change. I.e. keep minimum-number-of-versions versions even past the delete marker. #1 would allow for more flexibility. #2 comes somewhat naturally with parent (from a user viewpoint). Comments? Any other options? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4430) Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass
[ https://issues.apache.org/jira/browse/HBASE-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131189#comment-13131189 ] Ted Yu commented on HBASE-4430: --- Minor comment, in SlabItemActionWatcher.java: {code} * This is called as a callback when an item is removed from a SingleSizeCache {code} I think it should read ' from a SlabCache' Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass --- Key: HBASE-4430 URL: https://issues.apache.org/jira/browse/HBASE-4430 Project: HBase Issue Type: Task Components: test Reporter: stack Assignee: Li Pi Fix For: 0.92.0 Attachments: HBase-4430.txt, HBase-4430v2.txt, TestSlabCache.trace -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4532) Avoid top row seek by dedicated bloom filter for delete family bloom filter
[ https://issues.apache.org/jira/browse/HBASE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131190#comment-13131190 ] jirapos...@reviews.apache.org commented on HBASE-4532: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2393/ --- (Updated 2011-10-20 00:08:14.459108) Review request for hbase, Dhruba Borthakur, Michael Stack, Mikhail Bautin, Pritam Damania, Prakash Khemani, Amitanand Aiyer, Kannan Muthukkaruppan, Jerry Chen, Liyin Tang, Karthik Ranganathan, and Nicolas Spiegelberg. Changes --- Thanks for Kannan's review. Updated the diff to address Kannan's comments. Summary --- HBASE-4469 avoids the top row seek operation if the row-col bloom filter is enabled. This jira tries to avoid the top row seek in all cases by creating a dedicated bloom filter only for delete family. The only subtle case is when we are interested in the top row with an empty column. For example, we are interested in row1/cf1:/1/put, so we seek to the top row: row1/cf1:/MAX_TS/MAXIMUM. The delete family bloom filter will say there is NO delete family, so the scanner will avoid the top row seek and return a fake kv, which is the last kv for this row (createLastOnRowCol). In this way, we have already missed the real kv we are interested in. The solution is to disable this optimization when we are trying to GET/SCAN a row with an empty column. This patch is rebased on 0.89-fb, but it should be the same for apache-trunk as well. I will submit the patch for apache-trunk later. This addresses bug HBASE-4532. 
https://issues.apache.org/jira/browse/HBASE-4532 Diffs (updated) - src/main/java/org/apache/hadoop/hbase/KeyValue.java 93538bb src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java 9a79a74 src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 5d9b518 src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java 6cf7cce src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1f78dd4 src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 3c34f86 src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java 2e1d23a src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java c4b60e9 src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 92070b3 src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java e4dfc2e src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java ebb360c src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 8814812 src/main/java/org/apache/hadoop/hbase/util/BloomFilterFactory.java fb4f2df src/test/java/org/apache/hadoop/hbase/regionserver/TestBlocksRead.java b8bcc65 src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java 48e9163 src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 0eca9b8 Diff: https://reviews.apache.org/r/2393/diff Testing --- Running all the unit tests now Thanks, Liyin Avoid top row seek by dedicated bloom filter for delete family bloom filter --- Key: HBASE-4532 URL: https://issues.apache.org/jira/browse/HBASE-4532 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: D27.1.patch, D27.1.patch HBASE-4469 avoids the top row seek operation if row-col bloom filter is enabled. This jira tries to avoid top row seek for all the cases by creating a dedicated bloom filter only for delete family The only subtle use case is when we are interested in the top row with empty column. For example, we are interested in row1/cf1:/1/put. 
So we seek to the top row: row1/cf1:/MAX_TS/MAXIMUM. And the delete family bloom filter will say there is NO delete family. Then it will avoid the top row seek and return a fake kv, which is the last kv for this row (createLastOnRowCol). In this way, we have already missed the real kv we are interested in. The solution for the above problem is to disable this optimization if we are trying to GET/SCAN a row with empty column. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4532) Avoid top row seek by dedicated bloom filter for delete family bloom filter
[ https://issues.apache.org/jira/browse/HBASE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131191#comment-13131191 ] jirapos...@reviews.apache.org commented on HBASE-4532: -- bq. On 2011-10-19 19:02:46, Kannan Muthukkaruppan wrote: bq. src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java, line 1058 bq. https://reviews.apache.org/r/2393/diff/2/?file=50558#file50558line1058 bq. bq. not clear where you are using this -1 state Even if there is no delete family bloom filter, the store file will still count the delete family key values and append that count to the HFile's file info. So when reading the file, we will know how many delete family kvs it contains. However, if the delete family field is missing from the file info, deleteFamilyCnt is set to -1, so the function passesDeleteFamilyBloomFilter won't take the count into account. bq. On 2011-10-19 19:02:46, Kannan Muthukkaruppan wrote: bq. src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java, line 1217 bq. https://reviews.apache.org/r/2393/diff/2/?file=50558#file50558line1217 bq. bq. To make sure I understand this... bq. bq. for HFileV1 case or for HFileV2 + but without this fix, I am guessing deleteFamilyCnt will be equal to -1, and the fact that it doesn't have a bloomFilter will cause it to return true. That looks fine. Just not obvious. Yes :) If there is a deleteFamilyCnt and it is 0, there is no need to check the bloom filter, and passesDeleteFamilyBloomFilter() returns false. That means there is no need to seek this store file for a delete family with the row. If deleteFamilyCnt was not initialized properly for some reason and is set to -1, the delete family bloom filter has to be checked. If there is no delete family bloom filter, the function returns true, which means it is possible that there is a delete family for this row. bq. On 2011-10-19 19:02:46, Kannan Muthukkaruppan wrote: bq. 
src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java, line 238 bq. https://reviews.apache.org/r/2393/diff/2/?file=50559#file50559line238 bq. bq. In case there is a deleteFamily kv, there are two sub-cases here... bq. bq. a) we have ROWCOL bloom (in which case there is no DeleteFamilyBloomFilter) and we want to use the ROWCOL bloom filter itself. bq. bq. b) we have a DeleteFamilyBloomFilter. bq. bq. I don't see us taking advantage of (a) like we used to earlier. Isn't this a regression for the ROWCOL bloom case? And if so, TestBlocksRead should have caught it, no? 1) Yes, it should use the ROWCOL bloom filter. That can also help warm up the ROWCOL bloom filter in the cache, or benefit from the block cache. I will update the code. 2) There is no regression for the ROWCOL bloom case, because we only count data block seeks. Whichever bloom filter is used (delete family or ROWCOL), it returns the same result, so it won't affect the decision whether to seek the store file or not. Please correct me if I am wrong :) bq. On 2011-10-19 19:02:46, Kannan Muthukkaruppan wrote: bq. src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java, line 111 bq. https://reviews.apache.org/r/2393/diff/2/?file=50560#file50560line111 bq. bq. isSeekToEmptyColumn and useBloom should be separate flags I think. bq. bq. For example, if the CF had ROWCOL bloom, and the query was looking for a row/0-length column, then with this change, we won't use the ROWCOL bloom filter even when it exists. bq. bq. Isn't it the case that we want to avoid using only the deleteFamilyBloom filter when isSeekToEmptyColumn is true? Agreed :) I will update the code to pass the scan query matcher to each store file scanner. This will also help with further optimization: when the store file scanner has more information about the matcher's status, it may be able to avoid more unnecessary seeks. - Liyin --- This is an automatically generated e-mail. 
To reply, visit: https://reviews.apache.org/r/2393/#review2615 --- On 2011-10-20 00:08:14, Liyin Tang wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2393/ bq. --- bq. bq. (Updated 2011-10-20 00:08:14) bq. bq. bq. Review request for hbase, Dhruba Borthakur, Michael Stack, Mikhail Bautin, Pritam Damania, Prakash Khemani, Amitanand Aiyer, Kannan Muthukkaruppan, Jerry Chen, Liyin Tang, Karthik Ranganathan, and Nicolas Spiegelberg. bq. bq. bq. Summary bq. --- bq. bq.
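The decision rules Liyin describes for passesDeleteFamilyBloomFilter can be summarized in a small sketch. This is an interpretation of the review discussion, not the code under review: the class `DeleteFamilyBloomSketch`, the `Predicate<String>` stand-in for the delete family bloom filter, and the parameter names are assumptions for the example.

```java
import java.util.function.Predicate;

// Hypothetical sketch of the decision described in the review thread.
// A Predicate<String> stands in for the delete family bloom filter; a real bloom
// filter may report false positives but never false negatives.
final class DeleteFamilyBloomSketch {

    /** Value used when the file info carries no delete family count (assumed from the thread). */
    static final long UNKNOWN_COUNT = -1;

    /**
     * Returns true if the store file may contain a delete family marker for the row
     * and therefore still has to be consulted; false if it can be skipped.
     */
    static boolean passesDeleteFamilyBloomFilter(long deleteFamilyCnt,
                                                 Predicate<String> deleteFamilyBloom,
                                                 String row) {
        if (deleteFamilyCnt == 0) {
            return false; // the file recorded zero delete family kvs: nothing to find
        }
        if (deleteFamilyBloom == null) {
            return true;  // count unknown or positive, but no filter: cannot rule it out
        }
        return deleteFamilyBloom.test(row); // let the filter decide
    }

    public static void main(String[] args) {
        System.out.println(passesDeleteFamilyBloomFilter(0, null, "row1"));
        System.out.println(passesDeleteFamilyBloomFilter(UNKNOWN_COUNT, null, "row1"));
    }
}
```

In this sketch, older files that lack the count (the -1 case) and have no filter always return true, which corresponds to the HFileV1 behaviour Kannan asks about in the thread.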
[jira] [Commented] (HBASE-4630) If you shutdown all RS an active master is never able to recover when RS come back online
[ https://issues.apache.org/jira/browse/HBASE-4630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131192#comment-13131192 ] Ted Yu commented on HBASE-4630: --- Could this issue be HBASE-4397 ? If you shutdown all RS an active master is never able to recover when RS come back online - Key: HBASE-4630 URL: https://issues.apache.org/jira/browse/HBASE-4630 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Fix For: 0.92.1 I've been doing some isolated benchmarking of a single RS and can repeatedly trigger some craziness in the master if I shutdown the RS. It is never able to recover after bringing RSs back online. I seem to see different behavior across different branches / revisions of the 92 branch, but there does seem to be an issue in several of them. Putting against 0.92.1 so we don't hold up the release of 0.92. Should not be a blocker. Working on a unit test now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4489) Better key splitting in RegionSplitter
[ https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Spiegelberg updated HBASE-4489: --- Resolution: Fixed Fix Version/s: (was: 0.90.5) 0.94.0 Status: Resolved (was: Patch Available) Putting in 0.94 since it is an improvement and we don't plan on a long release cycle between 0.92 and 0.94. Better key splitting in RegionSplitter -- Key: HBASE-4489 URL: https://issues.apache.org/jira/browse/HBASE-4489 Project: HBase Issue Type: Improvement Affects Versions: 0.90.4 Reporter: Dave Revell Assignee: Dave Revell Fix For: 0.94.0 Attachments: HBASE-4489-branch0.90-v1.patch, HBASE-4489-branch0.90-v2.patch, HBASE-4489-branch0.90-v3.patch, HBASE-4489-trunk-v1.patch, HBASE-4489-trunk-v2.patch, HBASE-4489-trunk-v3.patch, HBASE-4489-trunk-v4.patch, HBASE-4489-trunk-v5.patch The RegionSplitter utility allows users to create a pre-split table from the command line or do a rolling split on an existing table. It supports pluggable split algorithms that implement the SplitAlgorithm interface. The only/default SplitAlgorithm is one that assumes keys fall in the range from ASCII string to ASCII string 7FFF. This is not a sane default, and seems useless to most users. Users are likely to be surprised by the fact that all the region splits occur in the byte range of ASCII characters. A better default split algorithm would be one that evenly divides the space of all bytes, which is what this patch does. Making a table with five regions would split at \x33\x33..., \x66\x66, \x99\x99..., \xCC\xCC..., and \xFF\xFF. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4629) enable automated patch testing for hbase
[ https://issues.apache.org/jira/browse/HBASE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131195#comment-13131195 ] Ted Yu commented on HBASE-4629: --- I think Li Pi's patch for HBASE-4430 should be the first to try this feature out. FYI: Apache Jenkins builds on Ubuntu boxes. enable automated patch testing for hbase Key: HBASE-4629 URL: https://issues.apache.org/jira/browse/HBASE-4629 Project: HBase Issue Type: New Feature Reporter: Giridharan Kesavan Assignee: Giridharan Kesavan enable jenkins automated patch testing for hbase project -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows
[ https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131196#comment-13131196 ] Lars Hofhansl commented on HBASE-4536: -- @Ted, ah yes... Will do. @Jonathan, you mean in the HBase book? Sure. I'll add that and post a new patch. Allow CF to retain deleted rows --- Key: HBASE-4536 URL: https://issues.apache.org/jira/browse/HBASE-4536 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 0.92.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0 Attachments: 4536-v15.txt Parent allows for a cluster to retain rows for a TTL or keep a minimum number of versions. However, if a client deletes a row, all versions older than the delete tombstone will be removed at the next major compaction (and even at memstore flush - see HBASE-4241). There should be a way to retain those versions to guard against software error. I see two options here: 1. Add a new flag to HColumnDescriptor. Something like RETAIN_DELETED. 2. Fold this into the parent change. I.e. keep minimum-number-of-versions versions even past the delete marker. #1 would allow for more flexibility. #2 comes somewhat naturally with parent (from a user viewpoint). Comments? Any other options? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4626) Filters unnecessarily copy byte arrays...
[ https://issues.apache.org/jira/browse/HBASE-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-4626: - Component/s: regionserver Fix Version/s: 0.94.0 Targeting 0.94. But this is simple and could be put into 0.92 as well. Filters unnecessarily copy byte arrays... - Key: HBASE-4626 URL: https://issues.apache.org/jira/browse/HBASE-4626 Project: HBase Issue Type: Bug Components: regionserver Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0 Attachments: 4626.txt Just looked at SingleCol and ValueFilter... And on every column compared they create a copy of the column and/or value portion of the KV. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4430) Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass
[ https://issues.apache.org/jira/browse/HBASE-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131202#comment-13131202 ] Ted Yu commented on HBASE-4430: --- Please refresh the javadoc: {code} +/** + * Interface for objects that want to know when an eviction occurs. + * */ +interface SlabItemActionWatcher { {code} Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass --- Key: HBASE-4430 URL: https://issues.apache.org/jira/browse/HBASE-4430 Project: HBase Issue Type: Task Components: test Reporter: stack Assignee: Li Pi Fix For: 0.92.0 Attachments: HBase-4430.txt, HBase-4430v2.txt, TestSlabCache.trace -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4626) Filters unnecessarily copy byte arrays...
[ https://issues.apache.org/jira/browse/HBASE-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131203#comment-13131203 ] Lars Hofhansl commented on HBASE-4626: -- Also there are many more filters that do this kind of thing (all subclasses of CompareFilter). All fixed now. Filters unnecessarily copy byte arrays... - Key: HBASE-4626 URL: https://issues.apache.org/jira/browse/HBASE-4626 Project: HBase Issue Type: Bug Components: regionserver Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0 Attachments: 4626.txt Just looked at SingleCol and ValueFilter... And on every column compared they create a copy of the column and/or value portion of the KV. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4430) Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass
[ https://issues.apache.org/jira/browse/HBASE-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Pi updated HBASE-4430: - Attachment: HBase-4430v3.txt fixed comments Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass --- Key: HBASE-4430 URL: https://issues.apache.org/jira/browse/HBASE-4430 Project: HBase Issue Type: Task Components: test Reporter: stack Assignee: Li Pi Fix For: 0.92.0 Attachments: HBase-4430.txt, HBase-4430v2.txt, HBase-4430v3.txt, TestSlabCache.trace -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4626) Filters unnecessarily copy byte arrays...
[ https://issues.apache.org/jira/browse/HBASE-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131207#comment-13131207 ] Ted Yu commented on HBASE-4626: --- Nice work. Please fill in javadoc for return value: {code} + * @return + */ + public abstract int compareTo(byte [] value, int offset, int length); {code} Filters unnecessarily copy byte arrays... - Key: HBASE-4626 URL: https://issues.apache.org/jira/browse/HBASE-4626 Project: HBase Issue Type: Bug Components: regionserver Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0 Attachments: 4626.txt Just looked at SingleCol and ValueFilter... And on every column compared they create a copy of the column and/or value portion of the KV. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
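The point of a compareTo(byte[] value, int offset, int length) signature is that the comparison can run directly against a slice of a larger buffer, so no per-column copy is needed. A minimal sketch of such a comparison follows; it is not the code from 4626.txt, and the class `RangeCompareSketch`, the static method shape, and the unsigned lexicographic ordering are assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: compare an expected value against a slice of a larger buffer
// in place, instead of first copying the slice into a new byte array.
final class RangeCompareSketch {

    /**
     * Lexicographically compares expected with buffer[offset, offset + length),
     * treating bytes as unsigned.
     *
     * @return a negative number, zero, or a positive number if expected sorts
     *         before, equal to, or after the slice
     */
    static int compareTo(byte[] expected, byte[] buffer, int offset, int length) {
        int common = Math.min(expected.length, length);
        for (int i = 0; i < common; i++) {
            int a = expected[i] & 0xFF;
            int b = buffer[offset + i] & 0xFF;
            if (a != b) {
                return a - b;
            }
        }
        // All shared bytes are equal: the shorter sequence sorts first.
        return expected.length - length;
    }

    public static void main(String[] args) {
        // One backing array holding several fields, as a KeyValue-style buffer would.
        byte[] backing = "rowXvalueY".getBytes(StandardCharsets.US_ASCII);
        byte[] expected = "value".getBytes(StandardCharsets.US_ASCII);
        // Compare against the 5 bytes starting at offset 4 without allocating a copy.
        System.out.println(compareTo(expected, backing, 4, 5));
    }
}
```

Because the method only reads from the backing array, a scan that evaluates a filter on many rows allocates nothing per comparison, which is the garbage reduction the patch is aiming for.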
[jira] [Commented] (HBASE-4630) If you shutdown all RS an active master is never able to recover when RS come back online
[ https://issues.apache.org/jira/browse/HBASE-4630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131210#comment-13131210 ] Jonathan Gray commented on HBASE-4630: -- The stuff I'm seeing in the logs is different but it's probably the same or a related issue. I'm going to try and dig on this and will figure out whether to close this as a dupe or not. Thanks for the pointer, Ted. If you shutdown all RS an active master is never able to recover when RS come back online - Key: HBASE-4630 URL: https://issues.apache.org/jira/browse/HBASE-4630 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Fix For: 0.92.1 I've been doing some isolated benchmarking of a single RS and can repeatedly trigger some craziness in the master if I shutdown the RS. It is never able to recover after bringing RSs back online. I seem to see different behavior across different branches / revisions of the 92 branch, but there does seem to be an issue in several of them. Putting against 0.92.1 so we don't hold up the release of 0.92. Should not be a blocker. Working on a unit test now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4621) TestAvroServer fails quite often intermittently
[ https://issues.apache.org/jira/browse/HBASE-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131214#comment-13131214 ] Hudson commented on HBASE-4621: --- Integrated in HBase-TRUNK #2342 (See [https://builds.apache.org/job/HBase-TRUNK/2342/]) HBASE-4621 TestAvroServer fails quite often intermittently stack : Files : * /hbase/trunk/CHANGES.txt * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/avro/TestAvroServer.java TestAvroServer fails quite often intermittently --- Key: HBASE-4621 URL: https://issues.apache.org/jira/browse/HBASE-4621 Project: HBase Issue Type: Bug Reporter: Akash Ashok Assignee: Akash Ashok Fix For: 0.92.0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4460) Support running an embedded ThriftServer within a RegionServer
[ https://issues.apache.org/jira/browse/HBASE-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131225#comment-13131225 ]
jirapos...@reviews.apache.org commented on HBASE-4460:
--
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2410/#review2689
---
Looks good to me, one minor nit... +1
/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java
https://reviews.apache.org/r/2410/#comment6066
Just a code-readability comment - could we structure as:
    Get get = new Get(row);
    get.setTimeRange(Long.MIN_VALUE, timestamp);
    if (columns != null) {
      // ...
    }
    Result result = rs.get(regionName, get);
    return ThriftUtilities.rowResultFromHBase(result);
- Karthik
On 2011-10-17 22:37:43, Jonathan Gray wrote:
bq. ---
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/2410/
bq. ---
bq. (Updated 2011-10-17 22:37:43)
bq. Review request for hbase, Dhruba Borthakur, Gary Helmling, Michael Stack, and Andrew Purtell.
bq. Summary
bq. ---
bq. Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops altogether.
bq. This addresses bug HBASE-4460.
bq. https://issues.apache.org/jira/browse/HBASE-4460
bq. Diffs
bq. -
bq. /src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 1174376
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1174376
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java PRE-CREATION
bq. Diff: https://reviews.apache.org/r/2410/diff
bq. Testing
bq. ---
bq. Running this already on our hbase-92-based branch and running test site.
bq. Thanks,
bq. Jonathan
Support running an embedded ThriftServer within a RegionServer
--
Key: HBASE-4460
URL: https://issues.apache.org/jira/browse/HBASE-4460
Project: HBase
Issue Type: New Feature
Components: regionserver, thrift
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Attachments: HBASE-4460-v1.patch
Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops altogether. This JIRA is just about the embedded ThriftServer. Will open others for the rest.
[jira] [Commented] (HBASE-4460) Support running an embedded ThriftServer within a RegionServer
[ https://issues.apache.org/jira/browse/HBASE-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131226#comment-13131226 ] jirapos...@reviews.apache.org commented on HBASE-4460: -- bq. On 2011-10-20 00:34:31, Karthik Ranganathan wrote: bq. Looks good to me, one minor nit... +1 Thanks for reviews guys. I'm going to file a follow-up JIRA to deal with your comments (cleanup and optimize). - Jonathan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2410/#review2689 --- On 2011-10-17 22:37:43, Jonathan Gray wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2410/ bq. --- bq. bq. (Updated 2011-10-17 22:37:43) bq. bq. bq. Review request for hbase, Dhruba Borthakur, Gary Helmling, Michael Stack, and Andrew Purtell. bq. bq. bq. Summary bq. --- bq. bq. Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops all together. bq. bq. bq. This addresses bug HBASE-4460. bq. https://issues.apache.org/jira/browse/HBASE-4460 bq. bq. bq. Diffs bq. - bq. bq./src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 1174376 bq./src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1174376 bq. /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/2410/diff bq. bq. bq. Testing bq. --- bq. bq. Running this already on our hbase-92-based branch and running test site. bq. bq. bq. Thanks, bq. bq. Jonathan bq. bq. 
Support running an embedded ThriftServer within a RegionServer -- Key: HBASE-4460 URL: https://issues.apache.org/jira/browse/HBASE-4460 Project: HBase Issue Type: New Feature Components: regionserver, thrift Reporter: Jonathan Gray Assignee: Jonathan Gray Attachments: HBASE-4460-v1.patch Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops all together. This JIRA is just about the embedded ThriftServer. Will open others for the rest. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4631) Cleanup and optimize embedded HRegionThriftServer
Cleanup and optimize embedded HRegionThriftServer - Key: HBASE-4631 URL: https://issues.apache.org/jira/browse/HBASE-4631 Project: HBase Issue Type: Improvement Components: regionserver, thrift Affects Versions: 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Minor Fix For: 0.94.0 There were some good comments in the review of HBASE-4460. Cleanup the code and look at potential optimizations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-4460) Support running an embedded ThriftServer within a RegionServer
[ https://issues.apache.org/jira/browse/HBASE-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Gray resolved HBASE-4460. -- Resolution: Fixed Release Note: Run a ThriftServer embedded within a RegionServer process by setting hbase.regionserver.export.thrift to true. Hadoop Flags: Reviewed Committed to trunk. Thanks for the reviews LarsH and KarthikR. Opened HBASE-4631 to implement some of your suggestions. Support running an embedded ThriftServer within a RegionServer -- Key: HBASE-4460 URL: https://issues.apache.org/jira/browse/HBASE-4460 Project: HBase Issue Type: New Feature Components: regionserver, thrift Reporter: Jonathan Gray Assignee: Jonathan Gray Attachments: HBASE-4460-v1.patch Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops all together. This JIRA is just about the embedded ThriftServer. Will open others for the rest. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4460) Support running an embedded ThriftServer within a RegionServer
[ https://issues.apache.org/jira/browse/HBASE-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Gray updated HBASE-4460: - Fix Version/s: 0.94.0 New feature, only going to trunk. Support running an embedded ThriftServer within a RegionServer -- Key: HBASE-4460 URL: https://issues.apache.org/jira/browse/HBASE-4460 Project: HBase Issue Type: New Feature Components: regionserver, thrift Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.94.0 Attachments: HBASE-4460-v1.patch Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops all together. This JIRA is just about the embedded ThriftServer. Will open others for the rest. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4632) Offline meta rebuild should check if hbase is online before acting.
Offline meta rebuild should check if hbase is online before acting. --- Key: HBASE-4632 URL: https://issues.apache.org/jira/browse/HBASE-4632 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0, 0.90.5 Reporter: Jonathan Hsieh Filing this issue because this patch seems to work on the 0.90 branch tests but seems to time out on the 0.92/trunk branches. Depends on HBASE-4337 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-1183) New MR splitting algorithm and other new features need a way to split a key range in N chunks
[ https://issues.apache.org/jira/browse/HBASE-1183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131250#comment-13131250 ] Nicolas Spiegelberg commented on HBASE-1183: @jgray : just wondering what the {1,0} prepend in split() algorithm fixed? Looks like any integer conversion problems would be solved by issuing padTail() New MR splitting algorithm and other new features need a way to split a key range in N chunks - Key: HBASE-1183 URL: https://issues.apache.org/jira/browse/HBASE-1183 Project: HBase Issue Type: Improvement Components: util Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Minor Fix For: 0.20.0 Attachments: hbase-1183-v1.patch, hbase-1183-v2.patch, hbase-1183-v3.patch, hbase-1183-v4.patch For HBASE-1172 and other functionality coming soon, we need to be able to take a [start,stop) range and divide it into chunks. For example, we have 10 regions but want to run 30 maps. We need to divide each region into three key ranges for the start/stop of each scanner. Implementing using java.math.BigInteger Will also include a couple additional helpers in Bytes to make life easy. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-1183) New MR splitting algorithm and other new features need a way to split a key range in N chunks
[ https://issues.apache.org/jira/browse/HBASE-1183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131261#comment-13131261 ] Jonathan Gray commented on HBASE-1183: -- Wow, really got me thinking back. I honestly don't remember exactly why. We convert them to BigInteger so we can do: (stop - start) / numsplits = interval Something related to signed/unsigned? Reading the code it does seem okay. Good thing I didn't write a unit test. Are you seeing that it's broken in some way? I can spend a little more time looking at it if necessary. New MR splitting algorithm and other new features need a way to split a key range in N chunks - Key: HBASE-1183 URL: https://issues.apache.org/jira/browse/HBASE-1183 Project: HBase Issue Type: Improvement Components: util Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Minor Fix For: 0.20.0 Attachments: hbase-1183-v1.patch, hbase-1183-v2.patch, hbase-1183-v3.patch, hbase-1183-v4.patch For HBASE-1172 and other functionality coming soon, we need to be able to take a [start,stop) range and divide it into chunks. For example, we have 10 regions but want to run 30 maps. We need to divide each region into three key ranges for the start/stop of each scanner. Implementing using java.math.BigInteger Will also include a couple additional helpers in Bytes to make life easy. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-1183) New MR splitting algorithm and other new features need a way to split a key range in N chunks
[ https://issues.apache.org/jira/browse/HBASE-1183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131264#comment-13131264 ] Jonathan Gray commented on HBASE-1183: -- To clarify, I meant that the code seems like you don't need to prepend the {1,0} but I have some vague memory of needing it. New MR splitting algorithm and other new features need a way to split a key range in N chunks - Key: HBASE-1183 URL: https://issues.apache.org/jira/browse/HBASE-1183 Project: HBase Issue Type: Improvement Components: util Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Minor Fix For: 0.20.0 Attachments: hbase-1183-v1.patch, hbase-1183-v2.patch, hbase-1183-v3.patch, hbase-1183-v4.patch For HBASE-1172 and other functionality coming soon, we need to be able to take a [start,stop) range and divide it into chunks. For example, we have 10 regions but want to run 30 maps. We need to divide each region into three key ranges for the start/stop of each scanner. Implementing using java.math.BigInteger Will also include a couple additional helpers in Bytes to make life easy. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
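The arithmetic discussed in this HBASE-1183 thread, (stop - start) / numsplits = interval on BigInteger values, can be sketched as follows. This is not the code from the attached patches: the class and method names are hypothetical, and the sketch uses the BigInteger(int signum, byte[] magnitude) constructor to force a non-negative interpretation. Whether the {1,0} prepend in the real split() existed for that same signed/unsigned reason is only a guess in the thread, and it stays a guess here.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KeyRangeSplit {
    /** Right-pads a key with zero bytes so both ends have the same width. */
    static byte[] padTail(byte[] key, int width) {
        return Arrays.copyOf(key, width);
    }

    /** Renders a non-negative BigInteger as exactly 'width' big-endian bytes. */
    static byte[] toFixedWidth(BigInteger v, int width) {
        byte[] raw = v.toByteArray();       // may be shorter, or carry a leading 0 sign byte
        byte[] out = new byte[width];
        int copy = Math.min(raw.length, width);
        System.arraycopy(raw, raw.length - copy, out, width - copy, copy);
        return out;
    }

    /**
     * Returns numSplits + 1 boundaries covering [start, stop).
     * Boundaries repeat if the range holds fewer than numSplits distinct keys.
     */
    public static List<byte[]> split(byte[] start, byte[] stop, int numSplits) {
        int width = Math.max(start.length, stop.length);
        // signum = 1 treats the bytes as an unsigned magnitude, so 0xFF.. is not negative.
        BigInteger lo = new BigInteger(1, padTail(start, width));
        BigInteger hi = new BigInteger(1, padTail(stop, width));
        if (numSplits < 1 || lo.compareTo(hi) >= 0) {
            throw new IllegalArgumentException("need numSplits >= 1 and start < stop");
        }
        BigInteger interval = hi.subtract(lo).divide(BigInteger.valueOf(numSplits));
        List<byte[]> bounds = new ArrayList<>();
        for (int i = 0; i < numSplits; i++) {
            bounds.add(toFixedWidth(lo.add(interval.multiply(BigInteger.valueOf(i))), width));
        }
        bounds.add(toFixedWidth(hi, width)); // last chunk absorbs the division remainder
        return bounds;
    }

    public static void main(String[] args) {
        for (byte[] b : split(new byte[] {0x00}, new byte[] {0x1E}, 3)) {
            System.out.println(Arrays.toString(b));
        }
    }
}
```

With the plain BigInteger(byte[]) constructor, a key such as {0xFF, 0xFF} would be read as -1, which is one way the subtraction could go wrong without some form of sign handling.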
[jira] [Created] (HBASE-4633) Potential memory leak in client RPC timeout mechanism
Potential memory leak in client RPC timeout mechanism
-
Key: HBASE-4633
URL: https://issues.apache.org/jira/browse/HBASE-4633
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.90.3
Environment: HBase version: 0.90.3 + patches, Hadoop version: CDH3u0
Reporter: Shrijeet Paliwal
Relevant Jiras: https://issues.apache.org/jira/browse/HBASE-2937, https://issues.apache.org/jira/browse/HBASE-4003
We have been using the 'hbase.client.operation.timeout' knob introduced in HBASE-2937 for quite some time now. It helps us enforce SLA. We have two HBase clusters and two HBase client clusters. One of them is much busier than the other. We have seen deterministic behavior in clients running in the busy cluster: their memory footprint increases consistently after they have been up for roughly 24 hours, and it almost doubles from its usual value (usual case == RPC timeout disabled). After much investigation nothing concrete came out, and we had to put in a hack which keeps heap size under control even when RPC timeout is enabled. Also note that the same behavior is not observed in the 'not so busy' cluster. The patch is here: https://gist.github.com/1288023
[jira] [Commented] (HBASE-4486) Improve Javadoc for HTableDescriptor
[ https://issues.apache.org/jira/browse/HBASE-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131275#comment-13131275 ] Hudson commented on HBASE-4486: --- Integrated in HBase-TRUNK #2343 (See [https://builds.apache.org/job/HBase-TRUNK/2343/]) HBASE-4486 Improve Javadoc for HTableDescriptor stack : Files : * /hbase/trunk/CHANGES.txt * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java Improve Javadoc for HTableDescriptor Key: HBASE-4486 URL: https://issues.apache.org/jira/browse/HBASE-4486 Project: HBase Issue Type: Improvement Components: client, documentation Reporter: Akash Ashok Assignee: Akash Ashok Priority: Minor Fix For: 0.92.0 Attachments: HBase-4486-v2.patch, HBase-4486-v3.patch, HBase-4486.patch, HTableDescriptor-v2.html, HTableDescriptor.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4489) Better key splitting in RegionSplitter
[ https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131274#comment-13131274 ]
Hudson commented on HBASE-4489:
---
Integrated in HBase-TRUNK #2343 (See [https://builds.apache.org/job/HBase-TRUNK/2343/])
HBASE-4489 Better key splitting in RegionSplitter
nspiegelberg :
Files :
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestRegionSplitter.java
Better key splitting in RegionSplitter
--
Key: HBASE-4489
URL: https://issues.apache.org/jira/browse/HBASE-4489
Project: HBase
Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Dave Revell
Assignee: Dave Revell
Fix For: 0.94.0
Attachments: HBASE-4489-branch0.90-v1.patch, HBASE-4489-branch0.90-v2.patch, HBASE-4489-branch0.90-v3.patch, HBASE-4489-trunk-v1.patch, HBASE-4489-trunk-v2.patch, HBASE-4489-trunk-v3.patch, HBASE-4489-trunk-v4.patch, HBASE-4489-trunk-v5.patch
The RegionSplitter utility allows users to create a pre-split table from the command line or do a rolling split on an existing table. It supports pluggable split algorithms that implement the SplitAlgorithm interface. The only/default SplitAlgorithm is one that assumes keys fall in the range from ASCII string to ASCII string 7FFF. This is not a sane default, and seems useless to most users. Users are likely to be surprised by the fact that all the region splits occur in the byte range of ASCII characters. A better default split algorithm would be one that evenly divides the space of all bytes, which is what this patch does. Making a table with five regions would split at \x33\x33..., \x66\x66, \x99\x99..., \xCC\xCC..., and \xFF\xFF.
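As a quick check of the five-region figures quoted in the HBASE-4489 description, the evenly spaced boundaries can be worked out over a two-byte prefix of the key space. The class and method below are hypothetical and only reproduce the arithmetic; they are not the SplitAlgorithm from the patch, which operates on full keys.

```java
public class UniformSplitPoints {
    /** i-th of n evenly spaced boundaries over the two-byte space 0x0000..0xFFFF. */
    public static int boundary(int i, int n) {
        return (int) ((long) i * 0xFFFF / n);
    }

    public static void main(String[] args) {
        int regions = 5;
        for (int i = 1; i <= regions; i++) {
            int b = boundary(i, regions);
            // Prints the boundary in the \xHH\xHH notation used in the issue text.
            System.out.printf("\\x%02X\\x%02X%n", b >> 8, b & 0xFF);
        }
    }
}
```

Because 0xFFFF divides evenly by 5 (0x3333 each), the boundaries come out as the repeating-byte patterns listed in the description, with the last one marking the end of the key space.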
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131286#comment-13131286 ] jirapos...@reviews.apache.org commented on HBASE-4377: -- bq. On 2011-10-07 21:01:16, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 302 bq. https://reviews.apache.org/r/2287/diff/1/?file=48780#file48780line302 bq. bq. Naming rd as rootdir would make the code more readable. done bq. On 2011-10-07 21:01:16, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 446 bq. https://reviews.apache.org/r/2287/diff/1/?file=48780#file48780line446 bq. bq. I think LOG.info() should be used here. I think it is still a problem, but we are in an ok state. I've changed it to 'warn' instead. bq. On 2011-10-07 21:01:16, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 276 bq. https://reviews.apache.org/r/2287/diff/1/?file=48780#file48780line276 bq. bq. Minor suggestion: IOException may occur more than once. Would logging all such IOException's before bailing out make user experience better ? bq. Basically we just need to track the last such IOException in a variable and bail out at line 283 if the variable isn't null. Updated to track all IOE's and throw MultipleIOException. bq. On 2011-10-07 21:01:16, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 346 bq. https://reviews.apache.org/r/2287/diff/1/?file=48780#file48780line346 bq. bq. I think rebuildMeta() should check the return value from generatePuts(). bq. Otherwise we would encounter NPE at line 405 below. see below bq. On 2011-10-07 21:01:16, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 407 bq. https://reviews.apache.org/r/2287/diff/1/?file=48780#file48780line407 bq. bq. false should be returned if puts is null. 
So I believe that checkHdfs and loadTableInfo and the error checking happens before and bails out after suggestFixes(). But sure, it doesn't really hurt here to be even more defensive. bq. On 2011-10-07 21:01:16, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 378 bq. https://reviews.apache.org/r/2287/diff/1/?file=48780#file48780line378 bq. bq. Do you plan to add this logic in another JIRA ? I have a patch that adds this but it is having problems on the trunk side. I'd like to get this in first and then we'll deal with that next. New issue filed HBASE-4632. - jmhsieh --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2287/#review2440 --- On 2011-10-07 19:04:44, jmhsieh wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2287/ bq. --- bq. bq. (Updated 2011-10-07 19:04:44) bq. bq. bq. Review request for hbase and Ted Yu. bq. bq. bq. Summary bq. --- bq. bq. Backport to 0.90 bq. bq. commit 89862b73c6358e27220b87b0362599d86ab0fe4a bq. Author: Jonathan Hsieh j...@cloudera.com bq. Date: Wed Sep 28 10:18:11 2011 -0700 bq. bq. HBASE-4377 [hbck] Offline rebuild .META. from fs data only bq. bq. bq. bq. This addresses bug HBASE-4377. bq. https://issues.apache.org/jira/browse/HBASE-4377 bq. bq. bq. Diffs bq. - bq. bq.src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java ef246c3 bq.src/main/java/org/apache/hadoop/hbase/util/Bytes.java 13ad026 bq.src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java b04aab6 bq.src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION bq.src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java f792720 bq.src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION bq. src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java PRE-CREATION bq. 
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java PRE-CREATION bq. src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java PRE-CREATION bq. src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/2287/diff bq. bq. bq. Testing bq. --- bq. bq. Note, the assertion test result is different in the failure cases due to HBASE-451 changes. (0.90 returns 0 tables since it does a meta scan on empty meta, trunk branch looks at hdfs dirs, and returns 1). bq. bq. This version
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131301#comment-13131301 ] jirapos...@reviews.apache.org commented on HBASE-4377: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2287/ --- (Updated 2011-10-20 03:21:33.922683) Review request for hbase and Ted Yu. Changes --- Addressed comments * added more logging and better error message * Handled exit properly. Summary --- Backport to 0.90 commit 89862b73c6358e27220b87b0362599d86ab0fe4a Author: Jonathan Hsieh j...@cloudera.com Date: Wed Sep 28 10:18:11 2011 -0700 HBASE-4377 [hbck] Offline rebuild .META. from fs data only This addresses bug HBASE-4377. https://issues.apache.org/jira/browse/HBASE-4377 Diffs (updated) - src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java f792720 src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java ef246c3 src/main/java/org/apache/hadoop/hbase/util/Bytes.java 13ad026 src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java b04aab6 src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java PRE-CREATION Diff: https://reviews.apache.org/r/2287/diff Testing --- Note, the assertion test result is different in the failure cases due to HBASE-451 changes. (0.90 returns 0 tables since it does a meta scan on empty meta, trunk branch looks at hdfs dirs, and returns 1). This version passes after HBASE-4508 (backport HBASE-3777 to 0.90 branch) is applied. 
I believe if that patch is not applied, I could modify the test code to force some explicit HConnection deletions. Thanks, jmhsieh [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, hbase-4377-trunk.v2.patch In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could give options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
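The HBASE-4377 review thread settles on tracking every IOException and reporting them together rather than bailing out on the first one (the reply names Hadoop's MultipleIOException). A rough plain-Java stand-in for that pattern is sketched below; it does not use the Hadoop class, and the IoTask interface, the runAll name, and the use of suppressed exceptions are all assumptions made so the example runs without Hadoop on the classpath.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CollectIoErrors {
    /** Hypothetical unit of work that may fail with an IOException. */
    public interface IoTask {
        void run() throws IOException;
    }

    /**
     * Runs every task, remembers each failure, and only throws at the end.
     * A single failure is rethrown as-is; several are attached to one
     * summary exception as suppressed exceptions (stand-in for a
     * multi-exception type).
     */
    public static void runAll(List<IoTask> tasks) throws IOException {
        List<IOException> failures = new ArrayList<>();
        for (IoTask task : tasks) {
            try {
                task.run();
            } catch (IOException e) {
                failures.add(e);     // keep going so the user sees every problem
            }
        }
        if (failures.isEmpty()) {
            return;
        }
        if (failures.size() == 1) {
            throw failures.get(0);
        }
        IOException combined = new IOException(failures.size() + " I/O failures");
        for (IOException e : failures) {
            combined.addSuppressed(e);
        }
        throw combined;
    }

    public static void main(String[] args) {
        List<IoTask> tasks = Arrays.<IoTask>asList(
            () -> { },
            () -> { throw new IOException("region A unreadable"); },
            () -> { throw new IOException("region B unreadable"); });
        try {
            runAll(tasks);
        } catch (IOException e) {
            System.out.println(e.getMessage() + ", suppressed: " + e.getSuppressed().length);
        }
    }
}
```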
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131302#comment-13131302 ] Jonathan Hsieh commented on HBASE-4377: --- 0.90 version requires HBASE-4508 [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, hbase-4377-trunk.v2.patch In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could give options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-4377: -- Attachment: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, hbase-4377-trunk.v2.patch In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could give options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131303#comment-13131303 ] jirapos...@reviews.apache.org commented on HBASE-4377: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2126/ --- (Updated 2011-10-20 03:22:39.708313) Review request for hbase, Michael Stack and Andrew Purtell. Changes --- Ported updates from comments from 0.90 branch to trunk/0.92 branch. Summary --- commit fbf82c17be6b3ecca5a981f5270cf93aac26e479 Author: Jonathan Hsieh j...@cloudera.com Date: Wed Sep 28 10:18:11 2011 -0700 HBASE-4377 [hbck] Offline rebuild .META. from fs data only This patch rebuilds a new .META. table by reading all the .regioninfo files in the hbase main directory. It depends on the yet-to-be-committed HBASE-4515 (either my version or Gary's version), HBASE-4509, and HBASE-4506. Some follow-on work includes backporting to 0.90, auto-patching true holes, and adding documentation. This addresses bug HBASE-4377. 
https://issues.apache.org/jira/browse/HBASE-4377 Diffs (updated) - src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java b9c850d src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 7409c9c src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java f5be448 src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java PRE-CREATION Diff: https://reviews.apache.org/r/2126/diff Testing --- An earlier version of this code (backported to 0.90) was used to diagnose and repair a cluster that had 2700 inconsistencies due to failed splits (the cluster was underprovisioned memory-wise, and on restart, some regions would start splitting and then die due to OOMEs). This was not actually used on a live cluster -- it was used to reconstruct a .META. from .regioninfo files laid out in hbase's directory structure. Note also that this is not an automatic fix -- whenever any problems are found, this bails out but dumps info on holes, suggests some fixes, and displays sets of overlapping regions. It is up to the user to merge regions, to create .regioninfo files to plug holes, and to do any potentially data-losing operations. The tests demonstrate current expected behavior -- rebuild meta if things line up, and fail without making modifications if holes or overlaps exist. Thanks, jmhsieh [hbck] Offline rebuild .META. from fs data only. 
Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, hbase-4377-trunk.v2.patch In a worst-case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow-on work could give options to merge or select regions that overlap, or do online rebuilds.
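The behavior described in the review request above (rebuild meta only if the regions line up, otherwise report holes and overlaps and change nothing) can be illustrated with a rough sketch. This is not code from the patch: the class RegionChainCheck, the Range type, and findProblems are hypothetical names, and keys are modeled as plain Strings with "" standing for an open table boundary, which is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class RegionChainCheck {

    /** Hypothetical stand-in for the key range recorded in one .regioninfo file. */
    static final class Range {
        final String start; // "" means "start of table" (assumption for this sketch)
        final String end;   // "" means "end of table" (assumption for this sketch)

        Range(String start, String end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Returns one message per hole or overlap; an empty list means the ranges line up. */
    static List<String> findProblems(List<Range> regions) {
        List<Range> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparing((Range r) -> r.start));

        List<String> problems = new ArrayList<>();
        String covered = "";          // keys below this point are already covered
        boolean coveredToEnd = false; // true once a region reaches the end of the table

        for (Range r : sorted) {
            int cmp = r.start.compareTo(covered);
            if (coveredToEnd || cmp < 0) {
                problems.add("overlap at start key '" + r.start + "'");
            } else if (cmp > 0) {
                problems.add("hole between '" + covered + "' and '" + r.start + "'");
            }
            if (r.end.isEmpty()) {
                coveredToEnd = true;
            } else if (r.end.compareTo(covered) > 0) {
                covered = r.end;
            }
        }
        if (!coveredToEnd) {
            problems.add("hole after '" + covered + "'");
        }
        return problems;
    }

    public static void main(String[] args) {
        // One hole (b..c) and one overlap (k starts before m ends).
        List<Range> regions = Arrays.asList(
            new Range("", "b"), new Range("c", "m"), new Range("k", ""));
        for (String problem : findProblems(regions)) {
            System.out.println(problem);
        }
    }
}
```

A caller following the review's description would write a new meta table only when the returned list is empty, and otherwise print the problems and exit without modifying anything.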
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131330#comment-13131330 ] jirapos...@reviews.apache.org commented on HBASE-4377: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2287/ --- (Updated 2011-10-20 03:46:53.527042) Review request for hbase and Ted Yu. Changes --- New version for 0.90 that does not require HBASE-3777 / HBASE-4508. Summary --- Backport to 0.90 commit 89862b73c6358e27220b87b0362599d86ab0fe4a Author: Jonathan Hsieh j...@cloudera.com Date: Wed Sep 28 10:18:11 2011 -0700 HBASE-4377 [hbck] Offline rebuild .META. from fs data only This addresses bug HBASE-4377. https://issues.apache.org/jira/browse/HBASE-4377 Diffs (updated) - src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java ef246c3 src/main/java/org/apache/hadoop/hbase/util/Bytes.java 13ad026 src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java b04aab6 src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java f792720 src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java PRE-CREATION Diff: https://reviews.apache.org/r/2287/diff Testing --- Note, the assertion test result is different in the failure cases due to HBASE-451 changes. (0.90 returns 0 tables since it does a meta scan on empty meta, trunk branch looks at hdfs dirs, and returns 1). This version passes after HBASE-4508 (backport HBASE-3777 to 0.90 branch) is applied. 
I believe if that patch is not applied, I could modify the test code to force some explicit HConnection deletions. Thanks, jmhsieh [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, hbase-4377-trunk.v2.patch In a worst-case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow-on work could give options to merge or select regions that overlap, or do online rebuilds.
[jira] [Commented] (HBASE-4532) Avoid top row seek by dedicated bloom filter for delete family bloom filter
[ https://issues.apache.org/jira/browse/HBASE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131331#comment-13131331 ] jirapos...@reviews.apache.org commented on HBASE-4532: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2393/ --- (Updated 2011-10-20 03:46:26.190655) Review request for hbase, Dhruba Borthakur, Michael Stack, Mikhail Bautin, Pritam Damania, Prakash Khemani, Amitanand Aiyer, Kannan Muthukkaruppan, Jerry Chen, Liyin Tang, Karthik Ranganathan, and Nicolas Spiegelberg. Summary --- HBASE-4469 avoids the top row seek operation if row-col bloom filter is enabled. This jira tries to avoid the top row seek in all cases by creating a dedicated bloom filter only for delete family. The only subtle use case is when we are interested in the top row with empty column. For example, we are interested in row1/cf1:/1/put. So we seek to the top row: row1/cf1:/MAX_TS/MAXIMUM. And the delete family bloom filter will say there is NO delete family. Then it will avoid the top row seek and return a fake kv, which is the last kv for this row (createLastOnRowCol). In this way, we have already missed the real kv we are interested in. The solution for the above problem is to disable this optimization if we are trying to GET/SCAN a row with empty column. This patch is rebased on 0.89-fb. But it should be the same for apache-trunk as well. I will submit the patch for apache-trunk later. This addresses bug HBASE-4532. 
https://issues.apache.org/jira/browse/HBASE-4532 Diffs - src/main/java/org/apache/hadoop/hbase/KeyValue.java 93538bb src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java 9a79a74 src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 5d9b518 src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java 6cf7cce src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1f78dd4 src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 3c34f86 src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java 2e1d23a src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java c4b60e9 src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 92070b3 src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java e4dfc2e src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java ebb360c src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 8814812 src/main/java/org/apache/hadoop/hbase/util/BloomFilterFactory.java fb4f2df src/test/java/org/apache/hadoop/hbase/regionserver/TestBlocksRead.java b8bcc65 src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java 48e9163 src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 0eca9b8 Diff: https://reviews.apache.org/r/2393/diff Testing (updated) --- Passed all the unit tests Thanks, Liyin Avoid top row seek by dedicated bloom filter for delete family bloom filter --- Key: HBASE-4532 URL: https://issues.apache.org/jira/browse/HBASE-4532 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: D27.1.patch, D27.1.patch HBASE-4469 avoids the top row seek operation if row-col bloom filter is enabled. This jira tries to avoid the top row seek in all cases by creating a dedicated bloom filter only for delete family. The only subtle use case is when we are interested in the top row with empty column. For example, we are interested in row1/cf1:/1/put. 
So we seek to the top row: row1/cf1:/MAX_TS/MAXIMUM. And the delete family bloom filter will say there is NO delete family. Then it will avoid the top row seek and return a fake kv, which is the last kv for this row (createLastOnRowCol). In this way, we have already missed the real kv we are interested in. The solution for the above problem is to disable this optimization if we are trying to GET/SCAN a row with empty column.
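The decision described in the HBASE-4532 summary above can be sketched roughly as follows. This is only an illustration of the stated rule, not the patch itself: DeleteFamilySeekSketch and canSkipTopRowSeek are hypothetical names, and an exact java.util.Set stands in for the delete family bloom filter (a real bloom filter can also report false positives, which would only make the check more conservative).

```java
import java.util.Collections;
import java.util.Set;

final class DeleteFamilySeekSketch {

    /**
     * Decides whether the seek to the top of a row can be skipped.
     *
     * rowsWithDeleteFamily is an exact Set standing in for the delete family
     * bloom filter (assumption for this sketch).
     */
    static boolean canSkipTopRowSeek(Set<String> rowsWithDeleteFamily,
                                     String row,
                                     boolean wantsEmptyColumn) {
        // A request for the empty column targets a kv near the top of the row,
        // so skipping ahead to a fake last-on-row kv would miss it.
        if (wantsEmptyColumn) {
            return false;
        }
        // Skip only when the filter says there is no delete family for this row.
        return !rowsWithDeleteFamily.contains(row);
    }

    public static void main(String[] args) {
        Set<String> deleted = Collections.singleton("row2");
        System.out.println(canSkipTopRowSeek(deleted, "row1", false));
        System.out.println(canSkipTopRowSeek(deleted, "row2", false));
        System.out.println(canSkipTopRowSeek(deleted, "row1", true));
    }
}
```

The third call shows the subtle case from the summary: even though row1 has no delete family marker, the optimization is disabled because the request is for the empty column.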
[jira] [Updated] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-4377: -- Attachment: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch 0.90 v4 works without having HBASE-4508/HBASE-3777 on the 0.90 branch. [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, hbase-4377-trunk.v2.patch In a worst-case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow-on work could give options to merge or select regions that overlap, or do online rebuilds.