date:20111015

[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows

2011-10-15 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128355#comment-13128355
 ] 

Lars Hofhansl commented on HBASE-4536:
--

Fair enough. Was picking up on Stack's suggestion to have this on by default. 
Just means the code has to distinguish between minor and major compaction 
scans, raw scans, and normal user scans, all of which can be for a store with 
keep_delete_cells enabled or not.

Thinking about a ScanConfig (or ScanInfo) as a static inner class of Store. That 
would capture all immutable scan-relevant information about the Store (min/max 
versions, family name, ttl, keep_deletes, comparator). (A MatcherConfig with all 
the information would need to be mutable and created or changed for every scan.)
And then maybe a ScanType enum to distinguish between compaction scans and 
user-initiated scans.
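
As a rough illustration of the idea above, an immutable per-store settings 
object plus a scan-type enum might look like the Java sketch below. The class 
shape, the field names, the enum constants and the retainsDeleteMarkers policy 
are all assumptions made for illustration; none of them is taken from the patch.

```java
import java.util.Comparator;

/** Hypothetical scan categories; the constant names are illustrative only. */
enum ScanType {
  MINOR_COMPACT,
  MAJOR_COMPACT,
  RAW_USER_SCAN,
  USER_SCAN
}

/** Immutable per-store scan settings, built once and shared by every scan. */
final class ScanInfo {
  private final String family;
  private final int minVersions;
  private final int maxVersions;
  private final long ttlMillis;
  private final boolean keepDeletedCells;
  private final Comparator<byte[]> comparator;

  ScanInfo(String family, int minVersions, int maxVersions, long ttlMillis,
      boolean keepDeletedCells, Comparator<byte[]> comparator) {
    this.family = family;
    this.minVersions = minVersions;
    this.maxVersions = maxVersions;
    this.ttlMillis = ttlMillis;
    this.keepDeletedCells = keepDeletedCells;
    this.comparator = comparator;
  }

  String getFamily() { return family; }
  int getMinVersions() { return minVersions; }
  int getMaxVersions() { return maxVersions; }
  long getTtlMillis() { return ttlMillis; }
  boolean getKeepDeletedCells() { return keepDeletedCells; }
  Comparator<byte[]> getComparator() { return comparator; }

  /**
   * Assumed policy, for illustration: raw scans and minor compactions always
   * keep delete markers, major compactions keep them only when the store is
   * configured to, and normal user scans never return them.
   */
  boolean retainsDeleteMarkers(ScanType type) {
    switch (type) {
      case RAW_USER_SCAN:
      case MINOR_COMPACT:
        return true;
      case MAJOR_COMPACT:
        return keepDeletedCells;
      default:
        return false;
    }
  }
}
```

Because every field is final, one instance per Store could be handed to all 
scans without copying; only the ScanType would vary from scan to scan.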

What about Stack's suggestion in the review to include delete cells in the 
version count? (The only strange part with that is that the family markers are 
*always* in the beginning).
Right now a delete cell does not increase the version count and instead 
"inherits" the version of the last put.
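
The two counting rules can be compared with a small Java sketch. It is a 
simplified model under stated assumptions: cells of one column are visited 
newest first, only two cell kinds exist, and the names VersionCountSketch, 
CellType and versionNumbers are hypothetical rather than HBase code.

```java
import java.util.ArrayList;
import java.util.List;

final class VersionCountSketch {
  /** Simplified cell kinds; real HBase has several delete marker types. */
  enum CellType { PUT, DELETE }

  /**
   * Assigns a version number to each cell, visiting cells newest first.
   * With deletesCount == false a delete marker reuses ("inherits") the number
   * of the most recent put seen so far; with deletesCount == true it is
   * counted like a put.
   */
  static List<Integer> versionNumbers(List<CellType> cellsNewestFirst,
      boolean deletesCount) {
    List<Integer> numbers = new ArrayList<>();
    int version = 0;
    for (CellType type : cellsNewestFirst) {
      if (type == CellType.PUT || deletesCount) {
        version++;
      }
      numbers.add(version);
    }
    return numbers;
  }
}
```

For the sequence PUT, DELETE, PUT the sketch yields 1, 1, 2 under the current 
rule and 1, 2, 3 when delete cells are counted. A marker that sorts ahead of 
every put gets 0 under the current rule in this model.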


> Allow CF to retain deleted rows
> ---
>
> Key: HBASE-4536
> URL: https://issues.apache.org/jira/browse/HBASE-4536
> Project: HBase
>  Issue Type: New Feature
>  Components: regionserver
>Affects Versions: 0.92.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
>
> Parent allows for a cluster to retain rows for a TTL or keep a minimum number 
> of versions.
> However, if a client deletes a row, all versions older than the delete 
> tombstone will be removed at the next major compaction (and even at memstore 
> flush - see HBASE-4241).
> There should be a way to retain those versions to guard against software error.
> I see two options here:
> 1. Add a new flag to HColumnDescriptor. Something like "RETAIN_DELETED".
> 2. Fold this into the parent change. I.e. keep the minimum number of 
> versions even past the delete marker.
> #1 would allow for more flexibility. #2 comes somewhat naturally with parent 
> (from a user viewpoint)
> Comments? Any other options?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-4563) When split doing this.parent.close(false) occurs error,it'll cause the splited region cann't write & read

2011-10-15 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128350#comment-13128350
 ] 

Lars Hofhansl commented on HBASE-4563:
--

+1 on patches (pending all tests pass and Ted's suggested formatting fix).

> When split doing this.parent.close(false) occurs error,it'll cause the 
> splited region cann't write & read
> -
>
> Key: HBASE-4563
> URL: https://issues.apache.org/jira/browse/HBASE-4563
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.4, 0.92.0
>Reporter: bluedavy
>Priority: Blocker
> Fix For: 0.90.5
>
> Attachments: HBASE-4563-0.90.patch, HBASE-4563-0.92.patch, 
> HBASE-4563-trunk.patch, test-4563-0.90.txt, test-4563-0.92.txt, 
> test-4563-trunk.txt
>
>
> Follow the steps below to reproduce the problem:
> 1. Change SplitTransaction.java as below to mock an HDFS error.
>{code:title=SplitTransaction.java|borderStyle=solid}
>   List hstoreFilesToSplit = this.parent.close(false);
>   // Injected failure that mimics an HDFS error while closing store files.
>   throw new IOException("some unexpected error in close store files");
>{code} 
> 2. Update the regionserver code and restart;
> 3. Create a table & put some data into the table;
> 4. Split the table;
> 5. Scan the table; it will fail.
> We can fix the bug by applying the patch.


[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-15 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128348#comment-13128348
 ] 

Lars Hofhansl commented on HBASE-4562:
--

Are the attached patches the full story?
I see only the PONR (point of no return) moved, but no code to abort the server 
if it times out against .META. Where is that code?

In the trunk patch the comment seems to be mangled. The big PONR comment should 
be moved above the comment speculating that the PONR should be moved here, just 
like in the 0.90 and 0.92 patches... 

In fact should that comment now be removed in all patches?
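
Why the position of the PONR matters can be shown with a minimal Java sketch 
of a step journal. It assumes that steps are recorded as they complete and 
that a rollback which finds the PONR entry gives up and tells the caller to 
abort the server. SplitJournalSketch, Step and the method names are 
hypothetical, and this is not the code in the attached patches.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical journal of split steps; all names are illustrative only. */
final class SplitJournalSketch {
  enum Step { CREATED_SPLIT_DIR, CLOSED_PARENT, PONR }

  // Most recent step sits on top, so rollback undoes steps in reverse order.
  private final Deque<Step> journal = new ArrayDeque<>();

  void record(Step step) {
    journal.push(step);
  }

  /**
   * Walks the journal backwards.
   *
   * @return true if every recorded step could be undone; false if the point
   *         of no return had been reached, in which case the caller is
   *         expected to abort the server instead of continuing.
   */
  boolean rollback() {
    while (!journal.isEmpty()) {
      Step step = journal.pop();
      if (step == Step.PONR) {
        return false;
      }
      // A real implementation would undo the side effects of 'step' here.
    }
    return true;
  }
}
```

In this model, recording the PONR entry before a risky call turns a failure of 
that call into an abort, while recording it afterwards leaves the failure 
eligible for rollback.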


> When split doing offlineParentInMeta encounters error, it'll cause data loss
> 
>
> Key: HBASE-4562
> URL: https://issues.apache.org/jira/browse/HBASE-4562
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.4
>Reporter: bluedavy
>Priority: Blocker
> Fix For: 0.90.5
>
> Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, 
> HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, 
> test-4562-trunk.txt
>
>
> Follow the steps below to reproduce the problem:
> 1. Change SplitTransaction.java as below to mock a timeout error.
>{code:title=SplitTransaction.java|borderStyle=solid}
>   if (!testing) {
>     MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
>       this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
>     // Injected failure that mimics a timeout right after the .META. edit.
>     throw new IOException("some unexpected error in split");
>   }
>{code} 
> 2. Update the regionserver code and restart;
> 3. Create a table & put some data into the table;
> 4. Split the table;
> 5. Kill the regionserver hosting the table;
> 6. Wait some time after the master's ServerShutdownHandler.process executes, 
> then scan the table; you'll find that the data written before is lost.
> We can fix the bug by applying the patch.


[jira] [Commented] (HBASE-4563) When split doing this.parent.close(false) occurs error,it'll cause the splited region cann't write & read

2011-10-15 Thread Ted Yu (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128327#comment-13128327
 ] 

Ted Yu commented on HBASE-4563:
---

Thanks for the detailed test report.
+1 on patches.
There is a formatting issue:
{code}
+   } finally {
+  this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
+   }
{code}
The right curly brace before finally should be moved left by 3 spaces.
Same with the closing right curly brace.

See HBASE-3678 for the Eclipse formatter.
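
Judging only from the fragment quoted above, the patch appears to record the 
CLOSED_PARENT_REGION journal entry in a finally block, so the entry exists 
even when the close call throws. A self-contained Java sketch of that pattern 
follows. The class CloseJournalSketch, the closeParent stand-in and the fail 
flag are hypothetical; the real SplitTransaction code is not reproduced here.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class CloseJournalSketch {
  enum JournalEntry { CLOSED_PARENT_REGION }

  final List<JournalEntry> journal = new ArrayList<>();

  /** Stand-in for this.parent.close(false); throws to mimic an HDFS error. */
  private void closeParent(boolean fail) throws IOException {
    if (fail) {
      throw new IOException("some unexpected error in close store files");
    }
  }

  void closeAndJournal(boolean fail) throws IOException {
    try {
      closeParent(fail);
    } finally {
      // Runs on success and on failure, so a later rollback can see that the
      // parent region may have been closed and needs to be reopened.
      this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
    }
  }
}
```

The exception still propagates to the caller; the finally block only 
guarantees that the journal reflects the attempted close.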


[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-15 Thread Ted Yu (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128326#comment-13128326
 ] 

Ted Yu commented on HBASE-4562:
---

+1 on patches.


[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-15 Thread bluedavy (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128321#comment-13128321
 ] 

bluedavy commented on HBASE-4562:
-

I attached patches & test reports for 0.90.4, 0.92 and trunk.


[jira] [Updated] (HBASE-4563) When split doing this.parent.close(false) occurs error,it'll cause the splited region cann't write & read

2011-10-15 Thread bluedavy (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4563:


Attachment: test-4563-trunk.txt
test-4563-0.92.txt
test-4563-0.90.txt
HBASE-4563-trunk.patch
HBASE-4563-0.92.patch
HBASE-4563-0.90.patch


[jira] [Updated] (HBASE-4563) When split doing this.parent.close(false) occurs error,it'll cause the splited region cann't write & read

2011-10-15 Thread bluedavy (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4563:


Attachment: (was: HBASE-4563for0.92.patch)


[jira] [Updated] (HBASE-4563) When split doing this.parent.close(false) occurs error,it'll cause the splited region cann't write & read

2011-10-15 Thread bluedavy (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4563:


Attachment: (was: HBASE-4563fortrunk.patch)


[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-15 Thread bluedavy (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4562:


Attachment: test-4562-trunk.txt
test-4562-0.92.txt
test-4562-0.90.txt
HBASE-4562-trunk.patch
HBASE-4562-0.92.patch
HBASE-4562-0.90.patch


[jira] [Updated] (HBASE-4563) When split doing this.parent.close(false) occurs error,it'll cause the splited region cann't write & read

2011-10-15 Thread bluedavy (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4563:


Attachment: (was: HBASE-4563.patch)


[jira] [Updated] (HBASE-4563) When split doing this.parent.close(false) occurs error,it'll cause the splited region cann't write & read

2011-10-15 Thread bluedavy (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4563:


Attachment: (was: HBASE-4563-test.report.txt)


[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-15 Thread bluedavy (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4562:


Attachment: (was: HBASE-4562.patch)


[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-15 Thread bluedavy (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4562:


Attachment: (was: HBASE-4562for0.92.patch)


[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-15 Thread bluedavy (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4562:


Attachment: (was: HBASE-4562fortrunk.patch)


[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-15 Thread bluedavy (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4562:


Attachment: (was: HBASE-4562-test.report.txt)


[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows

2011-10-15 Thread Jonathan Gray (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128317#comment-13128317
 ] 

Jonathan Gray commented on HBASE-4536:
--

bq. I think this new feature should not be the default behavior.

+1


[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows

2011-10-15 Thread Ted Yu (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128306#comment-13128306
 ] 

Ted Yu commented on HBASE-4536:
---

I think this new feature should not be the default behavior.

> Allow CF to retain deleted rows
> ---
>
> Key: HBASE-4536
> URL: https://issues.apache.org/jira/browse/HBASE-4536
> Project: HBase
>  Issue Type: New Feature
>  Components: regionserver
>Affects Versions: 0.92.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
>
> Parent allows for a cluster to retain rows for a TTL or keep a minimum number 
> of versions.
> However, if a client deletes a row, all versions older than the delete 
> tombstone will be removed at the next major compaction (and even at memstore 
> flush - see HBASE-4241).
> There should be a way to retain those versions to guard against software error.
> I see two options here:
> 1. Add a new flag to HColumnDescriptor. Something like "RETAIN_DELETED".
> 2. Fold this into the parent change, i.e. keep the minimum number of 
> versions even past the delete marker.
> #1 would allow for more flexibility. #2 comes somewhat naturally with parent 
> (from a user viewpoint).
> Comments? Any other options?


[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows

2011-10-15 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128305#comment-13128305
 ] 

Lars Hofhansl commented on HBASE-4536:
--

@Ted, I can change the order in which the checks are made. MIN_VERSIONS makes no 
sense without TTL, though.

@Ted and @Stack: If we indeed make all of this the default behavior, a lot of 
the if statements and parameters in ScanQueryMatcher would go away, simplifying 
the logic a LOT.
That would mean that every store would keep deleted rows (until VERSIONS or 
TTL removes them), and scanners would always be able to peek behind delete 
markers.
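
To make the "keep deleted rows, peek behind delete markers" behavior concrete, here is a minimal sketch. It is a toy model and not HBase code: the class KeepDeletedSketch, its Cell type, and the scan/compact methods are all hypothetical names, a delete marker is simplified to "masks every put at or below its timestamp", and only TTL (not VERSIONS) is modeled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not HBase code: the cells of one column, newest first.
final class KeepDeletedSketch {

    /** A put carrying a value, or a delete marker (value is null). */
    static final class Cell {
        final long ts;
        final boolean isDelete;
        final String value;

        Cell(long ts, boolean isDelete, String value) {
            this.ts = ts;
            this.isDelete = isDelete;
            this.value = value;
        }
    }

    /**
     * Normal scans hide delete markers and every put at or below the newest
     * marker's timestamp. Raw scans return all cells, markers included.
     */
    static List<Cell> scan(List<Cell> newestFirst, boolean raw) {
        List<Cell> out = new ArrayList<>();
        boolean sawDelete = false;
        long deletedUpTo = Long.MIN_VALUE;
        for (Cell c : newestFirst) {
            if (raw) {
                out.add(c);
            } else if (c.isDelete) {
                sawDelete = true;
                deletedUpTo = Math.max(deletedUpTo, c.ts);
            } else if (!sawDelete || c.ts > deletedUpTo) {
                out.add(c);
            }
        }
        return out;
    }

    /**
     * Major compaction. With keepDeleted, markers and masked puts survive
     * until the TTL expires them; without it, only visible puts survive.
     */
    static List<Cell> compact(List<Cell> newestFirst, boolean keepDeleted,
                              long now, long ttl) {
        List<Cell> out = new ArrayList<>();
        for (Cell c : scan(newestFirst, keepDeleted)) {
            if (now - c.ts <= ttl) {
                out.add(c);
            }
        }
        return out;
    }
}
```

With a put at ts=30, a delete marker at ts=20 and a put at ts=10, a normal scan in this model sees only the put at 30, a raw scan sees all three cells, and a compaction with keepDeleted retains the masked put until its TTL runs out.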


> Allow CF to retain deleted rows
> ---
>
> Key: HBASE-4536
> URL: https://issues.apache.org/jira/browse/HBASE-4536
> Project: HBase
>  Issue Type: New Feature
>  Components: regionserver
>Affects Versions: 0.92.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
>
> Parent allows for a cluster to retain rows for a TTL or keep a minimum number 
> of versions.
> However, if a client deletes a row, all versions older than the delete 
> tombstone will be removed at the next major compaction (and even at memstore 
> flush - see HBASE-4241).
> There should be a way to retain those versions to guard against software error.
> I see two options here:
> 1. Add a new flag to HColumnDescriptor. Something like "RETAIN_DELETED".
> 2. Fold this into the parent change, i.e. keep the minimum number of 
> versions even past the delete marker.
> #1 would allow for more flexibility. #2 comes somewhat naturally with parent 
> (from a user viewpoint).
> Comments? Any other options?


[jira] [Updated] (HBASE-4489) Better key splitting in RegionSplitter

2011-10-15 Thread Dave Revell (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Revell updated HBASE-4489:
---

Release Note: The split algorithm used by RegionSplitter is now a required 
parameter. Previously there was one split algorithm called MD5StringSplit, 
which was the default. MD5StringSplit has been renamed to HexStringSplit, and 
tweaked so that its maximum key is now "FFFFFFFF" instead of "7FFFFFFF". A new 
split algorithm UniformSplit has been added which treats keys as arbitrary 
bytes.

> Better key splitting in RegionSplitter
> --
>
> Key: HBASE-4489
> URL: https://issues.apache.org/jira/browse/HBASE-4489
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.4
>Reporter: Dave Revell
>Assignee: Dave Revell
> Fix For: 0.90.5
>
> Attachments: HBASE-4489-branch0.90-v1.patch, 
> HBASE-4489-branch0.90-v2.patch, HBASE-4489-branch0.90-v3.patch, 
> HBASE-4489-trunk-v1.patch, HBASE-4489-trunk-v2.patch, 
> HBASE-4489-trunk-v3.patch, HBASE-4489-trunk-v4.patch, 
> HBASE-4489-trunk-v5.patch
>
>
> The RegionSplitter utility allows users to create a pre-split table from the 
> command line or do a rolling split on an existing table. It supports 
> pluggable split algorithms that implement the SplitAlgorithm interface. The 
> only/default SplitAlgorithm is one that assumes keys fall in the range from 
> ASCII string "00000000" to ASCII string "7FFFFFFF". This is not a sane 
> default, and seems useless to most users. Users are likely to be surprised by 
> the fact that all the region splits occur in the byte range of ASCII 
> characters.
> A better default split algorithm would be one that evenly divides the space 
> of all bytes, which is what this patch does. Making a table with five regions 
> would split at \x33\x33..., \x66\x66..., \x99\x99..., \xCC\xCC..., and 
> \xFF\xFF....
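
The idea of evenly dividing the space of all bytes can be sketched as follows. This is a rough illustration under stated assumptions, not the UniformSplit code from the patch: UniformSplitSketch, splitPoints, toFixedWidth and toHex are hypothetical names, and keys are treated as fixed-width unsigned big-endian integers.

```java
import java.math.BigInteger;

// Hypothetical sketch, not the patch's UniformSplit implementation.
final class UniformSplitSketch {

    /** Returns the numRegions - 1 interior boundaries, each keyWidth bytes wide. */
    static byte[][] splitPoints(int numRegions, int keyWidth) {
        if (numRegions < 2 || keyWidth < 1) {
            throw new IllegalArgumentException("need at least 2 regions and 1 byte");
        }
        // Largest key of this width: 0xFF repeated keyWidth times.
        BigInteger max = BigInteger.ONE.shiftLeft(8 * keyWidth).subtract(BigInteger.ONE);
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            BigInteger point = max.multiply(BigInteger.valueOf(i))
                                  .divide(BigInteger.valueOf(numRegions));
            splits[i - 1] = toFixedWidth(point, keyWidth);
        }
        return splits;
    }

    /** Left-pads with zeros, or drops BigInteger's leading sign byte, to fit width. */
    static byte[] toFixedWidth(BigInteger value, int width) {
        byte[] raw = value.toByteArray();
        byte[] out = new byte[width];
        int copy = Math.min(raw.length, width);
        System.arraycopy(raw, raw.length - copy, out, width - copy, copy);
        return out;
    }

    /** Renders a key in the \xNN notation used in the description above. */
    static String toHex(byte[] key) {
        StringBuilder sb = new StringBuilder();
        for (byte b : key) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (byte[] p : splitPoints(5, 2)) {
            System.out.println(toHex(p));
        }
    }
}
```

For five regions and two-byte keys this yields the four interior boundaries \x33\x33, \x66\x66, \x99\x99 and \xCC\xCC. The fifth value in the list above, \xFF\xFF, is the top of the key space rather than an interior boundary, so the sketch does not return it.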


[jira] [Updated] (HBASE-4597) [book] performance.xml Adding comment about EC2

2011-10-15 Thread Doug Meil (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4597:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> [book] performance.xml Adding comment about EC2
> ---
>
> Key: HBASE-4597
> URL: https://issues.apache.org/jira/browse/HBASE-4597
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Minor
> Attachments: performance_HBASE_4597.xml.patch
>
>
> I added a section under performance reminding people that running HBase on 
> EC2 isn't the same thing as running on a dedicated server.
> This type of question seems to happen fairly often on the dist-list.


[jira] [Updated] (HBASE-4597) [book] performance.xml Adding comment about EC2

2011-10-15 Thread Doug Meil (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4597:
-

Attachment: performance_HBASE_4597.xml.patch

> [book] performance.xml Adding comment about EC2
> ---
>
> Key: HBASE-4597
> URL: https://issues.apache.org/jira/browse/HBASE-4597
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Minor
> Attachments: performance_HBASE_4597.xml.patch
>
>
> I added a section under performance reminding people that running HBase on 
> EC2 isn't the same thing as running on a dedicated server.
> This type of question seems to happen fairly often on the dist-list.


[jira] [Updated] (HBASE-4597) [book] performance.xml Adding comment about EC2

2011-10-15 Thread Doug Meil (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4597:
-

Status: Patch Available  (was: Open)

> [book] performance.xml Adding comment about EC2
> ---
>
> Key: HBASE-4597
> URL: https://issues.apache.org/jira/browse/HBASE-4597
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Minor
> Attachments: performance_HBASE_4597.xml.patch
>
>
> I added a section under performance reminding people that running HBase on 
> EC2 isn't the same thing as running on a dedicated server.
> This type of question seems to happen fairly often on the dist-list.


[jira] [Created] (HBASE-4597) [book] performance.xml Adding comment about EC2

2011-10-15 Thread Doug Meil (Created) (JIRA)

[book] performance.xml Adding comment about EC2
---

 Key: HBASE-4597
 URL: https://issues.apache.org/jira/browse/HBASE-4597
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor


I added a section under performance reminding people that running HBase on EC2 
isn't the same thing as running on a dedicated server.

This type of question seems to happen fairly often on the dist-list.



[jira] [Updated] (HBASE-4596) [book] chapter reordering

2011-10-15 Thread Doug Meil (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4596:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> [book] chapter reordering
> -
>
> Key: HBASE-4596
> URL: https://issues.apache.org/jira/browse/HBASE-4596
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Minor
> Attachments: book_HBASE_4596.xml.patch
>
>
> Since the book grew organically things just kept getting added to the end, 
> whether or not it was the best place for it.
> The first 4 chapters stay the same, the change is aimed at the chapters after 
> "HBase Shell".  I'm pushing the conceptual material up front, keeping the 
> support chapters together, and keeping the Developing HBase at the end.  For 
> example, right after the book introduces the shell, BAM!  Write a MapReduce 
> program!  Even before you know how to create a table, or even what the 
> overall datamodel is.  Etc.
> Before...
> Getting started
> Configuration
> Upgrading
> HBase Shell
> HBase and MapReduce
> HBase and Schema Design
> Metrics
> Cluster Replication
> Data Model
> Architecture
> Performance Tuning
> Troubleshooting
> Building HBase
> Developing HBase
> External APIs
> HBase Operational Mgt
> After...
> Getting started
> Configuration
> Upgrading
> HBase Shell
> Data Model
> HBase and Schema Design
> HBase and MapReduce
> Architecture
> External APIs
> Performance Tuning
> Troubleshooting
> HBase Operational Mgt
> Building and Developing HBase
> (In another Jira this week, Cluster Replication was put under HBase 
> Operational Mgt, Metrics were put under HBase Operational Mgt, and Building 
> HBase was moved under Developing HBase)


[jira] [Updated] (HBASE-4596) [book] chapter reordering

2011-10-15 Thread Doug Meil (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4596:
-

Status: Patch Available  (was: Open)

> [book] chapter reordering
> -
>
> Key: HBASE-4596
> URL: https://issues.apache.org/jira/browse/HBASE-4596
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Minor
> Attachments: book_HBASE_4596.xml.patch
>
>
> Since the book grew organically things just kept getting added to the end, 
> whether or not it was the best place for it.
> The first 4 chapters stay the same, the change is aimed at the chapters after 
> "HBase Shell".  I'm pushing the conceptual material up front, keeping the 
> support chapters together, and keeping the Developing HBase at the end.  For 
> example, right after the book introduces the shell, BAM!  Write a MapReduce 
> program!  Even before you know how to create a table, or even what the 
> overall datamodel is.  Etc.
> Before...
> Getting started
> Configuration
> Upgrading
> HBase Shell
> HBase and MapReduce
> HBase and Schema Design
> Metrics
> Cluster Replication
> Data Model
> Architecture
> Performance Tuning
> Troubleshooting
> Building HBase
> Developing HBase
> External APIs
> HBase Operational Mgt
> After...
> Getting started
> Configuration
> Upgrading
> HBase Shell
> Data Model
> HBase and Schema Design
> HBase and MapReduce
> Architecture
> External APIs
> Performance Tuning
> Troubleshooting
> HBase Operational Mgt
> Building and Developing HBase
> (In another Jira this week, Cluster Replication was put under HBase 
> Operational Mgt, Metrics were put under HBase Operational Mgt, and Building 
> HBase was moved under Developing HBase)


[jira] [Updated] (HBASE-4596) [book] chapter reordering

2011-10-15 Thread Doug Meil (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4596:
-

Attachment: book_HBASE_4596.xml.patch

> [book] chapter reordering
> -
>
> Key: HBASE-4596
> URL: https://issues.apache.org/jira/browse/HBASE-4596
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Minor
> Attachments: book_HBASE_4596.xml.patch
>
>
> Since the book grew organically things just kept getting added to the end, 
> whether or not it was the best place for it.
> The first 4 chapters stay the same, the change is aimed at the chapters after 
> "HBase Shell".  I'm pushing the conceptual material up front, keeping the 
> support chapters together, and keeping the Developing HBase at the end.  For 
> example, right after the book introduces the shell, BAM!  Write a MapReduce 
> program!  Even before you know how to create a table, or even what the 
> overall datamodel is.  Etc.
> Before...
> Getting started
> Configuration
> Upgrading
> HBase Shell
> HBase and MapReduce
> HBase and Schema Design
> Metrics
> Cluster Replication
> Data Model
> Architecture
> Performance Tuning
> Troubleshooting
> Building HBase
> Developing HBase
> External APIs
> HBase Operational Mgt
> After...
> Getting started
> Configuration
> Upgrading
> HBase Shell
> Data Model
> HBase and Schema Design
> HBase and MapReduce
> Architecture
> External APIs
> Performance Tuning
> Troubleshooting
> HBase Operational Mgt
> Building and Developing HBase
> (In another Jira this week, Cluster Replication was put under HBase 
> Operational Mgt, Metrics were put under HBase Operational Mgt, and Building 
> HBase was moved under Developing HBase)


[jira] [Created] (HBASE-4596) [book] chapter reordering

2011-10-15 Thread Doug Meil (Created) (JIRA)

[book] chapter reordering
-

 Key: HBASE-4596
 URL: https://issues.apache.org/jira/browse/HBASE-4596
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor


Since the book grew organically things just kept getting added to the end, 
whether or not it was the best place for it.

The first 4 chapters stay the same, the change is aimed at the chapters after 
"HBase Shell".  I'm pushing the conceptual material up front, keeping the 
support chapters together, and keeping the Developing HBase at the end.  For 
example, right after the book introduces the shell, BAM!  Write a MapReduce 
program!  Even before you know how to create a table, or even what the overall 
datamodel is.  Etc.

Before...

Getting started
Configuration
Upgrading
HBase Shell
HBase and MapReduce
HBase and Schema Design
Metrics
Cluster Replication
Data Model
Architecture
Performance Tuning
Troubleshooting
Building HBase
Developing HBase
External APIs
HBase Operational Mgt

After...

Getting started
Configuration
Upgrading
HBase Shell
Data Model
HBase and Schema Design
HBase and MapReduce
Architecture
External APIs
Performance Tuning
Troubleshooting
HBase Operational Mgt
Building and Developing HBase

(In another Jira this week, Cluster Replication was put under HBase Operational 
Mgt, Metrics were put under HBase Operational Mgt, and Building HBase was moved 
under Developing HBase)



[jira] [Commented] (HBASE-4532) Avoid top row seek by dedicated bloom filter for delete family bloom filter

2011-10-15 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128261#comment-13128261
 ] 

jirapos...@reviews.apache.org commented on HBASE-4532:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2393/
---

(Updated 2011-10-15 19:06:53.079380)


Review request for hbase, Dhruba Borthakur, Michael Stack, Mikhail Bautin, 
Pritam Damania, Prakash Khemani, Amitanand Aiyer, Kannan Muthukkaruppan, Jerry 
Chen, Liyin, Karthik Ranganathan, and Nicolas Spiegelberg.


Summary
---

HBASE-4469 avoids the top row seek operation if row-col bloom filter is 
enabled. 
This jira tries to avoid the top row seek in all cases by creating a dedicated 
bloom filter for delete family markers only.

The only subtle use case is when we are interested in the top row with empty 
column.

For example, 
we are interested in row1/cf1:/1/put.
So we seek to the top row: row1/cf1:/MAX_TS/MAXIMUM. And the delete family 
bloom filter will say there is NO delete family.
Then it will avoid the top row seek and return a fake kv, which is the last kv 
for this row (createLastOnRowCol).
In this way, we have already missed the real kv we are interested in.

The solution for the above problem is to disable this optimization if we are 
trying to GET/SCAN a row with empty column.
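
The missed-kv problem and the proposed guard can be sketched with a toy model. None of this is HBase code: TopRowSeekSketch, Key, and the helper methods are hypothetical, keys carry only row, qualifier and timestamp, and "skipping the top row seek" is modeled as starting the seek from the last possible key of the empty column.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical model of the seek order, not HBase's KeyValue comparator.
final class TopRowSeekSketch {

    static final class Key {
        final String row;
        final String qualifier;
        final long ts;

        Key(String row, String qualifier, long ts) {
            this.row = row;
            this.qualifier = qualifier;
            this.ts = ts;
        }
    }

    // Row ascending, qualifier ascending, timestamp descending (newest first).
    static final Comparator<Key> ORDER = (a, b) -> {
        int c = a.row.compareTo(b.row);
        if (c != 0) return c;
        c = a.qualifier.compareTo(b.qualifier);
        if (c != 0) return c;
        return Long.compare(b.ts, a.ts);
    };

    /** Top of the row: empty qualifier, largest timestamp. */
    static Key firstOnRow(String row) {
        return new Key(row, "", Long.MAX_VALUE);
    }

    /** Fake key that sorts after every real key of (row, qualifier). */
    static Key lastOnRowCol(String row, String qualifier) {
        return new Key(row, qualifier, Long.MIN_VALUE);
    }

    /** The guard: never skip when the request targets the empty column. */
    static boolean maySkipTopRowSeek(boolean bloomSaysNoDeleteFamily, String requestedQualifier) {
        return bloomSaysNoDeleteFamily && !requestedQualifier.isEmpty();
    }

    /** Returns the first stored key at or after the chosen starting point. */
    static Key seek(TreeSet<Key> store, String row, String requestedQualifier,
                    boolean bloomSaysNoDeleteFamily, boolean guarded) {
        boolean skip = guarded
                ? maySkipTopRowSeek(bloomSaysNoDeleteFamily, requestedQualifier)
                : bloomSaysNoDeleteFamily;
        Key start = skip ? lastOnRowCol(row, "") : firstOnRow(row);
        return store.ceiling(start);
    }
}
```

In this model, a store holding row1 with an empty qualifier at ts=1 and row1 with qualifier q at ts=5 loses the empty-qualifier put when the unguarded optimization skips past it; with the guard in place the seek starts at the top of the row and finds it.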

This patch is rebased on 0.89-fb. But it should be the same for apache-trunk as 
well. I will submit the patch for apache-trunk later.


This addresses bug HBASE-4532.
https://issues.apache.org/jira/browse/HBASE-4532


Diffs
-

  src/main/java/org/apache/hadoop/hbase/KeyValue.java 93538bb
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java 9a79a74
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 5d9b518
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java 6cf7cce
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1f78dd4
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 3c34f86
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java 2e1d23a
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java c4b60e9
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java a1d7de5
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java e4dfc2e
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java ebb360c
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 8814812
  src/main/java/org/apache/hadoop/hbase/util/BloomFilterFactory.java fb4f2df
  src/test/java/org/apache/hadoop/hbase/regionserver/TestBlocksRead.java c88b23f
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java 48e9163
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 0eca9b8

Diff: https://reviews.apache.org/r/2393/diff


Testing
---

Running all the unit tests now


Thanks,

Liyin



> Avoid top row seek by dedicated bloom filter for delete family bloom filter
> ---
>
> Key: HBASE-4532
> URL: https://issues.apache.org/jira/browse/HBASE-4532
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
>
> HBASE-4469 avoids the top row seek operation if row-col bloom filter is 
> enabled. 
> This jira tries to avoid top row seek for all the cases by creating a 
> dedicated bloom filter only for delete family
> The only subtle use case is when we are interested in the top row with empty 
> column.
> For example, 
> we are interested in row1/cf1:/1/put.
> So we seek to the top row: row1/cf1:/MAX_TS/MAXIMUM. And the delete family 
> bloom filter will say there is NO delete family.
> Then it will avoid the top row seek and return a fake kv, which is the last 
> kv for this row (createLastOnRowCol).
> In this way, we have already missed the real kv we are interested in.
> The solution for the above problem is to disable this optimization if we are 
> trying to GET/SCAN a row with empty column.


[jira] [Commented] (HBASE-4532) Avoid top row seek by dedicated bloom filter for delete family bloom filter

2011-10-15 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128255#comment-13128255
 ] 

jirapos...@reviews.apache.org commented on HBASE-4532:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2393/
---

(Updated 2011-10-15 18:31:25.852446)


Review request for hbase, Dhruba Borthakur, Michael Stack, Pritam Damania, 
Prakash Khemani, Amitanand Aiyer, Kannan Muthukkaruppan, Jerry Chen, Liyin, 
Karthik Ranganathan, and Nicolas Spiegelberg.


Summary (updated)
---

HBASE-4469 avoids the top row seek operation if row-col bloom filter is 
enabled. 
This jira tries to avoid top row seek for all the cases by creating a dedicated 
bloom filter only for row with empty column.

The previous solution was to create the dedicated bloom filter for delete 
family, which does not work if there is a row with an empty column.
For example, 
we are interested in row1/cf1:/1/put.
So we seek to the top row: row1/cf1:/MAX_TS/MAXIMUM. And the delete family 
bloom filter will say there is NO delete family.
Then it will avoid the top row seek and return a fake kv, which is the last kv 
for this row (createLastOnRowCol).
In this way, we have already missed the real kv we are interested in.

The root cause is that even if there is no delete family at the top row, we 
still cannot avoid the top row seek.
We can ONLY avoid the top row seek when there is no row with an empty column, 
no matter what kind of kv type (delete/deleteCol/deleteFamily/put).
So the current solution is to create the dedicated bloom filter for rows with 
an empty column.
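
A bloom filter keyed on rows that contain any kv with an empty column could look roughly like the sketch below. It is an assumption-laden illustration, not the patch: EmptyColumnBloomSketch and its methods are hypothetical names, and the hashing (FNV-1a combined with String.hashCode via double hashing) is picked only for brevity.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Hypothetical sketch, not the bloom filter code touched by this patch.
final class EmptyColumnBloomSketch {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    EmptyColumnBloomSketch(int size, int hashCount) {
        if (size < 1 || hashCount < 1) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    /** Called on write for any kv type (put or delete) with an empty column. */
    void addRowWithEmptyColumn(String row) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(row, i));
        }
    }

    /** May return a false positive, never a false negative. */
    boolean mightHaveEmptyColumn(String row) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(row, i))) {
                return false;
            }
        }
        return true;
    }

    /** The top row seek is skippable only if the row certainly has no empty column. */
    boolean maySkipTopRowSeek(String row) {
        return !mightHaveEmptyColumn(row);
    }

    // Double hashing: the i-th probe is h1 + i * h2, reduced into [0, size).
    private int index(String row, int i) {
        int h1 = 0x811C9DC5; // FNV-1a offset basis
        for (byte b : row.getBytes(StandardCharsets.UTF_8)) {
            h1 ^= (b & 0xFF);
            h1 *= 0x01000193; // FNV-1a prime
        }
        int h2 = row.hashCode() | 1;
        return Math.floorMod(h1 + i * h2, size);
    }
}
```

Because a bloom filter has no false negatives, a "no" answer makes it safe to skip the top row seek, while a false positive only costs one unnecessary seek.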

This patch is rebased on 0.89-fb. But it should be the same for apache-trunk as 
well. I will submit the patch for apache-trunk later.


This addresses bug HBASE-4532.
https://issues.apache.org/jira/browse/HBASE-4532


Diffs
-

  src/main/java/org/apache/hadoop/hbase/KeyValue.java 93538bb
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java 9a79a74
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 5d9b518
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java 6cf7cce
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1f78dd4
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 3c34f86
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java 2e1d23a
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java c4b60e9
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java a1d7de5
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java e4dfc2e
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java ebb360c
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 8814812
  src/main/java/org/apache/hadoop/hbase/util/BloomFilterFactory.java fb4f2df
  src/test/java/org/apache/hadoop/hbase/regionserver/TestBlocksRead.java c88b23f
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java 48e9163
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 0eca9b8

Diff: https://reviews.apache.org/r/2393/diff


Testing
---

Running all the unit tests now


Thanks,

Liyin



> Avoid top row seek by dedicated bloom filter for delete family bloom filter
> ---
>
> Key: HBASE-4532
> URL: https://issues.apache.org/jira/browse/HBASE-4532
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
>
> HBASE-4469 avoids the top row seek operation if row-col bloom filter is 
> enabled. 
> This jira tries to avoid top row seek for all the cases by creating a 
> dedicated bloom filter only for delete family
> The only subtle use case is when we are interested in the top row with empty 
> column.
> For example, 
> we are interested in row1/cf1:/1/put.
> So we seek to the top row: row1/cf1:/MAX_TS/MAXIMUM. And the delete family 
> bloom filter will say there is NO delete family.
> Then it will avoid the top row seek and return a fake kv, which is the last 
> kv for this row (createLastOnRowCol).
> In this way, we have already missed the real kv we are interested in.
> The solution for the above problem is to disable this optimization if we are 
> trying to GET/SCAN a row with empty column.


[jira] [Commented] (HBASE-4532) Avoid top row seek by dedicated bloom filter for delete family bloom filter

2011-10-15 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128256#comment-13128256
 ] 

jirapos...@reviews.apache.org commented on HBASE-4532:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2393/
---

(Updated 2011-10-15 18:33:14.858552)


Review request for hbase, Dhruba Borthakur, Michael Stack, Pritam Damania, 
Prakash Khemani, Amitanand Aiyer, Kannan Muthukkaruppan, Jerry Chen, Liyin, 
Karthik Ranganathan, and Nicolas Spiegelberg.


Summary (updated)
---

HBASE-4469 avoids the top row seek operation if row-col bloom filter is 
enabled. 
This jira tries to avoid the top row seek in all cases by creating a dedicated 
bloom filter for delete family markers only.

The only subtle use case is when we are interested in the top row with empty 
column.

For example, 
we are interested in row1/cf1:/1/put.
So we seek to the top row: row1/cf1:/MAX_TS/MAXIMUM. And the delete family 
bloom filter will say there is NO delete family.
Then it will avoid the top row seek and return a fake kv, which is the last kv 
for this row (createLastOnRowCol).
In this way, we have already missed the real kv we are interested in.

The solution for the above problem is to disable this optimization if we are 
trying to GET/SCAN a row with empty column.

This patch is rebased on 0.89-fb. But it should be the same for apache-trunk as 
well. I will submit the patch for apache-trunk later.


This addresses bug HBASE-4532.
https://issues.apache.org/jira/browse/HBASE-4532


Diffs
-

  src/main/java/org/apache/hadoop/hbase/KeyValue.java 93538bb
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java 9a79a74
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 5d9b518
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java 6cf7cce
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1f78dd4
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 3c34f86
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java 2e1d23a
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java c4b60e9
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java a1d7de5
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java e4dfc2e
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java ebb360c
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 8814812
  src/main/java/org/apache/hadoop/hbase/util/BloomFilterFactory.java fb4f2df
  src/test/java/org/apache/hadoop/hbase/regionserver/TestBlocksRead.java c88b23f
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java 48e9163
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 0eca9b8

Diff: https://reviews.apache.org/r/2393/diff


Testing
---

Running all the unit tests now


Thanks,

Liyin



> Avoid top row seek by dedicated bloom filter for delete family bloom filter
> ---
>
> Key: HBASE-4532
> URL: https://issues.apache.org/jira/browse/HBASE-4532
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
>
> HBASE-4469 avoids the top row seek operation if row-col bloom filter is 
> enabled. 
> This jira tries to avoid top row seek for all the cases by creating a 
> dedicated bloom filter only for delete family
> The only subtle use case is when we are interested in the top row with empty 
> column.
> For example, 
> we are interested in row1/cf1:/1/put.
> So we seek to the top row: row1/cf1:/MAX_TS/MAXIMUM. And the delete family 
> bloom filter will say there is NO delete family.
> Then it will avoid the top row seek and return a fake kv, which is the last 
> kv for this row (createLastOnRowCol).
> In this way, we have already missed the real kv we are interested in.
> The solution for the above problem is to disable this optimization if we are 
> trying to GET/SCAN a row with empty column.


[jira] [Commented] (HBASE-4532) Avoid top row seek by dedicated bloom filter for delete family bloom filter

2011-10-15 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128253#comment-13128253
 ] 

jirapos...@reviews.apache.org commented on HBASE-4532:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2393/
---

Review request for hbase, Dhruba Borthakur, Michael Stack, Pritam Damania, 
Prakash Khemani, Amitanand Aiyer, Kannan Muthukkaruppan, Jerry Chen, Liyin, 
Karthik Ranganathan, and Nicolas Spiegelberg.


Summary
---

HBASE-4469 avoids the top row seek operation if row-col bloom filter is 
enabled. 
This jira tries to avoid top row seek for all the cases by creating a dedicated 
bloom filter only for row with empty column.

The previous solution was to create a dedicated bloom filter for delete family, 
which does not work if there is a row with an empty column.
For example, 
we are interested in row1/cf1:/1/put.
So we seek to the top row: row1/cf1:/MAX_TS/MAXIMUM. And the delete family 
bloom filter will say there is NO delete family.
Then it will avoid the top row seek and return a fake kv, which is the last kv 
for this row (createLastOnRowCol).
In this way, we have already missed the real kv we are interested in.

The root cause is that even if there is no delete family at the top row, we still 
cannot avoid the top row seek.
We can ONLY avoid the top row seek when there is no row with an empty column, no 
matter what the kv type is (delete/deleteCol/deleteFamily/put).
So the current solution is to create the dedicated bloom filter for rows with an 
empty column.
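
A minimal sketch of the decision described above, in Java. The class and 
method names (TopRowSeekSketch, canSkipTopRowSeek) are hypothetical and are not 
taken from the patch, and a HashSet stands in for the bloom filter. A real 
bloom filter can also report false positives, which would only cost an 
unnecessary seek, never a missed kv.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration; not the actual HBase scanner code. */
final class TopRowSeekSketch {
    // Stand-in for the dedicated bloom filter: rows that hold at least one kv
    // with an empty column qualifier, whatever its type
    // (put/delete/deleteCol/deleteFamily).
    private final Set<String> rowsWithEmptyColumn = new HashSet<>();

    /** Records a kv while the (hypothetical) store file is being written. */
    void add(String row, String qualifier) {
        if (qualifier.isEmpty()) {
            rowsWithEmptyColumn.add(row);
        }
    }

    /** The top row seek may be skipped only if the row has no empty-column kv. */
    boolean canSkipTopRowSeek(String row) {
        return !rowsWithEmptyColumn.contains(row);
    }

    public static void main(String[] args) {
        TopRowSeekSketch sketch = new TopRowSeekSketch();
        sketch.add("row1", "");    // e.g. row1/cf1:/1/put
        sketch.add("row2", "q1");
        System.out.println("row1 skip: " + sketch.canSkipTopRowSeek("row1"));
        System.out.println("row2 skip: " + sketch.canSkipTopRowSeek("row2"));
    }
}
```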

This patch is rebased on 0.89-fb. But it should be the same for apache-trunk as 
well. I will submit the patch for apache-trunk later.


This addresses bug HBASE-4532.
https://issues.apache.org/jira/browse/HBASE-4532


Diffs
-

  src/main/java/org/apache/hadoop/hbase/KeyValue.java 93538bb
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java 9a79a74
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 5d9b518
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java 6cf7cce
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1f78dd4
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 3c34f86
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java 2e1d23a
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java c4b60e9
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java a1d7de5
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java e4dfc2e
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java ebb360c
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 8814812
  src/main/java/org/apache/hadoop/hbase/util/BloomFilterFactory.java fb4f2df
  src/test/java/org/apache/hadoop/hbase/regionserver/TestBlocksRead.java c88b23f
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java 48e9163
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 0eca9b8

Diff: https://reviews.apache.org/r/2393/diff


Testing
---

Running all the unit tests now


Thanks,

Liyin



> Avoid top row seek by dedicated bloom filter for delete family bloom filter
> ---
>
> Key: HBASE-4532
> URL: https://issues.apache.org/jira/browse/HBASE-4532
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
>
> HBASE-4469 avoids the top row seek operation if row-col bloom filter is 
> enabled. 
> This jira tries to avoid top row seek for all the cases by creating a 
> dedicated bloom filter only for delete family
> The only subtle use case is when we are interested in the top row with empty 
> column.
> For example, 
> we are interested in row1/cf1:/1/put.
> So we seek to the top row: row1/cf1:/MAX_TS/MAXIMUM. And the delete family 
> bloom filter will say there is NO delete family.
> Then it will avoid the top row seek and return a fake kv, which is the last 
> kv for this row (createLastOnRowCol).
> In this way, we have already missed the real kv we are interested in.
> The solution for the above problem is to disable this optimization if we are 
> trying to GET/SCAN a row with empty column.


[jira] [Updated] (HBASE-4489) Better key splitting in RegionSplitter

2011-10-15 Thread Ted Yu (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4489:
--

Fix Version/s: 0.90.5

> Better key splitting in RegionSplitter
> --
>
> Key: HBASE-4489
> URL: https://issues.apache.org/jira/browse/HBASE-4489
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.4
>Reporter: Dave Revell
>Assignee: Dave Revell
> Fix For: 0.90.5
>
> Attachments: HBASE-4489-branch0.90-v1.patch, 
> HBASE-4489-branch0.90-v2.patch, HBASE-4489-branch0.90-v3.patch, 
> HBASE-4489-trunk-v1.patch, HBASE-4489-trunk-v2.patch, 
> HBASE-4489-trunk-v3.patch, HBASE-4489-trunk-v4.patch, 
> HBASE-4489-trunk-v5.patch
>
>
> The RegionSplitter utility allows users to create a pre-split table from the 
> command line or do a rolling split on an existing table. It supports 
> pluggable split algorithms that implement the SplitAlgorithm interface. The 
> only/default SplitAlgorithm is one that assumes keys fall in the range from 
> ASCII string "" to ASCII string "7FFF". This is not a sane 
> default, and seems useless to most users. Users are likely to be surprised by 
> the fact that all the region splits occur in the byte range of ASCII 
> characters.
> A better default split algorithm would be one that evenly divides the space 
> of all bytes, which is what this patch does. Making a table with five regions 
> would split at \x33\x33..., \x66\x66, \x99\x99..., \xCC\xCC..., and 
> \xFF\xFF.
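
The even division described in the issue can be sketched as follows. This is 
an illustration only, not the RegionSplitter code from the patch: the class 
name UniformSplitSketch, the fixed key length, and the hex output are all 
assumptions. It computes the interior split points for a given number of 
regions by dividing the all-0xFF key of that length into equal parts; the 
final \xFF\xFF value in the description would then be the upper end of the 
key space.

```java
import java.math.BigInteger;

/** Hypothetical illustration; not the RegionSplitter implementation. */
final class UniformSplitSketch {
    /** Returns the numRegions - 1 interior split points, keyLength bytes each, as hex. */
    static String[] splitPoints(int numRegions, int keyLength) {
        // Largest key of the given length: every byte is 0xFF.
        BigInteger max = BigInteger.ONE.shiftLeft(8 * keyLength).subtract(BigInteger.ONE);
        String[] points = new String[numRegions - 1];
        for (int i = 1; i < numRegions; i++) {
            BigInteger p = max.multiply(BigInteger.valueOf(i))
                              .divide(BigInteger.valueOf(numRegions));
            // Zero-pad so every split point has the same byte length.
            points[i - 1] = String.format("%0" + (2 * keyLength) + "x", p);
        }
        return points;
    }

    public static void main(String[] args) {
        for (String p : splitPoints(5, 2)) {
            System.out.println(p);
        }
    }
}
```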


[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

2011-10-15 Thread Matteo Bertozzi (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128224#comment-13128224
 ] 

Matteo Bertozzi commented on HBASE-3929:


As of 0.92, HFile.main() contains just a call to HFilePrettyPrinter.run(), 
so the "Tool" code is no longer inside HFile.java.

This is probably not the refactoring Todd had in mind, but it addresses Todd's 
first thought:
'we should refactor all of the HFile "Tool" stuff out of HFile into a new 
class.'

> Add option to HFile tool to produce basic stats
> ---
>
> Key: HBASE-3929
> URL: https://issues.apache.org/jira/browse/HBASE-3929
> Project: HBase
>  Issue Type: New Feature
>  Components: io
>Affects Versions: 0.92.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.94.0
>
> Attachments: hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce 
> some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row


[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

2011-10-15 Thread Ted Yu (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128208#comment-13128208
 ] 

Ted Yu commented on HBASE-3929:
---

@Matteo:
Are you going to perform the refactoring Todd mentioned ?

Thanks

> Add option to HFile tool to produce basic stats
> ---
>
> Key: HBASE-3929
> URL: https://issues.apache.org/jira/browse/HBASE-3929
> Project: HBase
>  Issue Type: New Feature
>  Components: io
>Affects Versions: 0.92.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.94.0
>
> Attachments: hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce 
> some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row


[jira] [Assigned] (HBASE-4595) HFilePrettyPrinter Scanned kv count always 0

2011-10-15 Thread Ted Yu (Assigned) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4595:
-

Assignee: Matteo Bertozzi

> HFilePrettyPrinter Scanned kv count always 0
> 
>
> Key: HBASE-4595
> URL: https://issues.apache.org/jira/browse/HBASE-4595
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.92.0, 0.94.0, 0.92.1
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>Priority: Minor
> Attachments: HBASE-4595.patch
>
>
> The "count" variable used to print the "Scanned kv count" is never 
> incremented.
> A local "count" variable in scanKeysValues() method is updated instead.


[jira] [Commented] (HBASE-4595) HFilePrettyPrinter Scanned kv count always 0

2011-10-15 Thread Ted Yu (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128205#comment-13128205
 ] 

Ted Yu commented on HBASE-4595:
---

+1 on patch.

> HFilePrettyPrinter Scanned kv count always 0
> 
>
> Key: HBASE-4595
> URL: https://issues.apache.org/jira/browse/HBASE-4595
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.92.0, 0.94.0, 0.92.1
>Reporter: Matteo Bertozzi
>Priority: Minor
> Attachments: HBASE-4595.patch
>
>
> The "count" variable used to print the "Scanned kv count" is never 
> incremented.
> A local "count" variable in scanKeysValues() method is updated instead.


[jira] [Updated] (HBASE-4595) HFilePrettyPrinter Scanned kv count always 0

2011-10-15 Thread Matteo Bertozzi (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-4595:
---

Status: Patch Available  (was: Open)

> HFilePrettyPrinter Scanned kv count always 0
> 
>
> Key: HBASE-4595
> URL: https://issues.apache.org/jira/browse/HBASE-4595
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.92.0, 0.94.0, 0.92.1
>Reporter: Matteo Bertozzi
>Priority: Minor
> Attachments: HBASE-4595.patch
>
>
> The "count" variable used to print the "Scanned kv count" is never 
> incremented.
> A local "count" variable in scanKeysValues() method is updated instead.


[jira] [Updated] (HBASE-4595) HFilePrettyPrinter Scanned kv count always 0

2011-10-15 Thread Matteo Bertozzi (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-4595:
---

Attachment: HBASE-4595.patch

> HFilePrettyPrinter Scanned kv count always 0
> 
>
> Key: HBASE-4595
> URL: https://issues.apache.org/jira/browse/HBASE-4595
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.92.0, 0.94.0, 0.92.1
>Reporter: Matteo Bertozzi
>Priority: Minor
> Attachments: HBASE-4595.patch
>
>
> The "count" variable used to print the "Scanned kv count" is never 
> incremented.
> A local "count" variable in scanKeysValues() method is updated instead.


[jira] [Updated] (HBASE-3929) Add option to HFile tool to produce basic stats

2011-10-15 Thread Matteo Bertozzi (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-3929:
---

Attachment: hbase-3929-draft.patch

patch updated for hbase trunk

> Add option to HFile tool to produce basic stats
> ---
>
> Key: HBASE-3929
> URL: https://issues.apache.org/jira/browse/HBASE-3929
> Project: HBase
>  Issue Type: New Feature
>  Components: io
>Affects Versions: 0.92.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.94.0
>
> Attachments: hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce 
> some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row


[jira] [Updated] (HBASE-4595) HFilePrettyPrinter Scanned kv count always 0

2011-10-15 Thread Matteo Bertozzi (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-4595:
---

Attachment: (was: HBASE-4595.patch)

> HFilePrettyPrinter Scanned kv count always 0
> 
>
> Key: HBASE-4595
> URL: https://issues.apache.org/jira/browse/HBASE-4595
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.92.0, 0.94.0, 0.92.1
>Reporter: Matteo Bertozzi
>Priority: Minor
> Attachments: HBASE-4595.patch
>
>
> The "count" variable used to print the "Scanned kv count" is never 
> incremented.
> A local "count" variable in scanKeysValues() method is updated instead.


[jira] [Updated] (HBASE-4595) HFilePrettyPrinter Scanned kv count always 0

2011-10-15 Thread Matteo Bertozzi (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-4595:
---

Attachment: HBASE-4595.patch

> HFilePrettyPrinter Scanned kv count always 0
> 
>
> Key: HBASE-4595
> URL: https://issues.apache.org/jira/browse/HBASE-4595
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.92.0, 0.94.0, 0.92.1
>Reporter: Matteo Bertozzi
>Priority: Minor
> Attachments: HBASE-4595.patch
>
>
> The "count" variable used to print the "Scanned kv count" is never 
> incremented.
> A local "count" variable in scanKeysValues() method is updated instead.


[jira] [Created] (HBASE-4595) HFilePrettyPrinter Scanned kv count always 0

2011-10-15 Thread Matteo Bertozzi (Created) (JIRA)

HFilePrettyPrinter Scanned kv count always 0


 Key: HBASE-4595
 URL: https://issues.apache.org/jira/browse/HBASE-4595
 Project: HBase
  Issue Type: Bug
  Components: io
Affects Versions: 0.92.0, 0.94.0, 0.92.1
Reporter: Matteo Bertozzi
Priority: Minor


The "count" variable used to print the "Scanned kv count" is never incremented.
A local "count" variable in scanKeysValues() method is updated instead.
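
The shape of this bug can be sketched as below. This is a hypothetical 
reduction, not the HFilePrettyPrinter source: the class ShadowedCounterSketch 
and its methods are made up for illustration. A local variable named "count" 
shadows the field of the same name, so the field that gets printed stays at 0.

```java
/** Hypothetical illustration of the bug class; not the HFilePrettyPrinter source. */
final class ShadowedCounterSketch {
    private int count; // the value reported as "Scanned kv count"

    /** Buggy shape: the local "count" shadows the field, which is never updated. */
    void scanBuggy(String[] kvs) {
        int count = 0;
        for (String kv : kvs) {
            count++;
        }
    }

    /** Fixed shape: no local declaration, so the field itself is incremented. */
    void scanFixed(String[] kvs) {
        for (String kv : kvs) {
            count++;
        }
    }

    int getCount() {
        return count;
    }

    public static void main(String[] args) {
        String[] kvs = {"kv1", "kv2", "kv3"};
        ShadowedCounterSketch sketch = new ShadowedCounterSketch();
        sketch.scanBuggy(kvs);
        System.out.println("after buggy scan: " + sketch.getCount());
        sketch.scanFixed(kvs);
        System.out.println("after fixed scan: " + sketch.getCount());
    }
}
```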


[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows

2011-10-15 Thread Ted Yu (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128148#comment-13128148
 ] 

Ted Yu commented on HBASE-4536:
---

bq. I was also thinking about an enum for the scan type too. There are seven 
different cases:
Since the cases span multiple dimensions (compaction, keeping deleted rows), an 
enum is not a good candidate.

How about introducing a MatcherConfig so that new parameters can be added 
relatively easily in the future?
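
A rough sketch of what such a config object could look like, purely to 
illustrate the "one field per dimension" idea. Everything here is 
hypothetical: the class MatcherConfigSketch, the ScanKind values, and the 
field names are not from any patch, and no matcher behaviour is implied.

```java
/** Hypothetical illustration; names and fields are assumptions, not HBase API. */
final class MatcherConfigSketch {
    /** One dimension: who initiated the scan. */
    enum ScanKind { USER_SCAN, MINOR_COMPACTION, MAJOR_COMPACTION }

    private final ScanKind scanKind;
    private final boolean keepDeletedCells; // a second, independent dimension

    MatcherConfigSketch(ScanKind scanKind, boolean keepDeletedCells) {
        this.scanKind = scanKind;
        this.keepDeletedCells = keepDeletedCells;
    }

    ScanKind getScanKind() {
        return scanKind;
    }

    boolean isKeepDeletedCells() {
        return keepDeletedCells;
    }

    /** Returns a copy with one dimension changed; a new parameter adds one field and one such method. */
    MatcherConfigSketch withKeepDeletedCells(boolean keep) {
        return new MatcherConfigSketch(scanKind, keep);
    }

    public static void main(String[] args) {
        MatcherConfigSketch base =
            new MatcherConfigSketch(ScanKind.MAJOR_COMPACTION, false);
        MatcherConfigSketch keeping = base.withKeepDeletedCells(true);
        System.out.println(keeping.getScanKind() + " " + keeping.isKeepDeletedCells());
    }
}
```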

> Allow CF to retain deleted rows
> ---
>
> Key: HBASE-4536
> URL: https://issues.apache.org/jira/browse/HBASE-4536
> Project: HBase
>  Issue Type: New Feature
>  Components: regionserver
>Affects Versions: 0.92.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
>
> Parent allows for a cluster to retain rows for a TTL or keep a minimum number 
> of versions.
> However, if a client deletes a row all version older than the delete tomb 
> stone will be remove at the next major compaction (and even at memstore flush 
> - see HBASE-4241).
> There should be a way to retain those version to guard against software error.
> I see two options here:
> 1. Add a new flag HColumnDescriptor. Something like "RETAIN_DELETED".
> 2. Folds this into the parent change. I.e. keep minimum-number-of-versions of 
> versions even past the delete marker.
> #1 would allow for more flexibility. #2 comes somewhat naturally with parent 
> (from a user viewpoint)
> Comments? Any other options?


[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows

2011-10-15 Thread Ted Yu (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128146#comment-13128146
 ] 

Ted Yu commented on HBASE-4536:
---

Continuing discussion started 11/Oct/11 06:45, I think the following check at 
line 321:
{code}
  if (minVersions >= maxVersions) {
{code}
should at least be lifted above the check at line 318:
{code}
  if (timeToLive == HConstants.FOREVER) {
{code}

> Allow CF to retain deleted rows
> ---
>
> Key: HBASE-4536
> URL: https://issues.apache.org/jira/browse/HBASE-4536
> Project: HBase
>  Issue Type: New Feature
>  Components: regionserver
>Affects Versions: 0.92.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
>
> Parent allows for a cluster to retain rows for a TTL or keep a minimum number 
> of versions.
> However, if a client deletes a row all version older than the delete tomb 
> stone will be remove at the next major compaction (and even at memstore flush 
> - see HBASE-4241).
> There should be a way to retain those version to guard against software error.
> I see two options here:
> 1. Add a new flag HColumnDescriptor. Something like "RETAIN_DELETED".
> 2. Folds this into the parent change. I.e. keep minimum-number-of-versions of 
> versions even past the delete marker.
> #1 would allow for more flexibility. #2 comes somewhat naturally with parent 
> (from a user viewpoint)
> Comments? Any other options?


[jira] [Commented] (HBASE-4588) The floating point arithmetic to validate memory allocation configurations need to be done as integers

2011-10-15 Thread Ted Yu (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128141#comment-13128141
 ] 

Ted Yu commented on HBASE-4588:
---

Nice work.
{code}
+" hbase.regionserver.global.memstore.upperLimit=" + gml +
+" hfile.block.cache.size " + bcml);
{code}
I think the original values should be printed above.

Minor: 100 should be defined as a (local) constant and referenced in 
checkForClusterFreeMemoryLimit().
Following the abbreviation example for gml, bcml may be called bcul.
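
For reference, a minimal sketch of the integer-based check with the constant 
pulled out. This is not the code in configVerify1.txt: the class name, the 
method withinLimit(), the constant names, the use of Math.round, and the 80 
percent ceiling (taken from the 0.8 figure in the issue description) are all 
assumptions.

```java
/** Hypothetical illustration; not the patch attached to HBASE-4588. */
final class MemoryLimitCheckSketch {
    private static final int CONVERT_TO_PERCENTAGE = 100;
    // Assumed ceiling, derived from the 0.8 figure in the issue description.
    private static final int MAX_COMBINED_PERCENT = 80;

    /** Compares whole percentages so that 0.2 + 0.6 is treated as exactly 80. */
    static boolean withinLimit(float memstoreUpperLimit, float blockCacheSize) {
        int gml = Math.round(memstoreUpperLimit * CONVERT_TO_PERCENTAGE);
        int bcul = Math.round(blockCacheSize * CONVERT_TO_PERCENTAGE);
        return gml + bcul <= MAX_COMBINED_PERCENT;
    }

    public static void main(String[] args) {
        System.out.println(withinLimit(0.2f, 0.6f));
        System.out.println(withinLimit(0.3f, 0.6f));
    }
}
```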

> The floating point arithmetic to validate memory allocation configurations 
> need to be done as integers
> --
>
> Key: HBASE-4588
> URL: https://issues.apache.org/jira/browse/HBASE-4588
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Jonathan Gray
>Assignee: dhruba borthakur
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: configVerify1.txt
>
>
> The floating point arithmetic to validate memory allocation configurations 
> need to be done as integers.
> On our cluster, we had block cache = 0.6 and memstore = 0.2.  It was saying 
> this was > 0.8 when it is actually equal.
> Minor bug but annoying nonetheless.


[jira] [Commented] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-15 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128140#comment-13128140
 ] 

jirapos...@reviews.apache.org commented on HBASE-4528:
--

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2141/#review2611
---

We're closer.
Thanks for the perseverance, Dhruba.

/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java

We know that w != null here, so w.getWriteNumber() should be passed to 
rollbackMemstore().

/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java

We should have memstoreTS parameter here.

/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java

Typo: succeeded

/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java

Interesting, for next round of discussions.
I think we should take more preventive action here.

/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java

Should read 'keyvals from memstore whose timestamp matches'

/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java

Our memstoreTS should be checked against that of the kv.

/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java

I think this should be in a finally block corresponding to the try at line 
2205.

- Ted

On 2011-10-15 07:32:28, Dhruba Borthakur wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2141/
bq.  ---
bq.  
bq.  (Updated 2011-10-15 07:32:28)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  This changes the multiPut operation so that the sync to the wal occurs outside the rowlock.
bq.  
bq.  This enhancement is done only to HRegion.mut(Put[]) because this is the only method that gets invoked from an application. The HRegion.put(Put) is used only by unit tests and should possibly be deprecated.
bq.  
bq.  
bq.  This addresses bug HBASE-4528.
bq.  https://issues.apache.org/jira/browse/HBASE-4528
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1183585
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueSkipListSet.java 1183585
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1183585
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1183585
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183585
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFlusher.java 1183585
bq.    /src/test/java/org/apache/hadoop/hbase/regionserver/TestParallelPut.java PRE-CREATION
bq.    /src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 1183585
bq.    /src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java 1183585
bq.  
bq.  Diff: https://reviews.apache.org/r/2141/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  I ran TestLogRolling over and over again, about 50 times, and it did not fail a single time.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Dhruba
bq.  
bq.

> The put operation can release the rowlock before sync-ing the Hlog
> --
>
> Key: HBASE-4528
> URL: https://issues.apache.org/jira/browse/HBASE-4528
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Fix For: 0.94.0
>
> Attachments: appendNoSync5.txt, appendNoSyncPut1.txt, 
> appendNoSyncPut2.txt, appendNoSyncPut3.txt, appendNoSyncPut4.txt
>
>
> This allows for better throughput when there are hot rows. A single row 
> update improves from 100 puts/sec/server to 5000 puts/sec/server.


[jira] [Commented] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-15 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128100#comment-13128100
 ] 

jirapos...@reviews.apache.org commented on HBASE-4528:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2141/
---

(Updated 2011-10-15 07:32:28.011678)


Review request for hbase.


Summary
---

This changes the multiPut operation so that the sync to the wal occurs outside 
the rowlock.

This enhancement is done only to HRegion.mut(Put[]) because this is the only 
method that gets invoked from an application. The HRegion.put(Put) is used only 
by unit tests and should possibly be deprecated.


This addresses bug HBASE-4528.
https://issues.apache.org/jira/browse/HBASE-4528


Diffs
-

  /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1183585
  /src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueSkipListSet.java 1183585
  /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1183585
  /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1183585
  /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183585
  /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFlusher.java 1183585
  /src/test/java/org/apache/hadoop/hbase/regionserver/TestParallelPut.java PRE-CREATION
  /src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 1183585
  /src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java 1183585

Diff: https://reviews.apache.org/r/2141/diff


Testing (updated)
---

I ran TestLogRolling over and over again, about 50 times, and it did not fail a 
single time.


Thanks,

Dhruba



> The put operation can release the rowlock before sync-ing the Hlog
> --
>
> Key: HBASE-4528
> URL: https://issues.apache.org/jira/browse/HBASE-4528
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Fix For: 0.94.0
>
> Attachments: appendNoSync5.txt, appendNoSyncPut1.txt, 
> appendNoSyncPut2.txt, appendNoSyncPut3.txt, appendNoSyncPut4.txt
>
>
> This allows for better throughput when there are hot rows. A single row 
> update improves from 100 puts/sec/server to 5000 puts/sec/server.


[jira] [Commented] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-15 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128098#comment-13128098
 ] 

jirapos...@reviews.apache.org commented on HBASE-4528:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2141/
---

(Updated 2011-10-15 07:31:28.693947)


Review request for hbase.


Summary (updated)
---

This changes the multiPut operation so that the sync to the wal occurs outside 
the rowlock.

This enhancement is done only to HRegion.mut(Put[]) because this is the only 
method that gets invoked from an application. The HRegion.put(Put) is used only 
by unit tests and should possibly be deprecated.


This addresses bug HBASE-4528.
https://issues.apache.org/jira/browse/HBASE-4528


Diffs
-

  /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1183585
  /src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueSkipListSet.java 1183585
  /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1183585
  /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1183585
  /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183585
  /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFlusher.java 1183585
  /src/test/java/org/apache/hadoop/hbase/regionserver/TestParallelPut.java PRE-CREATION
  /src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 1183585
  /src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java 1183585

Diff: https://reviews.apache.org/r/2141/diff


Testing
---

I have not yet run the full suite of unit tests.


Thanks,

Dhruba



> The put operation can release the rowlock before sync-ing the Hlog
> --
>
> Key: HBASE-4528
> URL: https://issues.apache.org/jira/browse/HBASE-4528
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Fix For: 0.94.0
>
> Attachments: appendNoSync5.txt, appendNoSyncPut1.txt, 
> appendNoSyncPut2.txt, appendNoSyncPut3.txt, appendNoSyncPut4.txt
>
>
> This allows for better throughput when there are hot rows. A single row 
> update improves from 100 puts/sec/server to 5000 puts/sec/server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-15 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128097#comment-13128097
 ] 

jirapos...@reviews.apache.org commented on HBASE-4528:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2141/
---

(Updated 2011-10-15 07:31:06.389199)


Review request for hbase.


Changes
---

Addressed Kannan's, Ted's, and Gary's review comments. Renamed the method to 
rollbackMemstore. The rollback method now compares memstoreTS before 
deleting the key.
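
One way to read the memstoreTS check is sketched below. This is a toy model 
and not the patch itself: the class RollbackSketch, the add helper, and the 
map from a cell key to its memstoreTS are all assumptions made for 
illustration. Only the method name rollbackMemstore comes from the comment 
above. The idea shown is that a failed put removes its key only if the key 
still carries the memstoreTS that put wrote, so a newer write to the same 
key is left alone.

```java
import java.util.concurrent.ConcurrentSkipListMap;

public class RollbackSketch {
    // Toy memstore: cell key -> memstoreTS of the write that currently owns the key.
    static final ConcurrentSkipListMap<String, Long> memstore =
            new ConcurrentSkipListMap<>();

    // Hypothetical helper: a write installs its memstoreTS under the key.
    static void add(String key, long memstoreTS) {
        memstore.put(key, memstoreTS);
    }

    // Removes the key only if it is still owned by the write being rolled back.
    // ConcurrentMap.remove(key, value) does the compare-and-remove atomically.
    static boolean rollbackMemstore(String key, long memstoreTS) {
        return memstore.remove(key, memstoreTS);
    }

    public static void main(String[] args) {
        add("row1/cf:q", 5L);   // write whose sync later fails
        add("row1/cf:q", 7L);   // a newer write to the same key
        // The stale rollback must not delete the newer write.
        System.out.println(rollbackMemstore("row1/cf:q", 5L));
        System.out.println(memstore);
    }
}
```

Without the comparison, a rollback that arrives after the rowlock was released 
could delete an edit made by a different, successful writer.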


Summary
---

This changes the multiPut operation so that the sync to the WAL occurs outside 
the rowlock.

This enhancement is done only to HRegion.mut(Put[]) because this is the only 
method that gets invoked from an application. The HRegion.put(Put) is used only 
by unit tests and should possibly be deprecated.

I have attached a unit test. I have not yet run all unit tests, but early 
feedback on this patch will be very helpful.


This addresses bug HBASE-4528.
https://issues.apache.org/jira/browse/HBASE-4528


Diffs (updated)
-

  /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1183585
  /src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueSkipListSet.java 1183585
  /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1183585
  /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1183585
  /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183585
  /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFlusher.java 1183585
  /src/test/java/org/apache/hadoop/hbase/regionserver/TestParallelPut.java PRE-CREATION
  /src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 1183585
  /src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java 1183585

Diff: https://reviews.apache.org/r/2141/diff


Testing
---

I have not yet run the full suite of unit tests.


Thanks,

Dhruba




[jira] [Updated] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-15 Thread dhruba borthakur (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4528:


Attachment: appendNoSync5.txt

Addressed Kannan's, Ted's, and Gary's review comments. Renamed the method to 
rollbackMemstore. The rollback method now compares memstoreTS before 
deleting the key.

