-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/574/#review900
-----------------------------------------------------------

Ship it!


OK, this is good. Can you file a JIRA for the other optimization I mentioned?

- Ryan


On 2010-08-13 12:33:23, Pranav Khaitan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://review.cloudera.org/r/574/
> -----------------------------------------------------------
> 
> (Updated 2010-08-13 12:33:23)
> 
> 
> Review request for hbase, stack, Jonathan Gray, Ryan Rawson, Karthik 
> Ranganathan, and Kannan Muthukkaruppan.
> 
> 
> Summary
> -------
> 
> What this patch includes:
> 1. Reseek framework: the ability to reseek to any position after having 
> seeked to some point in the file. Adding this required changes in all 
> scanners.
> 2. The option for any filter to tell the scanner which key it wants to go to 
> next. Filters can be easily customized for different use cases without 
> affecting the main read path. Since filters are optional, they add no 
> overhead for users who do not take advantage of them.
> 3. ColumnPrefixFilter: this filter selects keys whose columns have a 
> specified prefix. It takes advantage of the ability to pass keys to the 
> scanner to tell it which key to seek to next.
> 4. This also makes it possible to seek directly to the required columns using 
> the reseek mechanism (HBASE-2450). However, it still needs to be decided 
> whether that feature should be optional via a filter or added to the read 
> path for everyone. It is not included in this patch since it requires 
> further discussion and testing.
> 5. Small changes to ScanQueryMatcher to return more specific return codes.
> 
> For HFile and reseek, the modifications were made after discussions with 
> Ryan, who also wrote some code for this patch. For ScanQueryMatcher and 
> Filters, discussions were held with Jonathan, Karthik and Kannan.
> 
> This is a big patch, as it touches 21 files. It is important to closely 
> review the reseek functions in HFile, StoreFileScanner, KeyValueHeap and 
> HalfStoreFileReader, as these functions are slightly tricky and will probably 
> be used in many future improvements.
> 
> 
> This addresses bugs HBASE-1517, HBASE-2903 and HBASE-2904.
>     http://issues.apache.org/jira/browse/HBASE-1517
>     http://issues.apache.org/jira/browse/HBASE-2903
>     http://issues.apache.org/jira/browse/HBASE-2904
> 
> 
> Diffs
> -----
> 
>   trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 983321
>   trunk/src/main/java/org/apache/hadoop/hbase/filter/ColumnPrefixFilter.java PRE-CREATION
>   trunk/src/main/java/org/apache/hadoop/hbase/filter/Filter.java 983321
>   trunk/src/main/java/org/apache/hadoop/hbase/filter/FilterBase.java 983321
>   trunk/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java 983321
>   trunk/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java 983321
>   trunk/src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 983321
>   trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 983321
>   trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileScanner.java 983321
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java 983321
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueScanner.java 983321
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 983321
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MinorCompactingStoreScanner.java 983321
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 983321
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 983321
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 983321
>   trunk/src/test/java/org/apache/hadoop/hbase/filter/TestColumnPrefixFilter.java PRE-CREATION
>   trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestReseekTo.java PRE-CREATION
>   trunk/src/test/java/org/apache/hadoop/hbase/regionserver/KeyValueScanFixture.java 983321
>   trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeyValueHeap.java 983321
>   trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 983321
> 
> 
> Diff: http://review.cloudera.org/r/574/diff
> 
> 
> Testing
> -------
> 
> Added tests at the HFileScanner and Filter/RegionScanner levels. These tests 
> take very little time to run. All existing tests pass. Performance 
> benchmarking was done, and significant performance gains can be seen for the 
> corresponding use cases.
> 
> 
> Thanks,
> 
> Pranav
> 
>
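
Items 1 to 3 of the quoted summary (reseek, a filter telling the scanner
where to go next, and a column prefix filter) can be illustrated with a
small Java sketch. This is not the code in the patch. The names
ReseekSketch, SortedScanner, PrefixFilter, Code and hint() are
hypothetical, keys are plain strings instead of KeyValues, and reseek is
assumed to move forward only from the current position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; none of these types are the HBase API. */
final class ReseekSketch {

    /** Forward-only scanner over a list of keys sorted in ascending order. */
    static final class SortedScanner {
        private final List<String> keys;
        private int pos = 0;

        SortedScanner(List<String> sortedKeys) { this.keys = sortedKeys; }

        /** Current key, or null when the scanner is exhausted. */
        String peek() { return pos < keys.size() ? keys.get(pos) : null; }

        void next() { if (pos < keys.size()) pos++; }

        /**
         * Assumed reseek semantics: binary search from the current position
         * for the first key >= target. The scanner never moves backward.
         */
        void reseek(String target) {
            int lo = pos, hi = keys.size();
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (keys.get(mid).compareTo(target) < 0) lo = mid + 1; else hi = mid;
            }
            pos = lo;
        }
    }

    /** Hypothetical return codes a filter can hand back to the scan loop. */
    enum Code { INCLUDE, SEEK_TO_HINT, DONE }

    /** Selects keys with a given prefix and hints where to seek next. */
    static final class PrefixFilter {
        private final String prefix;

        PrefixFilter(String prefix) { this.prefix = prefix; }

        Code filter(String column) {
            if (column.startsWith(prefix)) return Code.INCLUDE;
            // Sorts before the prefix: ask the scanner to jump ahead.
            if (column.compareTo(prefix) < 0) return Code.SEEK_TO_HINT;
            // Sorts after every key with this prefix: nothing left to find.
            return Code.DONE;
        }

        /** The key the scanner should reseek to after SEEK_TO_HINT. */
        String hint() { return prefix; }
    }

    /** Scan loop that obeys the filter's return code instead of calling next() blindly. */
    static List<String> scan(List<String> sortedColumns, PrefixFilter filter) {
        SortedScanner scanner = new SortedScanner(sortedColumns);
        List<String> out = new ArrayList<>();
        String current;
        while ((current = scanner.peek()) != null) {
            switch (filter.filter(current)) {
                case INCLUDE: out.add(current); scanner.next(); break;
                case SEEK_TO_HINT: scanner.reseek(filter.hint()); break;
                case DONE: return out;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("a1", "a2", "b1", "b2", "c1");
        System.out.println(scan(columns, new PrefixFilter("b"))); // prints [b1, b2]
    }
}
```

In this sketch the filter is consulted for a1, b1, b2 and c1 only: the
hint lets the scanner jump past a2, and the DONE code ends the scan at c1
instead of reading to the end of the list.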
