[jira] Commented: (HBASE-2517) During reads when passed the specified time range, seek to next column

HBase Review Board (JIRA) Thu, 15 Jul 2010 01:50:21 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888750#action_12888750
 ]


HBase Review Board commented on HBASE-2517:
-------------------------------------------

Message from: "Pranav Khaitan" <[email protected]>

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/323/
-----------------------------------------------------------

Review request for hbase, Jonathan Gray, Karthik Ranganathan, and Kannan 
Muthukkaruppan.


Summary
-------

This patch addresses the following issues:

1. Once it has read the required timestamps, the QueryMatcher should return 
NEXT_COL so that it doesn't keep reading every KV until the end of the 
column. 

2. Before returning NEXT_COL, it also checks if any further columns are 
required. If no columns are required, then it returns NEXT_ROW instead of 
returning NEXT_COL. This saves significant time and another round of iteration.

3. Before seeking to NEXT_ROW, we check if we are already on the last row. If 
we are on the last row, then we can return false. This avoids one more call to 
next() and saves time.

4. Provides useful input for HBASE-2450 and HBASE-1517, which can take 
advantage of these return codes.

5. Optimizes Get queries with only one column.

6. Fixes a bug that occurred when versions were processed before filters were 
applied.

7. If we know (using filters/timestamps) that we don't need any more keys for a 
particular column, then there should be a mechanism to send this information to 
ExplicitColumnTracker.
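
Points 1 to 3 above can be sketched roughly as follows. This is only an 
illustration of the decision order, not code from the patch: the enum 
MatchCodeSketch, the class SeekHintSketch, the method afterTimestampsDone, and 
its two boolean parameters are all hypothetical names, and DONE stands in for 
"return false" from point 3.

```java
// Hypothetical sketch of the decision described in points 1-3.
// None of these names come from the HBase source tree.
enum MatchCodeSketch { NEXT_COL, NEXT_ROW, DONE }

final class SeekHintSketch {
    private SeekHintSketch() {}

    /**
     * Called once all required timestamps of the current column are read.
     *
     * @param moreColumnsRequired true if the query still needs later columns
     * @param onLastRow           true if the scan is already on its last row
     */
    static MatchCodeSketch afterTimestampsDone(boolean moreColumnsRequired,
                                               boolean onLastRow) {
        if (moreColumnsRequired) {
            // Point 1: skip the rest of this column instead of reading every KV.
            return MatchCodeSketch.NEXT_COL;
        }
        if (onLastRow) {
            // Point 3: nothing left to seek to, so the caller can stop here.
            return MatchCodeSketch.DONE;
        }
        // Point 2: no further columns needed in this row, move to the next row.
        return MatchCodeSketch.NEXT_ROW;
    }
}
```

For example, afterTimestampsDone(false, true) yields DONE in this sketch, which 
corresponds to skipping the extra call to next().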


This addresses bug HBASE-2517.
    http://issues.apache.org/jira/browse/HBASE-2517


Diffs
-----

  trunk/src/main/java/org/apache/hadoop/hbase/io/TimeRange.java 963961
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 963961
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java 963961
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 963961
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 963961
  trunk/src/test/java/org/apache/hadoop/hbase/client/TestMultipleTimestamps.java 963961

Diff: http://review.hbase.org/r/323/diff


Testing
-------

Existing tests run successfully with some of them going through the modified 
code path. Added specialized unit tests for this purpose. Did manual debugging 
to see if the optimization is being done and correct match codes are being 
returned. 


Thanks,

Pranav




> During reads when passed the specified time range, seek to next column
> ----------------------------------------------------------------------
>
>                 Key: HBASE-2517
>                 URL: https://issues.apache.org/jira/browse/HBASE-2517
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Jonathan Gray
>            Assignee: Pranav Khaitan
>             Fix For: 0.90.0
>
>
> When we are processing the stream of KeyValues in the ScanQueryMatcher, we 
> will check the timestamp of the current KV against the specific TimeRange.  
> Currently we only check if it is in the range or not, returning SKIP if 
> outside the range or continuing to other checks if within the range.
> The check should actually return SKIP if the stamp is greater than the 
> TimeRange and NEXT_COL if the stamp is less than the TimeRange (we know we 
> won't take any more KVs from the current column once we are below the 
> TimeRange).
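
The check described in the issue can be sketched as a three-way comparison. 
This is a minimal illustration under stated assumptions, not the actual 
ScanQueryMatcher or TimeRange code: TimestampCheck, TimeRangeSketch, and 
check are hypothetical names, the range is assumed to be half-open 
[minStamp, maxStamp), and versions within a column are assumed to arrive 
newest first.

```java
// Hypothetical sketch; names are illustrative and not taken from HBase.
enum TimestampCheck { SKIP, IN_RANGE, NEXT_COL }

final class TimeRangeSketch {
    private TimeRangeSketch() {}

    /**
     * Compares a KV timestamp against an assumed half-open range
     * [minStamp, maxStamp).
     */
    static TimestampCheck check(long timestamp, long minStamp, long maxStamp) {
        if (timestamp >= maxStamp) {
            // Too new: older versions of this column may still fall in range.
            return TimestampCheck.SKIP;
        }
        if (timestamp < minStamp) {
            // Too old: every remaining version in this column is older still.
            return TimestampCheck.NEXT_COL;
        }
        // Within range: continue with the remaining checks.
        return TimestampCheck.IN_RANGE;
    }
}
```

The NEXT_COL branch is only safe because of the newest-first ordering 
assumption; with any other ordering, a later KV could still be in range.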

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
