[
https://issues.apache.org/jira/browse/HBASE-613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jim Kellerman updated HBASE-613:
--------------------------------
Attachment: 613.patch
HAbstractScanner
- remove HAbstractScanner.iterator() - iterator is not a method on
InternalScanner
HRegion
- make getScanner more efficient by iterating only once to find the stores we
need to scan
- only pass columns relevant to a store to a HStoreScanner
- remove HScanner.iterator() - iterator is not a method on InternalScanner
MemcacheScanner
- never return HConstants.LATEST_TIMESTAMP as the timestamp value for a row.
Instead use the largest timestamp from the cells being returned. This allows a
scanner to determine a timestamp that can be used to fetch the same data again
should new versions be inserted later.
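The timestamp choice above can be sketched as follows. This is an illustrative fragment only, not the actual MemcacheScanner code: the class and method names (RowTimestamp, chooseRowTimestamp) are invented for the example, and the real scanner operates on cells rather than a bare list of longs.

```java
import java.util.List;

// Hypothetical sketch: report the largest concrete cell timestamp as the
// row's timestamp, never the LATEST_TIMESTAMP sentinel, so a caller can
// re-fetch the same data even if newer versions are inserted later.
class RowTimestamp {
    // Sentinel meaning "most recent version", as in HConstants.LATEST_TIMESTAMP.
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    // Return the largest timestamp among the cells returned for a row.
    static long chooseRowTimestamp(List<Long> cellTimestamps) {
        long max = Long.MIN_VALUE;
        for (long ts : cellTimestamps) {
            max = Math.max(max, ts);
        }
        return max; // a real timestamp, not the sentinel
    }
}
```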
StoreFileScanner
- getNextViableRow would find a row that matched the row key but did not
consider the requested timestamp. Now, if the row it finds has a timestamp
greater than the one requested, it advances to determine whether a version
with a timestamp less than or equal to the requested one exists, since
timestamps are sorted in descending order.
- removed an unnecessary else
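The getNextViableRow fix amounts to skipping versions that are too new. A minimal sketch, with invented names (ViableRowFinder, findViableVersion) standing in for the real StoreFileScanner logic, which walks HStoreKeys rather than a timestamp array:

```java
// Hypothetical sketch: versions of a cell are stored with timestamps sorted
// descending, so after matching the row key we must keep advancing past
// versions newer than the requested timestamp instead of giving up.
class ViableRowFinder {
    // Returns the index of the first version with timestamp <= requested,
    // or -1 if every stored version is newer than the requested timestamp.
    static int findViableVersion(long[] descendingTimestamps, long requested) {
        for (int i = 0; i < descendingTimestamps.length; i++) {
            if (descendingTimestamps[i] <= requested) {
                return i;
            }
        }
        return -1;
    }
}
```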
Timestamp
- The program that was used to find the problem and test the fix.
TestScanMultipleVersions
- Test program that fails on current trunk but passes when this patch is
applied.
NOTE: TestHRegionServerExit failed on both Windows and Linux, but
TestRegionRebalancing passed on Linux and failed on Windows.
All other tests passed, and when I ran TestScanMultipleVersions against
unpatched trunk, it failed.
Please review.
> Timestamp-anchored scanning fails to find all records
> -----------------------------------------------------
>
> Key: HBASE-613
> URL: https://issues.apache.org/jira/browse/HBASE-613
> Project: Hadoop HBase
> Issue Type: Bug
> Components: client
> Reporter: stack
> Assignee: Jim Kellerman
> Fix For: 0.2.0
>
> Attachments: 613.patch, nogood.patch, TestTimestampScanning.java,
> Timestamp.patch
>
>
> If I add 3 versions of a cell and then scan across the first set of added
> cells using a timestamp that should only get values from the first upload, a
> bunch are missing (I added 100k on each of the three uploads). I thought it
> was the fact that we set the number of cells found back to 1 in HStore when
> we move off the current row/column, but that doesn't seem to be it. I also
> tried upping the MAX_VERSIONS on my table, and that seemed to have no
> effect. Need to look closer.
> Build a unit test, because replicating on a cluster takes too much time.
--
This message is automatically generated by JIRA.