[ 
https://issues.apache.org/jira/browse/PHOENIX-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102990#comment-15102990
 ] 

Lars Hofhansl commented on PHOENIX-2446:
----------------------------------------

Chatted with James a bit. So to recap, the problem is the HBase MVCC 
transactions in flight before the index was created; those would be missed since 
they did not exist when the index was created and are also not yet seen by a 
more or less parallel upsert select statement, right?

A flush won't help. 

The only thing that I can see would help is to await at least one MVCC 
transaction on all region servers. That is annoying, but doable.

I.e., calling {{mvcc.completeMemstoreInsert(mvcc.beginMemstoreInsert());}}, which 
will force all prior MVCC transactions - if any - to return.
{{mvcc}} can be retrieved from the region interface with the getMVCC() method.
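
To illustrate why a begin/complete pair acts as a barrier, here is a minimal, self-contained sketch. It is a toy model, not HBase's MVCC class: {{ToyMvcc}}, {{awaitPriorTransactions}} and {{barrierWaitsForEarlierWrite}} are hypothetical names, and only the two method names from the call above are reused. The assumption it encodes is that completing a write blocks until the read point has advanced past it, which can only happen once every earlier write has completed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of an MVCC write queue. This is NOT HBase's class. */
final class ToyMvcc {
    /** One in-flight write, identified by its write number. */
    static final class WriteEntry {
        final long writeNumber;
        boolean completed;
        WriteEntry(long writeNumber) { this.writeNumber = writeNumber; }
    }

    private final Deque<WriteEntry> inFlight = new ArrayDeque<>();
    private long nextWriteNumber = 1;
    private long readPoint = 0; // highest write number visible to readers

    /** Registers a new write at the tail of the queue. */
    synchronized WriteEntry beginMemstoreInsert() {
        WriteEntry e = new WriteEntry(nextWriteNumber++);
        inFlight.addLast(e);
        return e;
    }

    /** Marks e complete, then blocks until the read point has reached it. */
    synchronized void completeMemstoreInsert(WriteEntry e) {
        e.completed = true;
        // The read point only advances over a fully completed prefix.
        while (!inFlight.isEmpty() && inFlight.peekFirst().completed) {
            readPoint = inFlight.removeFirst().writeNumber;
        }
        notifyAll();
        while (readPoint < e.writeNumber) {
            try {
                wait();
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", ie);
            }
        }
    }

    synchronized long getReadPoint() { return readPoint; }

    /** The barrier from the comment: begin a no-op write and complete it. */
    void awaitPriorTransactions() {
        completeMemstoreInsert(beginMemstoreInsert());
    }

    /** Returns true if the barrier blocked until an earlier write completed. */
    static boolean barrierWaitsForEarlierWrite() {
        ToyMvcc mvcc = new ToyMvcc();
        WriteEntry earlier = mvcc.beginMemstoreInsert(); // still in flight
        Thread barrier = new Thread(mvcc::awaitPriorTransactions);
        barrier.start();
        try {
            barrier.join(200); // give the barrier a chance to (wrongly) finish
            boolean blocked = barrier.isAlive();
            mvcc.completeMemstoreInsert(earlier);
            barrier.join(5000);
            return blocked && !barrier.isAlive() && mvcc.getReadPoint() == 2;
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("barrier waited: " + barrierWaitsForEarlierWrite());
    }
}
```

In the real system this call would have to run once per region on every region server, and the exact class and method names depend on the HBase version, so treat the above only as a picture of the ordering guarantee.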

(As an aside, I'd add that upsert select won't scale to the kind of data size 
where HBase/Phoenix would actually be interesting. For less than maybe a few 
100m rows, one should use Postgres or equivalent.)


> Immutable index - Index vs base table row count does not match when index is 
> created during data load
> -----------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2446
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2446
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.6.0
>            Reporter: Mujtaba Chohan
>            Assignee: Thomas D'Silva
>             Fix For: 4.7.0
>
>         Attachments: PHOENIX-2446-wip.patch, PHOENIX-2446.patch, server.log
>
>
> I'll add more details later but here's the scenario that consistently 
> produces wrong row count for index table vs base table for immutable async 
> index.
> 1. Start data upsert
> 2. Create async index
> 3. Trigger M/R index build
> 4. Keep data upsert going in background during step 2,3 and a while after M/R 
> index finishes.
> 5. End data upsert. 
> Now count with index enabled vs count with hint to not use index is off by a 
> large factor. Will get a cleaner repro for this issue soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
