[ 
https://issues.apache.org/jira/browse/PHOENIX-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098964#comment-15098964
 ] 

Thomas D'Silva commented on PHOENIX-2446:
-----------------------------------------

I don't think a flush guarantees that pending batches will be put in the 
memstore; I think it only flushes whatever is already in the memstore. When I 
tested running a flush before the UPSERT SELECT, whether the UPSERT SELECT 
could see the rows depended on how long the flush took. If I initiate the 
index build after only a few batches of rows (<5000) have been sent, the flush 
completes quickly and the UPSERT SELECT is not able to see the rows. If I kick 
off the create index after ~300000 rows have been sent, the flush takes longer 
and the UPSERT SELECT is able to see the rows. 
A one-second sleep solves the issue when testing manually. 
I can't think of a way to detect that all batches of rows sent before 
incremental index maintenance was enabled are in the memstore before we run 
the index population UPSERT SELECT.
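
To make the suspected race concrete, here is a minimal toy model in Java. It is 
only a sketch of the hypothesis above, not HBase or Phoenix code: ToyRegion, 
send, applyNext, flush and visibleRows are all made-up names, and the 
assumption that a flush ignores batches still in flight is exactly the part I 
have not confirmed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model only: none of these names are HBase or Phoenix APIs.
class ToyRegion {
    // Batches the client has sent but the server has not applied yet.
    private final Deque<List<String>> inFlight = new ArrayDeque<>();
    // Rows applied on the server and held in memory.
    private final List<String> memstore = new ArrayList<>();
    // Rows already written out by a flush.
    private final List<String> storeFiles = new ArrayList<>();

    // Client side: the batch is sent, but not yet visible to scans.
    void send(List<String> batch) {
        inFlight.addLast(new ArrayList<>(batch));
    }

    // Server side: apply the oldest pending batch to the memstore.
    boolean applyNext() {
        List<String> batch = inFlight.pollFirst();
        if (batch == null) {
            return false;
        }
        memstore.addAll(batch);
        return true;
    }

    // Assumption being modelled: a flush only moves what is already in the
    // memstore and does not wait for in-flight batches.
    void flush() {
        storeFiles.addAll(memstore);
        memstore.clear();
    }

    // What a scan (such as the one behind UPSERT SELECT) could see.
    int visibleRows() {
        return memstore.size() + storeFiles.size();
    }

    int pendingBatches() {
        return inFlight.size();
    }
}

class FlushRaceDemo {
    public static void main(String[] args) {
        ToyRegion region = new ToyRegion();
        region.send(Arrays.asList("row1", "row2"));

        // Flush finishes before the pending batch is applied.
        region.flush();
        System.out.println("visible after flush: " + region.visibleRows()); // 0

        // Once the batch lands, the rows become visible.
        region.applyNext();
        System.out.println("visible after apply: " + region.visibleRows()); // 2
    }
}
```

In this model a short flush returns while the batch is still pending, so the 
scan sees nothing; a longer flush (or a sleep) simply gives applyNext time to 
run first, which would match what I saw when testing manually.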

> Immutable index - Index vs base table row count does not match when index is 
> created during data load
> -----------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2446
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2446
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.6.0
>            Reporter: Mujtaba Chohan
>            Assignee: Thomas D'Silva
>             Fix For: 4.7.0
>
>         Attachments: PHOENIX-2446-wip.patch, PHOENIX-2446.patch, server.log
>
>
> I'll add more details later but here's the scenario that consistently 
> produces wrong row count for index table vs base table for immutable async 
> index.
> 1. Start data upsert
> 2. Create async index
> 3. Trigger M/R index build
> 4. Keep data upsert going in background during steps 2 and 3 and for a while 
> after the M/R index build finishes.
> 5. End data upsert. 
> Now count with index enabled vs count with hint to not use index is off by a 
> large factor. Will get a cleaner repro for this issue soon.
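
The count comparison described in the report could be scripted roughly as 
follows. This is a sketch under assumptions: the JDBC URL and the table name 
MY_TABLE are placeholders, the Phoenix JDBC driver is assumed to be on the 
classpath, and it assumes the optimizer picks the index for the unhinted 
count. NO_INDEX is the Phoenix hint that forces the query to the base table.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class IndexCountCheck {
    // Builds a COUNT(*) query; when useIndex is false the NO_INDEX hint
    // asks Phoenix to read the base table instead of the index.
    static String countQuery(String table, boolean useIndex) {
        String hint = useIndex ? "" : "/*+ NO_INDEX */ ";
        return "SELECT " + hint + "COUNT(*) FROM " + table;
    }

    // Runs a single-value count query and returns the result.
    static long count(Connection conn, String sql) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            rs.next();
            return rs.getLong(1);
        }
    }

    public static void main(String[] args) throws SQLException {
        String url = "jdbc:phoenix:localhost"; // placeholder connection URL
        String table = "MY_TABLE";             // placeholder table name
        try (Connection conn = DriverManager.getConnection(url)) {
            long withIndex = count(conn, countQuery(table, true));
            long baseTable = count(conn, countQuery(table, false));
            System.out.println("index-enabled count: " + withIndex);
            System.out.println("base table count:    " + baseTable);
            System.out.println("difference:          " + (baseTable - withIndex));
        }
    }
}
```

Running this after step 5 should print a non-zero difference whenever the 
index has missed rows.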



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
