[
https://issues.apache.org/jira/browse/PHOENIX-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098964#comment-15098964
]
Thomas D'Silva commented on PHOENIX-2446:
-----------------------------------------
I don't think a flush guarantees that pending batches will be put in the
memstore; I think it only flushes whatever is already in the memstore. When I
tested issuing a flush before running the UPSERT SELECT, whether the UPSERT
SELECT saw the rows depended on how long the flush took. If I initiate the
index build after only a few batches of rows (<5000) have been sent, the flush
completes quickly and the UPSERT SELECT is not able to see the rows. If I kick
off the create index after ~300000 rows have been sent, the flush takes longer
and the UPSERT SELECT is able to see the rows.
A one-second sleep solves the issue when testing manually.
I can't think of a way to detect that all batches of rows sent before
incremental index maintenance was enabled are in the memstore before we run
the index population UPSERT SELECT.
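To make the timing window concrete, here is a toy model of the race in Python.
It is only a sketch under stated assumptions, not Phoenix code: the names
Batch, make_batches and index_row_count are hypothetical, times are made-up
milliseconds, and the model assumes a batch is indexed either because it was
sent after index maintenance was enabled or because its rows had already
landed in the memstore when the population scan ran.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Batch:
    sent_at: int     # ms: when the client sent the batch
    applied_at: int  # ms: when its rows became visible in the memstore
    rows: int


def make_batches(n, interval_ms, latency_ms, rows):
    """Hypothetical steady load: one batch every interval_ms, fixed latency."""
    return [Batch(i * interval_ms, i * interval_ms + latency_ms, rows)
            for i in range(n)]


def index_row_count(batches, enabled_at, scan_at):
    """Rows that reach the index under this toy model."""
    indexed = 0
    for b in batches:
        if b.sent_at >= enabled_at:
            # Sent after maintenance was enabled: index rows written with it.
            indexed += b.rows
        elif b.applied_at <= scan_at:
            # Sent earlier, but already visible to the population scan.
            indexed += b.rows
        # Otherwise: sent before maintenance was enabled and applied after
        # the scan, so neither path indexes these rows.
    return indexed


batches = make_batches(n=10, interval_ms=100, latency_ms=500, rows=100)

# Population scan runs as soon as maintenance is enabled (a quick flush).
print(index_row_count(batches, enabled_at=450, scan_at=450))   # 500

# Population scan runs one second later (a slow flush, or a sleep).
print(index_row_count(batches, enabled_at=450, scan_at=1450))  # 1000
```

In this model the delay only helps because it happens to exceed the batch
latency; nothing tells the scan that every earlier batch has actually been
applied, which is the missing signal described above.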
> Immutable index - Index vs base table row count does not match when index is
> created during data load
> -----------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-2446
> URL: https://issues.apache.org/jira/browse/PHOENIX-2446
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.6.0
> Reporter: Mujtaba Chohan
> Assignee: Thomas D'Silva
> Fix For: 4.7.0
>
> Attachments: PHOENIX-2446-wip.patch, PHOENIX-2446.patch, server.log
>
>
> I'll add more details later, but here's the scenario that consistently
> produces a wrong row count for the index table vs the base table for an
> immutable async index.
> 1. Start data upsert
> 2. Create async index
> 3. Trigger M/R index build
> 4. Keep data upsert going in the background during steps 2 and 3, and for a
> while after the M/R index build finishes.
> 5. End data upsert.
> Now count with index enabled vs count with hint to not use index is off by a
> large factor. Will get a cleaner repro for this issue soon.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)