[ 
https://issues.apache.org/jira/browse/PHOENIX-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geoffrey Jacoby reassigned PHOENIX-5027:
----------------------------------------

    Assignee: Kadir OZDEMIR  (was: Geoffrey Jacoby)

> PhoenixIndexImportDirectMapper retried mappers can succeed without inserting 
> all index data
> -------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5027
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5027
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Geoffrey Jacoby
>            Assignee: Kadir OZDEMIR
>            Priority: Major
>
> On two recent occasions I've rebuilt a large global immutable index by doing 
> a DROP/CREATE and ended up with missing index data, though it doesn't happen 
> every time. Here's what happened:
> 1. PhoenixMRJobSubmitter correctly detects that the index rebuild is necessary 
> and invokes IndexTool.
> 2. IndexTool enqueues a MapReduce job using PhoenixIndexImportDirectMapper.
> 3. Some mappers fail because of timeouts due to heavy splitting on the new 
> index table
> 4. Those mappers are retried and succeed. The MR job as a whole completes 
> successfully.
> 5. RowCounter and IndexScrutinyTool show that millions of rows are missing from 
> the index, with keys that imply they belonged to the failed mappers.
> Aside from the timestamp glitch I pointed out in PHOENIX-5018, the code in 
> PhoenixIndexImportDirectMapper _looks_ idempotent on a rerun, so I've been 
> struggling to find the cause of the missing index data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
