[ 
https://issues.apache.org/jira/browse/PHOENIX-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091222#comment-15091222
 ] 

James Taylor commented on PHOENIX-2446:
---------------------------------------

Yes, the SCN of the index population would be after the timestamp of the 
UPSERT SELECT, but the UPSERT SELECT may still be running when the CREATE 
INDEX is issued. That's the root of the problem: the initial index population 
may not see all the table rows that are earlier than its own timestamp. The 
fix I propose will essentially turn on incremental index population for 
in-flight statements that may still be adding data table rows when the CREATE 
INDEX statement is issued.

One more small change, which I didn't mention before, will also be necessary. 
When auto commit is on, we call updateCache for UPSERT VALUES from 
FromCompiler here:
{code}
        protected TableRef createTableRef(NamedTableNode tableNode, boolean updateCacheImmediately) throws SQLException {
            String tableName = tableNode.getName().getTableName();
            String schemaName = tableNode.getName().getSchemaName();
            long timeStamp = QueryConstants.UNSET_TIMESTAMP;
            String fullTableName = SchemaUtil.getTableName(schemaName, tableName);
            PName tenantId = connection.getTenantId();
            PTable theTable = null;
            if (updateCacheImmediately || connection.getAutoCommit()) {
                MetaDataMutationResult result = client.updateCache(schemaName, tableName);
{code}
Instead, we should always let the updateCache call be done in 
MutationState.validate(), so the following if statement should be removed:
{code}
        private long validate(TableRef tableRef, Map<ImmutableBytesPtr, RowMutationState> rowKeyToColumnMap) throws SQLException {
            Long scn = connection.getSCN();
            MetaDataClient client = new MetaDataClient(connection);
            long serverTimeStamp = tableRef.getTimeStamp();
            // If we're auto committing, we've already validated the schema when we got the ColumnResolver,
            // so no need to do it again here.
            if (!connection.getAutoCommit()) {
{code}
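
To make the intent concrete, here is a minimal, self-contained sketch of why 
refreshing metadata at commit time matters. It is a toy model, not Phoenix 
code: Catalog, Upsert, and run are hypothetical names invented for the 
illustration, and the "index" is just a second list. It assumes that a 
statement which resolved its metadata before CREATE INDEX will only write 
index rows if it re-checks the metadata when it commits.

```java
import java.util.ArrayList;
import java.util.List;

class CommitTimeValidationSketch {

    /** Hypothetical stand-in for the server-side table metadata. */
    static class Catalog {
        private boolean indexExists = false;

        synchronized void createIndex() { indexExists = true; }

        synchronized boolean hasIndex() { return indexExists; }
    }

    /** Hypothetical stand-in for a long-running upsert statement. */
    static class Upsert {
        private final Catalog catalog;
        private final boolean revalidateOnCommit;
        // Metadata snapshot taken when the statement is compiled.
        private final boolean indexSeenAtCompile;

        Upsert(Catalog catalog, boolean revalidateOnCommit) {
            this.catalog = catalog;
            this.revalidateOnCommit = revalidateOnCommit;
            this.indexSeenAtCompile = catalog.hasIndex();
        }

        void commit(String row, List<String> dataTable, List<String> indexTable) {
            // Either trust the compile-time snapshot or refresh it at commit time.
            boolean maintainIndex = revalidateOnCommit ? catalog.hasIndex() : indexSeenAtCompile;
            dataTable.add(row);
            if (maintainIndex) {
                indexTable.add(row);
            }
        }
    }

    /** Returns {data row count, index row count} for one simulated run. */
    static int[] run(boolean revalidateOnCommit) {
        Catalog catalog = new Catalog();
        List<String> data = new ArrayList<>();
        List<String> index = new ArrayList<>();

        // The statement is compiled before CREATE INDEX is issued.
        Upsert inFlight = new Upsert(catalog, revalidateOnCommit);
        inFlight.commit("row1", data, index);

        // CREATE INDEX: initial population copies the rows visible so far.
        catalog.createIndex();
        index.addAll(data);

        // The same in-flight statement keeps writing after the index exists.
        inFlight.commit("row2", data, index);
        return new int[] { data.size(), index.size() };
    }

    public static void main(String[] args) {
        int[] stale = run(false);
        int[] fresh = run(true);
        System.out.println("compile-time metadata only: data=" + stale[0] + " index=" + stale[1]);
        System.out.println("revalidate at commit:       data=" + fresh[0] + " index=" + fresh[1]);
    }
}
```

In this model the run that trusts compile-time metadata ends with fewer index 
rows than data rows, which is the shape of the mismatch reported in this 
issue; the run that re-checks at commit keeps the two counts equal. The real 
change would of course go through updateCache in MutationState.validate() 
rather than a flag like the one above.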

> Immutable index - Index vs base table row count does not match when index is 
> created during data load
> -----------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2446
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2446
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.6.0
>            Reporter: Mujtaba Chohan
>            Assignee: Thomas D'Silva
>             Fix For: 4.7.0
>
>         Attachments: PHOENIX-2446.patch
>
>
> I'll add more details later but here's the scenario that consistently 
> produces wrong row count for index table vs base table for immutable async 
> index.
> 1. Start data upsert
> 2. Create async index
> 3. Trigger M/R index build
> 4. Keep data upsert going in background during step 2,3 and a while after M/R 
> index finishes.
> 5. End data upsert. 
> Now count with index enabled vs count with hint to not use index is off by a 
> large factor. Will get a cleaner repro for this issue soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
