[ 
https://issues.apache.org/jira/browse/PHOENIX-4703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669446#comment-16669446
 ] 

Geoffrey Jacoby commented on PHOENIX-4703:
------------------------------------------

What should the syntax for this new rebuild mode be? It seems tough to do this 
in a way that neither changes current behavior for existing users nor confuses 
new users. 

As the JIRA says, the current behavior is:
ALTER INDEX ... REBUILD -> Drop index data, rebuild data synchronously 
ALTER INDEX ... REBUILD ASYNC -> Partially rebuild data from 
ASYNC_REBUILD_TIMESTAMP

I could add a flag "ALL", so that we have a third option:
ALTER INDEX ... REBUILD ALL ASYNC -> Drop index data, rebuild data 
_asynchronously_

But then how do we interpret ALTER INDEX ... REBUILD ALL with no ASYNC? If we 
throw an exception as invalid syntax, that seems confusing. But if we don't, 
and allow ALTER INDEX ... REBUILD ALL with no ASYNC, then it's confusing that 
plain REBUILD, without the ALL, _still_ rebuilds everything as well. 
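
To make the four combinations concrete, here is a minimal sketch of the flag 
matrix. The function name and the behavior strings are hypothetical and are 
not Phoenix code; the two ALL rows reflect only the proposal above, and the 
ALL-without-ASYNC row assumes we allow it rather than throw.

```python
def rebuild_mode(all_flag: bool, async_flag: bool) -> str:
    """Hypothetical mapping of ALTER INDEX ... REBUILD flags to behavior."""
    if not all_flag and not async_flag:
        # Current behavior: ALTER INDEX ... REBUILD
        return "drop index data, rebuild synchronously"
    if not all_flag and async_flag:
        # Current behavior: ALTER INDEX ... REBUILD ASYNC
        return "partial rebuild from ASYNC_REBUILD_TIMESTAMP"
    if all_flag and async_flag:
        # Proposed: ALTER INDEX ... REBUILD ALL ASYNC
        return "drop index data, rebuild asynchronously"
    # Open question: ALTER INDEX ... REBUILD ALL (assumed allowed here)
    return "drop index data, rebuild synchronously"


if __name__ == "__main__":
    # Print every combination of the two flags.
    for all_flag in (False, True):
        for async_flag in (False, True):
            flags = " ".join(
                name for name, on in (("ALL", all_flag), ("ASYNC", async_flag)) if on
            )
            print(f"REBUILD {flags}".strip(), "->", rebuild_mode(all_flag, async_flag))
```

In this sketch ALL changes the outcome only when ASYNC is also present, which 
is exactly the mismatch that makes the syntax hard to explain.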

Given the weird mismatch in the original behavior, I'm not sure how to make it 
all make sense now. :-)
 
[~vincentpoon] [~kozdemir] [~vishk] [~tdsilva]

 

> Provide an option to fully rebuild indexes asynchronously through SQL
> ---------------------------------------------------------------------
>
>                 Key: PHOENIX-4703
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4703
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Vincent Poon
>            Assignee: Geoffrey Jacoby
>            Priority: Major
>
> Currently, if we run "ALTER INDEX ... REBUILD", all the rows in the index are 
> deleted and the index is rebuilt synchronously.
> "ALTER INDEX ... REBUILD ASYNC" seems to be used for the IndexTool's partial 
> rebuild option, rebuilding from ASYNC_REBUILD_TIMESTAMP (PHOENIX-2890).
> So it seems that currently the only way to fully rebuild is to drop the index 
> and recreate it.  This is burdensome as it requires having the schema DDL.
> We should have an option to fully rebuild asynchronously, that has the same 
> semantics as dropping and recreating the index.  A further advantage of this 
> is that we can maintain the splits of the index table while dropping its data.  We 
> are currently seeing issues where rebuilding a large table via an MR job 
> results in hotspotting due to all data regions writing to the same index 
> region at the start.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
