[ 
https://issues.apache.org/jira/browse/PHOENIX-4703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669477#comment-16669477
 ] 

Vincent Poon commented on PHOENIX-4703:
---------------------------------------

Hmm, yeah, it's unfortunate that the grammar for partial rebuild reused the 
REBUILD command, and it's too late to change that now.
We need to update the docs for how REBUILD ASYNC works.  I'm still not exactly 
sure myself.  Glancing through the code, it seems it takes the index table's 
current timestamp.
So I think it's supposed to be used in conjunction with ALTER INDEX DISABLE: 
you disable first, and then you run REBUILD ASYNC, which rebuilds from the 
"current timestamp", i.e. the time when the index was disabled?
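
If that reading is right, the sequence would look something like the sketch 
below. MY_IDX and MY_TABLE are placeholder names, and the comments describe the 
behavior guessed at above, not something confirmed against the docs or code.

```sql
-- Step 1: take the index offline. Per the guess above, this is the point
-- in time that a later partial rebuild would start from.
ALTER INDEX MY_IDX ON MY_TABLE DISABLE;

-- Step 2: request an asynchronous (partial) rebuild, presumably covering
-- only the data written since the index was disabled.
ALTER INDEX MY_IDX ON MY_TABLE REBUILD ASYNC;
```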

Anyhow, given this legacy, one (imperfect) option is to parameterize:
ALTER INDEX...REBUILD ASYNC(timestamp)
The default for timestamp would be the same as it is today (the index table's 
timestamp).
If you pass 0 for timestamp, then we know we have to rebuild from the beginning 
of time, so we truncate the index table first, and set the index state to 'b'.
WDYT [~gjacoby]?
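
To make the proposal concrete, the statements might look like the sketch 
below. The parenthesized argument is hypothetical syntax that Phoenix does not 
accept today, MY_IDX and MY_TABLE are placeholder names, and treating the 
timestamp as epoch milliseconds is an assumption.

```sql
-- Existing form, no argument: behaves as it does today
-- (rebuild from the index table's timestamp).
ALTER INDEX MY_IDX ON MY_TABLE REBUILD ASYNC;

-- HYPOTHETICAL: explicit timestamp (assumed epoch millis), partial rebuild
-- from that point onward.
ALTER INDEX MY_IDX ON MY_TABLE REBUILD ASYNC(1540944000000);

-- HYPOTHETICAL: 0 means "beginning of time", so the index table would be
-- truncated first and the index state set to 'b' before the full rebuild.
ALTER INDEX MY_IDX ON MY_TABLE REBUILD ASYNC(0);
```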

I think your idea of ALL is fine as well, despite the redundancy with plain 
"alter index...REBUILD"

> Provide an option to fully rebuild indexes asynchronously through SQL
> ---------------------------------------------------------------------
>
>                 Key: PHOENIX-4703
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4703
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Vincent Poon
>            Assignee: Geoffrey Jacoby
>            Priority: Major
>
> Currently, if we run "ALTER INDEX ... REBUILD", all the rows in the index are 
> deleted and the index is rebuilt synchronously.
> "ALTER INDEX ... REBUILD ASYNC" seems to be used for the IndexTool's partial 
> rebuild option, rebuilding from ASYNC_REBUILD_TIMESTAMP (PHOENIX-2890).
> So it seems that currently the only way to fully rebuild is to drop the index 
> and recreate it.  This is burdensome as it requires having the schema DDL.
> We should have an option to fully rebuild asynchronously, that has the same 
> semantics as dropping and recreating the index.  A further advantage of this 
> is we can maintain the splits of the index table while dropping its data.  We 
> are currently seeing issues where rebuilding a large table via an MR job 
> results in hotspotting due to all data regions writing to the same index 
> region at the start.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
