[
https://issues.apache.org/jira/browse/PHOENIX-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16753790#comment-16753790
]
Karan Mehta commented on PHOENIX-5091:
--------------------------------------
In the current approach, the UpdateStatisticsTool manages the snapshot on its
own, including both creation and deletion. The MR job therefore has to run in
the foreground. If a run fails, older snapshots are not cleaned up.
Another approach, suggested by [~dbwong], is to delegate snapshot deletion to
the Hadoop MR job itself. A single reducer can be configured to delete the
snapshot at the end of the job. This could allow us to launch jobs
asynchronously.
However, for this Jira, and as per an offline discussion, we have decided to
use the first option.
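
For reference, the first option can be sketched roughly as below. This is only
an illustration of the create / run-in-foreground / delete ordering, not the
tool's actual code: SnapshotOps, StatsJob and ForegroundStatsRun are
hypothetical names made up for the sketch and are not HBase or Phoenix APIs.

```java
/** Hypothetical stand-in for the snapshot calls the tool would make (assumption, not a real API). */
interface SnapshotOps {
    void create(String snapshotName, String tableName) throws Exception;
    void delete(String snapshotName) throws Exception;
}

/** Hypothetical stand-in for submitting the MR job and blocking until it finishes. */
interface StatsJob {
    boolean runAndWait(String snapshotName) throws Exception;
}

final class ForegroundStatsRun {
    private ForegroundStatsRun() {}

    /**
     * Creates the snapshot, runs the job in the foreground, then deletes the
     * snapshot. The caller blocks for the whole job so that it is still alive
     * to perform the delete.
     */
    static boolean run(SnapshotOps ops, StatsJob job, String tableName, String snapshotName)
            throws Exception {
        ops.create(snapshotName, tableName);
        try {
            return job.runAndWait(snapshotName);
        } finally {
            // Runs on success and when the job throws, but not if the client
            // process itself dies, which would leave the snapshot behind.
            ops.delete(snapshotName);
        }
    }
}
```

Because the delete is done by the launching process, the job cannot be
submitted asynchronously in this shape. The reducer-based alternative would
move the delete into the job.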
> Add new features to UpdateStatisticsTool
> ----------------------------------------
>
> Key: PHOENIX-5091
> URL: https://issues.apache.org/jira/browse/PHOENIX-5091
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Karan Mehta
> Assignee: Karan Mehta
> Priority: Major
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> {{UpdateStatisticsTool}} can be enhanced with the following features to
> improve its overall ease of use.
> # Automatically take a snapshot of the table before running the command (add
> a new switch for this)
> # Automatically run the tool for all relevant indexes (global and view
> indexes)
> # Do not rerun the job if it has run recently (based on some time threshold,
> LAST_STATS_UPDATED_TIME)
> Features will be added after PHOENIX-4009 is completed.
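
The "do not rerun if it has run recently" item in the description amounts to
a threshold check along the lines below. This is a minimal sketch under
assumptions: StatsFreshness and shouldRun are hypothetical names, and how the
LAST_STATS_UPDATED_TIME value is actually read is not shown here.

```java
import java.time.Duration;
import java.time.Instant;

final class StatsFreshness {
    private StatsFreshness() {}

    /**
     * Returns true when the job should run: either stats were never
     * collected (null timestamp) or the last update is at least
     * {@code threshold} old relative to {@code now}.
     */
    static boolean shouldRun(Instant lastStatsUpdatedTime, Instant now, Duration threshold) {
        if (lastStatsUpdatedTime == null) {
            return true; // no previous run recorded
        }
        Duration age = Duration.between(lastStatsUpdatedTime, now);
        return age.compareTo(threshold) >= 0;
    }
}
```

Passing "now" in explicitly, rather than reading the clock inside the method,
keeps the check easy to unit test.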
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)