[
https://issues.apache.org/jira/browse/PHOENIX-4984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Geoffrey Jacoby updated PHOENIX-4984:
-------------------------------------
Summary: PhoenixMRJobSubmitter doesn't prevent rebuild jobs from being
started multiple times (was: PhoenixMRJobSubmitter does not get the correct
list of jobs to submit for IndexTool jobs)
> PhoenixMRJobSubmitter doesn't prevent rebuild jobs from being started
> multiple times
> ------------------------------------------------------------------------------------
>
> Key: PHOENIX-4984
> URL: https://issues.apache.org/jira/browse/PHOENIX-4984
> Project: Phoenix
> Issue Type: Bug
> Reporter: Chinmay Kulkarni
> Assignee: Geoffrey Jacoby
> Priority: Major
>
> When building the list of jobs to submit, we fetch the already scheduled
> jobs from the Yarn Resource Manager and exclude them inside
> {{PhoenixMRJobSubmitter#getJobsToSubmit}}. However, a naming format difference
> prevents already running/submitted jobs from being removed correctly.
> In {{IndexTool.java}}, we use the following convention for naming the M/R job:
> INDEX_JOB_NAME_TEMPLATE = "PHOENIX_<schema name>.<data table
> name>_INDX_<index name>";
> However, I see the following log lines for candidate jobs:
> _Candidate Indexes to be built as seen from SYSTEM.CATALOG - PHOENIX_<data
> table name>_INDX_<index name> ... _
> And the following for already submitted jobs as retrieved from Yarn:
> _Already Submitted/Running MR index build jobs - [PHOENIX_<schema name>.<data
> table name>_INDX_<index name>]_
> Due to this naming mismatch (the candidate name lacks the '<schema name>.'
> part), an index build M/R job that is already running for a given index is
> not detected, and another one can be started for the same index. This can
> lead to unnecessary load on the region servers hosting regions for the index.
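
The mismatch described above can be reproduced with a minimal sketch. It assumes the exclusion step is a plain string comparison between candidate names and the names reported by Yarn; the class, the helper methods and the sample schema/table/index names are hypothetical and are not taken from the Phoenix code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the actual PhoenixMRJobSubmitter code.
public class JobNameMismatchSketch {

    // Schema-qualified format, as seen in the log line for jobs retrieved from Yarn.
    static String qualifiedJobName(String schema, String table, String index) {
        return String.format("PHOENIX_%s.%s_INDX_%s", schema, table, index);
    }

    // Format seen in the candidate log line: no schema name and no '.'.
    static String candidateJobName(String table, String index) {
        return String.format("PHOENIX_%s_INDX_%s", table, index);
    }

    // Assumed filter: keep every candidate whose name is not already submitted.
    static List<String> jobsToSubmit(List<String> candidates, Set<String> alreadySubmitted) {
        List<String> result = new ArrayList<>();
        for (String candidate : candidates) {
            if (!alreadySubmitted.contains(candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> running = new HashSet<>(
                Arrays.asList(qualifiedJobName("MY_SCHEMA", "MY_TABLE", "MY_INDEX")));

        // Unqualified candidate name: the running job is not recognized.
        System.out.println(jobsToSubmit(
                Arrays.asList(candidateJobName("MY_TABLE", "MY_INDEX")), running));
        // prints [PHOENIX_MY_TABLE_INDX_MY_INDEX]

        // Candidate built with the same format as the running job: filtered out.
        System.out.println(jobsToSubmit(
                Arrays.asList(qualifiedJobName("MY_SCHEMA", "MY_TABLE", "MY_INDEX")), running));
        // prints []
    }
}
```

Under these assumptions, building both names from the same template is enough for the filter to recognize a running job; whether the actual fix takes that route is not stated in this message.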
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)