[ 
https://issues.apache.org/jira/browse/PHOENIX-4984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geoffrey Jacoby updated PHOENIX-4984:
-------------------------------------
    Summary: PhoenixMRJobSubmitter doesn't prevent rebuild jobs from being 
started multiple times  (was: PhoenixMRJobSubmitter does not get the correct 
list of jobs to submit for IndexTool jobs)

> PhoenixMRJobSubmitter doesn't prevent rebuild jobs from being started 
> multiple times
> ------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4984
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4984
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Chinmay Kulkarni
>            Assignee: Geoffrey Jacoby
>            Priority: Major
>
> When building the list of jobs to submit, 
> {{PhoenixMRJobSubmitter#getJobsToSubmit}} fetches the already scheduled jobs 
> from the YARN ResourceManager and excludes them. However, a difference in 
> naming format prevents already running/submitted jobs from being removed.
> In {{IndexTool.java}}, we use the following convention for naming the M/R job:
>  INDEX_JOB_NAME_TEMPLATE = "PHOENIX_<schema name>.<data table 
> name>_INDX_<index name>";
> However, I see the following log lines for candidate jobs:
> _Candidate Indexes to be built as seen from SYSTEM.CATALOG - PHOENIX_<data 
> table name>_INDX_<index name> ... _
> And the following for already submitted jobs as obtained from YARN:
> _Already Submitted/Running MR index build jobs - [PHOENIX_<schema name>.<data 
> table name>_INDX_<index name>]_
> Because of this naming mismatch (the candidate names lack the schema name 
> and the '.' separator), an index build M/R job that is already running for a 
> given index is not detected, and another one can be started for the same 
> index. This can put unnecessary load on the region servers hosting regions 
> for the index.
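
The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the actual Phoenix code: the class name, the method names, and the schema-less candidate format are assumptions made for illustration, based only on the job name template and log lines quoted in the description. It shows that a plain set difference fails to exclude a running job when the two names are built differently.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-alone illustration; none of these names come from Phoenix.
final class JobNameMismatch {
    // Format quoted in the issue for the M/R job name (schema, '.', table, index).
    static final String WITH_SCHEMA = "PHOENIX_%s.%s_INDX_%s";
    // Assumed format of the candidate names seen in the log (no schema, no '.').
    static final String WITHOUT_SCHEMA = "PHOENIX_%s_INDX_%s";

    // Name of a job as it would appear in the list of submitted jobs.
    static String submittedName(String schema, String table, String index) {
        return String.format(WITH_SCHEMA, schema, table, index);
    }

    // Name of a candidate job when the schema is left out.
    static String candidateName(String table, String index) {
        return String.format(WITHOUT_SCHEMA, table, index);
    }

    // Candidates that are not already submitted, compared by exact name.
    static Set<String> jobsToSubmit(Set<String> candidates, Set<String> submitted) {
        Set<String> result = new HashSet<>(candidates);
        result.removeAll(submitted);
        return result;
    }

    public static void main(String[] args) {
        Set<String> submitted =
            Collections.singleton(submittedName("MY_SCHEMA", "MY_TABLE", "MY_IDX"));

        // Mismatched naming: the running job is not recognized and is kept.
        Set<String> mismatched =
            Collections.singleton(candidateName("MY_TABLE", "MY_IDX"));
        System.out.println("Mismatched naming: " + jobsToSubmit(mismatched, submitted));

        // Consistent naming: the running job is excluded as intended.
        Set<String> consistent =
            Collections.singleton(submittedName("MY_SCHEMA", "MY_TABLE", "MY_IDX"));
        System.out.println("Consistent naming: " + jobsToSubmit(consistent, submitted));
    }
}
```

With the mismatched names the candidate survives the filter, which corresponds to a duplicate rebuild job being started. Building both names from the same template leaves nothing to submit.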



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
