[
https://issues.apache.org/jira/browse/OOZIE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rohini Palaniswamy updated OOZIE-1844:
--------------------------------------
Summary: HA - Lock mechanism for CoordMaterializeTriggerService (was: HA
- Lock mechanism for CoordMaterializeTriggerService ( may be for other
services as well))
> HA - Lock mechanism for CoordMaterializeTriggerService
> -------------------------------------------------------
>
> Key: OOZIE-1844
> URL: https://issues.apache.org/jira/browse/OOZIE-1844
> Project: Oozie
> Issue Type: Bug
> Components: HA
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Attachments: OOZIE-1844-V2.patch
>
>
> Currently we check whether a job id belongs to this server by using a modulus
> operation. This may not be the optimal approach:
> 1. We are not processing MATERIALIZATION_SYSTEM_LIMIT; each server does only
> half of the processing (in the case of two servers). We can always double the
> limit, but whenever we add a new server we need to restart the whole cluster
> to increase the limit.
> 2. The job sequence id is shared among wf, coord, and bundle jobs, so coords
> with odd (or even) ids may outnumber the others. In that case the load is not
> distributed evenly; one server will always do more processing.
> 3. Different coord jobs also have different frequencies. A job with a 1 min
> or 5 min frequency puts more load on the system. With this approach a
> particular job always runs on the same server, which eventually puts more
> load on that one server.
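
To make point 2 concrete, here is a minimal sketch of modulus-based ownership. It is not Oozie's actual code: the class and method names are hypothetical, and it assumes a job is assigned to the server whose index equals the job's sequence id modulo the server count.

```java
import java.util.List;

// Hypothetical illustration of modulus-based ownership; not taken from Oozie.
final class ModulusOwnership {

    // Index of the server that "owns" a job under the assumed modulus scheme.
    static int owner(long jobSequenceId, int serverCount) {
        return (int) (jobSequenceId % serverCount);
    }

    // Counts how many of the given coord sequence ids land on each server.
    static int[] loadPerServer(List<Long> coordSequenceIds, int serverCount) {
        int[] load = new int[serverCount];
        for (long id : coordSequenceIds) {
            load[owner(id, serverCount)]++;
        }
        return load;
    }
}
```

Because the sequence is shared with wf and bundle jobs, the coord ids can easily be skewed. For example, with two servers and coord ids 2, 4, 6 and 7, `loadPerServer` assigns three coords to server 0 and only one to server 1.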
> A simple way to optimize this may be a lock mechanism: each
> CoordMaterializeTriggerService obtains a lock and then materializes coords.
> If the lock is held by another server, it waits for that server to release
> the lock. This way coord jobs get distributed among the servers.
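
The lock-based proposal can be sketched roughly as follows. This is an in-process illustration under stated assumptions, not the attached patch: `LockedMaterializer`, `runOnce`, and the `pending` queue are hypothetical names, a plain `java.util.concurrent.locks.Lock` stands in for whatever cluster-wide lock an HA setup would use, and `limit` stands in for MATERIALIZATION_SYSTEM_LIMIT.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

// Hypothetical sketch: each server takes the shared lock, materializes up to
// `limit` pending coords, then releases the lock for the next server.
final class LockedMaterializer {
    private final Lock lock;              // stand-in for a cluster-wide lock
    private final Queue<String> pending;  // stand-in for coords due for materialization
    private final int limit;              // stand-in for MATERIALIZATION_SYSTEM_LIMIT

    LockedMaterializer(Lock lock, Queue<String> pending, int limit) {
        this.lock = lock;
        this.pending = pending;
        this.limit = limit;
    }

    // One service run: waits up to waitMillis for the lock, then processes a batch.
    // Returns the ids handled in this run (empty if the lock was not obtained).
    List<String> runOnce(long waitMillis) {
        List<String> done = new ArrayList<>();
        boolean locked = false;
        try {
            locked = lock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
            if (!locked) {
                return done; // another server still holds the lock; retry on the next run
            }
            while (done.size() < limit && !pending.isEmpty()) {
                done.add(pending.poll()); // "materialize" the next pending coord
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        } finally {
            if (locked) {
                lock.unlock();
            }
        }
        return done;
    }
}
```

In this sketch whichever server obtains the lock processes the full limit regardless of job id, so no job is pinned to one server. The bounded wait in `tryLock` is a design choice made here; the description only says a server waits for the lock to be released.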
--
This message was sent by Atlassian JIRA
(v6.2#6252)