[
https://issues.apache.org/jira/browse/AMBARI-16828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aravindan Vijayan updated AMBARI-16828:
---------------------------------------
Fix Version/s: 2.5.0 (was: trunk)
> Support round-robin scheduling with failover for Sinks with distributed
> collector
> ---------------------------------------------------------------------------------
>
> Key: AMBARI-16828
> URL: https://issues.apache.org/jira/browse/AMBARI-16828
> Project: Ambari
> Issue Type: Task
> Components: ambari-me
> Affects Versions: 2.4.1
> Reporter: Siddharth Wagle
> Assignee: Siddharth Wagle
> Fix For: 2.5.0
>
> Attachments: AMBARI-16828.patch
>
>
> - The initial set of collectors is configured in the configuration files.
> - Thereafter, find available collectors by connecting to ZooKeeper.
> - Remember the available collectors. Refresh this information only when a
> collector cannot be reached; otherwise check for new collectors at a very
> low frequency, for example at a random interval of 10-12 minutes. Set a low
> client-side ZooKeeper timeout.
> - Round-robin writes across the collectors, choosing the first one at random.
> - If a write times out, choose the next available collector and remember the
> failed attempts against the first one.
> - Set a configurable attempt count for a failed collector (default = 3), after
> which the failed collector is removed from the available collectors list.
> - The next retry is triggered after a refresh with ZooKeeper succeeds.
> - If no collectors are available, the ZooKeeper refresh interval should be
> chosen randomly between 1 and 2 minutes.
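
The selection rules listed above can be sketched roughly as follows. This is an
illustrative sketch only, not the code in the attached patch: the names
CollectorSelector, send, and write are hypothetical, the ZooKeeper lookup is
left to the caller (who would pass its result to refresh), and the last bullet
is read here as "no collectors are left in the available list".

```python
import random


class CollectorSelector:
    """Hypothetical helper: round-robin over collectors with failover."""

    def __init__(self, collectors, max_attempts=3, rng=None):
        self._rng = rng or random.Random()
        self._max_attempts = max_attempts  # configurable, default 3
        self._failures = {}                # host -> remembered failed attempts
        self._available = []
        self._index = 0
        self.refresh(collectors)

    def available(self):
        return list(self._available)

    def refresh(self, live_collectors):
        """Replace the list, e.g. with hosts found through ZooKeeper."""
        self._available = list(live_collectors)
        self._failures.clear()
        # Start the rotation at a randomly chosen collector.
        n = len(self._available)
        self._index = self._rng.randrange(n) if n else 0

    def next_collector(self):
        """Return the next collector in rotation, or None if none are left."""
        if not self._available:
            return None
        host = self._available[self._index]
        self._index = (self._index + 1) % len(self._available)
        return host

    def record_failure(self, host):
        """Remember a failed attempt; drop the host once the limit is hit."""
        count = self._failures.get(host, 0) + 1
        self._failures[host] = count
        if count >= self._max_attempts and host in self._available:
            pos = self._available.index(host)
            del self._available[pos]
            # Keep the rotation pointing at the same next host.
            if pos < self._index:
                self._index -= 1
            n = len(self._available)
            self._index = self._index % n if n else 0

    def refresh_interval_seconds(self):
        """Jittered refresh delay: 10-12 min normally, 1-2 min if none left."""
        if self._available:
            return self._rng.uniform(10 * 60, 12 * 60)
        return self._rng.uniform(1 * 60, 2 * 60)


def send(selector, write, payload):
    """Try each available collector at most once per call.

    `write` is a caller-supplied callable (an assumption of this sketch)
    that raises TimeoutError when a collector does not respond in time.
    Returns the host that accepted the write, or None.
    """
    for _ in range(len(selector.available())):
        host = selector.next_collector()
        if host is None:
            break
        try:
            write(host, payload)
            return host
        except TimeoutError:
            selector.record_failure(host)
    return None
```

A sink would call send for each batch, and schedule its next ZooKeeper lookup
after refresh_interval_seconds() has elapsed, passing the result to refresh.
Randomizing both the starting collector and the refresh interval keeps many
sinks from hitting the same collector or ZooKeeper at the same moment.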
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)