[ https://issues.apache.org/jira/browse/KAFKA-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14199617#comment-14199617 ]
Gwen Shapira commented on KAFKA-1754:
-------------------------------------

Looking forward to seeing the patch. Streaming applications such as Samza, Spark Streaming, and DataTorrent will benefit from running their workers on the same nodes as the partitions they consume from. This is now possible in YARN.

My main concern is whether it's possible to prevent YARN from spawning a new container when a broker goes down, because starting the broker on a new node carries the huge overhead of replicating all of its data over. We may prefer that this not happen automatically, but only after someone has verified that the original broker is truly gone.

> KOYA - Kafka on YARN
> --------------------
>
>                 Key: KAFKA-1754
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1754
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Thomas Weise
>         Attachments: DT-KOYA-Proposal- JIRA.pdf
>
>
> YARN (Hadoop 2.x) has enabled clusters to be used for a variety of workloads,
> emerging as a distributed operating system for big data applications.
> Initiatives are under way to bring long-running services under the YARN
> umbrella, leveraging it for centralized resource management and operations
> ([YARN-896] and examples such as HBase, Accumulo or Memcached through
> Slider). This JIRA is to propose KOYA (Kafka On YARN), a YARN application
> master to launch and manage Kafka clusters running on YARN. Brokers will use
> resources allocated through YARN, with support for recovery, monitoring, etc.
> Please see the attached proposal for more details.