[jira] [Closed] (FLINK-4069) Kafka Consumer should not initialize on construction

Shannon Carey (JIRA) Tue, 14 Jun 2016 16:12:02 -0700

     [ 
https://issues.apache.org/jira/browse/FLINK-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Shannon Carey closed FLINK-4069.
--------------------------------
    Resolution: Duplicate

> Kafka Consumer should not initialize on construction
> ----------------------------------------------------
>
>                 Key: FLINK-4069
>                 URL: https://issues.apache.org/jira/browse/FLINK-4069
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kafka Connector
>    Affects Versions: 1.0.3
>            Reporter: Shannon Carey
>
> The Kafka Consumer connector currently interacts over the network with Kafka 
> in order to get partition metadata when the class is constructed. Instead, it 
> should do that work when the job actually begins to run (for example, in 
> AbstractRichFunction#open() of FlinkKafkaConsumer0?).
> The main weakness of broker querying in the constructor is that if there are 
> network problems, Flink might take a long time (eg. ~1hr) inside the 
> user-supplied main() method while it attempts to contact each broker and 
> perform retries. In general, setting up the Kafka partitions does not seem 
> strictly necessary as part of execution of main() in order to set up the job 
> plan/topology.
> However, as Robert Metzger mentions, there are important concerns with how 
> Kafka partitions are handled:
> "The main reason why we do the querying centrally is:
> a) avoid overloading the brokers
> b) send the same list of partitions (in the same order) to all parallel 
> consumers to do a fixed partition assignments (also across restarts). When we 
> do the querying in the open() method, we need to make sure that all 
> partitions are assigned, without duplicates (also after restarts in case of 
> failures)."
> See also the mailing list discussion: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/API-request-to-submit-job-takes-over-1hr-td7319.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Closed] (FLINK-4069) Kafka Consumer should not initialize on construction

Reply via email to