[jira] [Commented] (FLINK-10407) Reactive container mode

2020-09-04 Thread Antonio Verardi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-10407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190775#comment-17190775
 ] 

Antonio Verardi commented on FLINK-10407:
-

Hi [~trohrmann], I have seen Reactive Scaling in the list for version 1.12 
[https://cwiki.apache.org/confluence/display/FLINK/1.12+Release]

Is that likely to happen? This ticket haven't seen much progress lately and I 
don't see the version label, that's why I am asking :P

> Reactive container mode
> ---
>
> Key: FLINK-10407
> URL: https://issues.apache.org/jira/browse/FLINK-10407
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination
>Affects Versions: 1.7.0
>Reporter: Till Rohrmann
>Priority: Major
>
> The reactive container mode is a new operation mode where a Flink cluster 
> will react to newly available resources (e.g. started by an external service) 
> and make use of them by rescaling the existing job.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-8093) flink job fail because of kafka producer create fail of "javax.management.InstanceAlreadyExistsException"

2020-04-09 Thread Antonio Verardi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-8093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079212#comment-17079212
 ] 

Antonio Verardi commented on FLINK-8093:


I confirm this happens both on Flink 1.6 and Flink 1.9:
{code:java}
org.apache.kafka.common.KafkaException: Error registering mbean 
kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--2
at 
org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:163)
at 
org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:81)
at 
org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:504)
at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:255)
at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:240)
at 
org.apache.kafka.common.network.Selector$SelectorMetrics.maybeRegisterConnectionMetrics(Selector.java:811)
at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:353)
at org.apache.kafka.common.network.Selector.poll(Selector.java:326)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:433)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:232)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:208)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:184)
at 
org.apache.kafka.clients.consumer.internals.Fetcher.getTopicMetadata(Fetcher.java:314)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.partitionsFor(KafkaConsumer.java:1386)
at 
org.apache.flink.streaming.connectors.kafka.internal.Kafka09PartitionDiscoverer.getAllPartitionsForTopics(Kafka09PartitionDiscoverer.java:77)
at 
org.apache.flink.streaming.connectors.kafka.internals.AbstractPartitionDiscoverer.discoverPartitions(AbstractPartitionDiscoverer.java:131)
at 
org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.open(FlinkKafkaConsumerBase.java:473)
at 
org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
at 
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:424)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:290)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
at java.lang.Thread.run(Thread.java:748)
Caused by: javax.management.InstanceAlreadyExistsException: 
kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--2
at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
at 
com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
at 
org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:161)
... 22 more {code}

> flink job fail because of kafka producer create fail of 
> "javax.management.InstanceAlreadyExistsException"
> -
>
> Key: FLINK-8093
> URL: https://issues.apache.org/jira/browse/FLINK-8093
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.3.2, 1.10.0
> Environment: flink 1.3.2, kafka 0.9.1
>Reporter: dongtingting
>Assignee: Bastien DINE
>Priority: Critical
>  Labels: usability
>
> one taskmanager has multiple taskslot, one task fail because of create 
> kafkaProducer fail,the reason for create kafkaProducer fail is 
> “javax.management.InstanceAlreadyExistsException: 
> kafka.producer:type=producer-metrics,client-id=producer-3”。 the detail trace 
> is :
> {noformat}
> 2017-11-04 19:41:23,281 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Source: Custom Source -> Filter -> Map -> Filter -> Sink: 
> dp_client_**_log (7/80) (99551f3f892232d7df5eb9060fa9940c) switched from 
> RUNNING to FAILED.
> org.apache.kafka.common.KafkaException: Failed to construct kafka producer
> at 
> 

[jira] [Commented] (FLINK-12122) Spread out tasks evenly across all available registered TaskManagers

2019-10-23 Thread Antonio Verardi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957758#comment-16957758
 ] 

Antonio Verardi commented on FLINK-12122:
-

[~trohrmann] thanks for looking into this. We haven't upgraded any app to 1.9 
yet, so it will take us a bit of time to try the change out. We are working on 
this right now, I'll let you know as soon as we are done

> Spread out tasks evenly across all available registered TaskManagers
> 
>
> Key: FLINK-12122
> URL: https://issues.apache.org/jira/browse/FLINK-12122
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.6.4, 1.7.2, 1.8.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2019-05-21-12-28-29-538.png, 
> image-2019-05-21-13-02-50-251.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> With Flip-6, we changed the default behaviour how slots are assigned to 
> {{TaskManages}}. Instead of evenly spreading it out over all registered 
> {{TaskManagers}}, we randomly pick slots from {{TaskManagers}} with a 
> tendency to first fill up a TM before using another one. This is a regression 
> wrt the pre Flip-6 code.
> I suggest to change the behaviour so that we try to evenly distribute slots 
> across all available {{TaskManagers}} by considering how many of their slots 
> are already allocated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-10407) Reactive container mode

2019-08-05 Thread Antonio Verardi (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899879#comment-16899879
 ] 

Antonio Verardi commented on FLINK-10407:
-

This feature/re-architecture looks really cool! The document says that the 
target is 1.9.0, but the ticket says 1.7.0. However the 1.9.0 rc1 should be out 
already, so... Which version are you aiming for at the end?

> Reactive container mode
> ---
>
> Key: FLINK-10407
> URL: https://issues.apache.org/jira/browse/FLINK-10407
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination
>Affects Versions: 1.7.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Major
>
> The reactive container mode is a new operation mode where a Flink cluster 
> will react to newly available resources (e.g. started by an external service) 
> and make use of them by rescaling the existing job.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-12122) Spread out tasks evenly across all available registered TaskManagers

2019-06-10 Thread Antonio Verardi (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-12122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860329#comment-16860329
 ] 

Antonio Verardi commented on FLINK-12122:
-

[~till.rohrmann]: Thanks for picking this feature up! It happens to be very 
important for my company, too.

Do you still think you'll manage to ship this one in the 1.9 release or we 
should expect it to come a bit further down the line? We are evaluating whether 
it makes sense for us to implement something custom now or just to wait this 
out.

> Spread out tasks evenly across all available registered TaskManagers
> 
>
> Key: FLINK-12122
> URL: https://issues.apache.org/jira/browse/FLINK-12122
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.6.4, 1.7.2, 1.8.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Major
> Fix For: 1.7.3, 1.9.0, 1.8.1
>
> Attachments: image-2019-05-21-12-28-29-538.png, 
> image-2019-05-21-13-02-50-251.png
>
>
> With Flip-6, we changed the default behaviour how slots are assigned to 
> {{TaskManages}}. Instead of evenly spreading it out over all registered 
> {{TaskManagers}}, we randomly pick slots from {{TaskManagers}} with a 
> tendency to first fill up a TM before using another one. This is a regression 
> wrt the pre Flip-6 code.
> I suggest to change the behaviour so that we try to evenly distribute slots 
> across all available {{TaskManagers}} by considering how many of their slots 
> are already allocated.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)