[jira] [Commented] (FLINK-19028) Translate the "application_parameters.zh.md" into Chinese

2020-10-06 Thread fanyuexiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209302#comment-17209302
 ] 

fanyuexiang commented on FLINK-19028:
-

This is the first time I have joined the Flink community. Please assign this 
issue to me, thanks a lot ~

> Translate the "application_parameters.zh.md"  into Chinese
> --
>
> Key: FLINK-19028
> URL: https://issues.apache.org/jira/browse/FLINK-19028
> Project: Flink
>  Issue Type: Improvement
>  Components: chinese-translation
>Reporter: pp
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-19497) Implement mutator methods for FlinkCounterWrapper

2020-10-06 Thread Richard Moorhead (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209262#comment-17209262
 ] 

Richard Moorhead commented on FLINK-19497:
--

A better question might be: is a patch release planned? If so, could this 
commit be added to it?

> Implement mutator methods for FlinkCounterWrapper
> -
>
> Key: FLINK-19497
> URL: https://issues.apache.org/jira/browse/FLINK-19497
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics
>Affects Versions: 1.10.2, 1.11.2
>Reporter: Richard Moorhead
>Assignee: Richard Moorhead
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>
> Looking at the Dropwizard wrapper classes in flink-metrics-dropwizard, it 
> appears that all of them have mutator methods defined, with the exception of 
> FlinkCounterWrapper. We have a use case wherein we mutate counters from a 
> Dropwizard context but wish the underlying metrics in the Flink registry to 
> be updated.
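A hedged sketch of what such mutator methods could look like. The real class lives in flink-metrics-dropwizard and extends the Dropwizard Counter; the `FlinkCounter` interface below is a simplified stand-in for `org.apache.flink.metrics.Counter`, not the actual API:

```java
// Simplified stand-in for org.apache.flink.metrics.Counter.
interface FlinkCounter {
    void inc(long n);
    void dec(long n);
    long getCount();
}

// Sketch of the requested mutator support: Dropwizard-style inc()/dec()
// calls are forwarded to the wrapped Flink counter, so mutations made
// from a Dropwizard context show up in the Flink metric registry.
class FlinkCounterWrapper {
    private final FlinkCounter counter;

    FlinkCounterWrapper(FlinkCounter counter) {
        this.counter = counter;
    }

    public void inc() { counter.inc(1); }
    public void inc(long n) { counter.inc(n); }
    public void dec() { counter.dec(1); }
    public void dec(long n) { counter.dec(n); }
    public long getCount() { return counter.getCount(); }
}
```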





[jira] [Updated] (FLINK-19517) Support for Confluent Kafka of Table Creation in Flink SQL Client

2020-10-06 Thread Kevin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Kwon updated FLINK-19517:
---
Description: 
Currently, table creation from the SQL client, such as the example below, works well:
{code:sql}
CREATE TABLE kafkaTable (
  user_id BIGINT,
  item_id BIGINT,
  category_id BIGINT,
  behavior STRING,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'format' = 'avro',
  'scan.startup.mode' = 'earliest-offset'
)
{code}
However, I would wish for table creation to support Confluent Kafka 
configuration as well, for example something like:
{code:sql}
CREATE TABLE kafkaTable (
  user_id BIGINT,
  item_id BIGINT,
  category_id BIGINT,
  behavior STRING,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'confluent-kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'schema-registry' = 'http://schema-registry.com',
  'scan.startup.mode' = 'earliest-offset'
)
{code}
If this is enabled, it will be much more convenient to run the on-the-fly 
queries that business analysts want to test against Confluent Kafka.

Additionally, it would be better if we could
 - specify 'parallelism' within WITH clause to support parallel partition 
processing
 - specify custom properties within WITH clause specified in 
[https://docs.confluent.io/5.4.2/installation/configuration/consumer-configs.html]
 - have remote access to SQL client in cluster from local environment
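As a side note on the schema-registry part of this wish: recent Flink versions expose Confluent Schema Registry support through an avro-confluent format on the existing kafka connector rather than a separate connector. Assuming Flink 1.12+ (option names should be verified against the docs for your version), a hedged sketch:

{code:sql}
-- Hedged sketch: built-in 'kafka' connector plus the 'avro-confluent'
-- format; 'avro-confluent.schema-registry.url' is the option name as of
-- Flink 1.12 and may differ in other releases.
CREATE TABLE kafkaTable (
  user_id BIGINT,
  item_id BIGINT,
  category_id BIGINT,
  behavior STRING,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'format' = 'avro-confluent',
  'avro-confluent.schema-registry.url' = 'http://schema-registry.com',
  'scan.startup.mode' = 'earliest-offset'
)
{code}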

  was:
Currently, table creation from the SQL client, such as the example below, works well:
{code:sql}
CREATE TABLE kafkaTable (
  user_id BIGINT,
  item_id BIGINT,
  category_id BIGINT,
  behavior STRING,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'format' = 'avro',
  'scan.startup.mode' = 'earliest-offset'
)
{code}
However, I would wish for table creation to support Confluent Kafka 
configuration as well, for example something like:
{code:sql}
CREATE TABLE kafkaTable (
  user_id BIGINT,
  item_id BIGINT,
  category_id BIGINT,
  behavior STRING,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'confluent-kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'schema-registry' = 'http://schema-registry.com',
  'scan.startup.mode' = 'earliest-offset'
)
{code}
If this is enabled, it will be much more convenient to run the on-the-fly 
queries that business analysts want to test against Confluent Kafka (which is 
widely used). For example, business analysts can give some standard DDL to data 
engineers, and the data engineers can fill in the WITH clause and immediately 
start executing queries against these tables.

Additionally, it would be better if we could
 - specify 'parallelism' within WITH clause to support parallel partition 
processing
 - specify custom properties within WITH clause specified in 
[https://docs.confluent.io/5.4.2/installation/configuration/consumer-configs.html]
 - have remote access to SQL client in cluster from local environment


> Support for Confluent Kafka of Table Creation in Flink SQL Client
> -
>
> Key: FLINK-19517
> URL: https://issues.apache.org/jira/browse/FLINK-19517
> Project: Flink
>  Issue Type: Wish
>Affects Versions: 1.12.0
>Reporter: Kevin Kwon
>Priority: Critical
>
> Currently, table creation from the SQL client, such as the example below, works well:
> {code:sql}
> CREATE TABLE kafkaTable (
>   user_id BIGINT,
>   item_id BIGINT,
>   category_id BIGINT,
>   behavior STRING,
>   ts TIMESTAMP(3)
> ) WITH (
>   'connector' = 'kafka',
>   'topic' = 'user_behavior',
>   'properties.bootstrap.servers' = 'localhost:9092',
>   'properties.group.id' = 'testGroup',
>   'format' = 'avro',
>   'scan.startup.mode' = 'earliest-offset'
> )
> {code}
> However, I would wish for table creation to support Confluent Kafka 
> configuration as well, for example something like:
> {code:sql}
> CREATE TABLE kafkaTable (
>   user_id BIGINT,
>   item_id BIGINT,
>   category_id BIGINT,
>   behavior STRING,
>   ts TIMESTAMP(3)
> ) WITH (
>   'connector' = 'confluent-kafka',
>   'topic' = 'user_behavior',
>   'properties.bootstrap.servers' = 'localhost:9092',
>   'properties.group.id' = 'testGroup',
>   'schema-registry' = 'http://schema-registry.com',
>   'scan.startup.mode' = 'earliest-offset'
> )
> {code}
> If this is enabled, it will be much more convenient to run the on-the-fly 
> queries that business analysts want to test against Confluent Kafka.
> Additionally, it would be better if we could
>  - specify 'parallelism' within WITH clause to support parallel 

[jira] [Updated] (FLINK-19517) Support for Confluent Kafka of Table Creation in Flink SQL Client

2020-10-06 Thread Kevin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Kwon updated FLINK-19517:
---
Summary: Support for Confluent Kafka of Table Creation in Flink SQL Client  
(was: Support for Confluent Kafka on table creation)

> Support for Confluent Kafka of Table Creation in Flink SQL Client
> -
>
> Key: FLINK-19517
> URL: https://issues.apache.org/jira/browse/FLINK-19517
> Project: Flink
>  Issue Type: Wish
>Affects Versions: 1.12.0
>Reporter: Kevin Kwon
>Priority: Critical
>
> Currently, table creation from the SQL client, such as the example below, works well:
> {code:sql}
> CREATE TABLE kafkaTable (
>   user_id BIGINT,
>   item_id BIGINT,
>   category_id BIGINT,
>   behavior STRING,
>   ts TIMESTAMP(3)
> ) WITH (
>   'connector' = 'kafka',
>   'topic' = 'user_behavior',
>   'properties.bootstrap.servers' = 'localhost:9092',
>   'properties.group.id' = 'testGroup',
>   'format' = 'avro',
>   'scan.startup.mode' = 'earliest-offset'
> )
> {code}
> However, I would wish for table creation to support Confluent Kafka 
> configuration as well, for example something like:
> {code:sql}
> CREATE TABLE kafkaTable (
>   user_id BIGINT,
>   item_id BIGINT,
>   category_id BIGINT,
>   behavior STRING,
>   ts TIMESTAMP(3)
> ) WITH (
>   'connector' = 'kafka-confluent',
>   'topic' = 'user_behavior',
>   'properties.bootstrap.servers' = 'localhost:9092',
>   'properties.group.id' = 'testGroup',
>   'schema-registry' = 'http://schema-registry.com',
>   'scan.startup.mode' = 'earliest-offset'
> )
> {code}
> If this is enabled, it will be much more convenient to run the on-the-fly 
> queries that business analysts want to test against. For example, business 
> analysts can give some standard DDL to data engineers, and the data engineers 
> can fill in the WITH clause and immediately start executing queries against 
> these tables.
> Additionally, it would be better if we could
>  - specify 'parallelism' within WITH clause to support parallel partition 
> processing
>  - specify custom properties within WITH clause specified in 
> [https://docs.confluent.io/5.4.2/installation/configuration/consumer-configs.html]
>  - have remote access to SQL client in cluster from local environment
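On the custom-properties bullet above: the existing kafka connector already forwards any table option prefixed with 'properties.' to the underlying Kafka client, so consumer configs from the linked Confluent page can typically be passed through directly. A hedged sketch (the pass-through behavior should be verified against the connector docs for your Flink version):

{code:sql}
-- Hedged sketch: arbitrary Kafka consumer configs are supplied by
-- prefixing them with 'properties.'; the two extra keys below are
-- standard Kafka consumer settings, shown for illustration.
CREATE TABLE kafkaTable (
  user_id BIGINT,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'properties.max.poll.records' = '250',
  'properties.fetch.min.bytes' = '1048576',
  'format' = 'avro',
  'scan.startup.mode' = 'earliest-offset'
)
{code}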





[jira] [Created] (FLINK-19517) Support for Confluent Kafka on table creation

2020-10-06 Thread Kevin Kwon (Jira)
Kevin Kwon created FLINK-19517:
--

 Summary: Support for Confluent Kafka on table creation
 Key: FLINK-19517
 URL: https://issues.apache.org/jira/browse/FLINK-19517
 Project: Flink
  Issue Type: Wish
Affects Versions: 1.12.0
Reporter: Kevin Kwon


Currently, table creation from the SQL client, such as the example below, works well:
{code:sql}
CREATE TABLE kafkaTable (
  user_id BIGINT,
  item_id BIGINT,
  category_id BIGINT,
  behavior STRING,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'format' = 'avro',
  'scan.startup.mode' = 'earliest-offset'
)
{code}
However, I would wish for table creation to support Confluent Kafka 
configuration as well, for example something like:
{code:sql}
CREATE TABLE kafkaTable (
  user_id BIGINT,
  item_id BIGINT,
  category_id BIGINT,
  behavior STRING,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'kafka-confluent',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'schema-registry' = 'http://schema-registry.com',
  'scan.startup.mode' = 'earliest-offset'
)
{code}
Additionally, it would be better if we could
 - specify 'parallelism' within WITH clause to support parallel partition 
processing
 - specify custom properties within WITH clause specified in 
[https://docs.confluent.io/5.4.2/installation/configuration/consumer-configs.html]
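The 'parallelism' bullet has no counterpart in the connector options of this era; purely as a hypothetical sketch (the option name 'scan.parallelism' is invented here for illustration; the write side gained a 'sink.parallelism' option in later releases, but no source-side equivalent existed), it might look like:

{code:sql}
-- Hypothetical: 'scan.parallelism' is NOT an existing option here; it
-- illustrates what a per-table source parallelism setting could look
-- like, e.g. one reader per group of Kafka partitions.
CREATE TABLE kafkaTable (
  user_id BIGINT,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'avro',
  'scan.startup.mode' = 'earliest-offset',
  'scan.parallelism' = '4'
)
{code}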





[GitHub] [flink] flinkbot edited a comment on pull request #13351: [FLINK-18990][task] Read channel state sequentially

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13351:
URL: https://github.com/apache/flink/pull/13351#issuecomment-688741492


   
   ## CI report:
   
   * ec6d68e0c67f14994082adb24e4f86dd599fc9f0 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7255)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] lvhuyen commented on pull request #6613: [FLINK-9940] [API/DataStream][File source] out-of-order files were missed in continuous monitoring

2020-10-06 Thread GitBox


lvhuyen commented on pull request #6613:
URL: https://github.com/apache/flink/pull/6613#issuecomment-704578210


   Thanks @guoweiM.
   Please give me a bit more time to arrange my schedule to work on this PR.







[GitHub] [flink] flinkbot edited a comment on pull request #13351: [FLINK-18990][task] Read channel state sequentially

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13351:
URL: https://github.com/apache/flink/pull/13351#issuecomment-688741492


   
   ## CI report:
   
   * 5f4afd19fde91ffe545666daffd8953b126de7bf Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7246)
 
   * ec6d68e0c67f14994082adb24e4f86dd599fc9f0 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7255)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #13351: [FLINK-18990][task] Read channel state sequentially

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13351:
URL: https://github.com/apache/flink/pull/13351#issuecomment-688741492


   
   ## CI report:
   
   * 5f4afd19fde91ffe545666daffd8953b126de7bf Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7246)
 
   * ec6d68e0c67f14994082adb24e4f86dd599fc9f0 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] tillrohrmann commented on a change in pull request #13313: [FLINK-18722][mesos] Migrate MesosResourceManager to the new MesosResourceManagerDriver

2020-10-06 Thread GitBox


tillrohrmann commented on a change in pull request #13313:
URL: https://github.com/apache/flink/pull/13313#discussion_r500423676



##
File path: 
flink-mesos/src/main/java/org/apache/flink/mesos/runtime/clusterframework/MesosResourceManager.java
##
@@ -882,7 +820,7 @@ public void run() {
/**
 * Adapts incoming Akka messages as RPC calls to the resource manager.
 */
-   private class AkkaAdapter extends UntypedAbstractActor {
+   public class AkkaAdapter extends UntypedAbstractActor {

Review comment:
   I'd suggest making this class package-private and moving 
MesosResourceManagerActorFactory into the package of the `MesosResourceManager`.

##
File path: 
flink-mesos/src/main/java/org/apache/flink/mesos/runtime/clusterframework/MesosResourceManagerDriver.java
##
@@ -0,0 +1,713 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.mesos.runtime.clusterframework;
+
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.configuration.GlobalConfiguration;
+import 
org.apache.flink.mesos.runtime.clusterframework.actors.MesosResourceManagerActorFactory;
+import org.apache.flink.mesos.runtime.clusterframework.services.MesosServices;
+import org.apache.flink.mesos.runtime.clusterframework.store.MesosWorkerStore;
+import org.apache.flink.mesos.scheduler.ConnectionMonitor;
+import org.apache.flink.mesos.scheduler.LaunchCoordinator;
+import org.apache.flink.mesos.scheduler.ReconciliationCoordinator;
+import org.apache.flink.mesos.scheduler.TaskMonitor;
+import org.apache.flink.mesos.scheduler.TaskSchedulerBuilder;
+import org.apache.flink.mesos.scheduler.messages.AcceptOffers;
+import org.apache.flink.mesos.scheduler.messages.Disconnected;
+import org.apache.flink.mesos.scheduler.messages.OfferRescinded;
+import org.apache.flink.mesos.scheduler.messages.ReRegistered;
+import org.apache.flink.mesos.scheduler.messages.Registered;
+import org.apache.flink.mesos.scheduler.messages.ResourceOffers;
+import org.apache.flink.mesos.scheduler.messages.StatusUpdate;
+import org.apache.flink.mesos.util.MesosArtifactServer;
+import org.apache.flink.mesos.util.MesosConfiguration;
+import org.apache.flink.runtime.clusterframework.ApplicationStatus;
+import org.apache.flink.runtime.clusterframework.ContainerSpecification;
+import org.apache.flink.runtime.clusterframework.TaskExecutorProcessSpec;
+import org.apache.flink.runtime.clusterframework.types.ResourceID;
+import org.apache.flink.runtime.concurrent.FutureUtils;
+import org.apache.flink.runtime.resourcemanager.WorkerResourceSpec;
+import 
org.apache.flink.runtime.resourcemanager.active.AbstractResourceManagerDriver;
+import org.apache.flink.runtime.resourcemanager.active.ResourceManagerDriver;
+import 
org.apache.flink.runtime.resourcemanager.exceptions.ResourceManagerException;
+import org.apache.flink.util.ExceptionUtils;
+import org.apache.flink.util.Preconditions;
+
+import akka.actor.ActorRef;
+import akka.actor.UntypedAbstractActor;
+import com.netflix.fenzo.TaskRequest;
+import com.netflix.fenzo.TaskScheduler;
+import com.netflix.fenzo.VirtualMachineLease;
+import com.netflix.fenzo.functions.Action1;
+import org.apache.mesos.Protos;
+import org.apache.mesos.Scheduler;
+import org.apache.mesos.SchedulerDriver;
+
+import javax.annotation.Nullable;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.CompletionException;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+import scala.Option;
+import scala.concurrent.duration.FiniteDuration;
+
+/**
+ * Implementation of {@link ResourceManagerDriver} for Mesos deployment.
+ */
+public class MesosResourceManagerDriver extends 
AbstractResourceManagerDriver {
+
+   /** The Mesos configuration (master and framework info). */
+   private final MesosConfiguration mesosConfig;
+
+   /** The Mesos services needed by the resource manager. */
+   

[GitHub] [flink] flinkbot edited a comment on pull request #13544: [FLINK-19309][coordination] Add TaskExecutorManager

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13544:
URL: https://github.com/apache/flink/pull/13544#issuecomment-704178057


   
   ## CI report:
   
   * bbba3f53abb4d4d6c220f8a87fa31c395d5c4f86 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7248)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #13540: [FLINK-19344] Fix DispatcherResourceCleanupTest race condition

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13540:
URL: https://github.com/apache/flink/pull/13540#issuecomment-703484500


   
   ## CI report:
   
   * 8842d1927613cdcd09f4e6c671c9cb1dc671269e Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7247)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #13351: [FLINK-18990][task] Read channel state sequentially

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13351:
URL: https://github.com/apache/flink/pull/13351#issuecomment-688741492


   
   ## CI report:
   
   * 5f4afd19fde91ffe545666daffd8953b126de7bf Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7246)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-19515) Async RequestReply handler concurrency bug

2020-10-06 Thread Igal Shilman (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209119#comment-17209119
 ] 

Igal Shilman commented on FLINK-19515:
--

Hi,
I think you are correct, and it does seem like a bug.

The InvocationContext should be created in __call__ instead of the constructor. 
This also has to happen in the regular RequestReplyHandler.

Would you like to create a PR with the fix?


> Async RequestReply handler concurrency bug
> --
>
> Key: FLINK-19515
> URL: https://issues.apache.org/jira/browse/FLINK-19515
> Project: Flink
>  Issue Type: Bug
>Affects Versions: statefun-2.2.0
>Reporter: Frans King
>Priority: Minor
>
> Async RequestReply handler implemented in 
> https://issues.apache.org/jira/browse/FLINK-18518 has a concurrency problem.
>  
> Lines 151 to 152 of 
> [https://github.com/apache/flink-statefun/blob/master/statefun-python-sdk/statefun/request_reply.py]
> The coroutine is awaiting and may yield. Another coroutine that was 
> yielding may then resume and call ic.complete(), which sets ic.context to None.
>  
> In short:
>  
> {code:java}
> ic.setup(request_bytes)
> await self.handle_invocation(ic)
> return ic.complete()
>  
> {code}
> Needs to happen atomically.
>  
> I worked around this by creating an AsyncRequestReplyHandler for each request.
>  
> It should be possible to reproduce this by putting an await asyncio.sleep(5) 
> in the greeter example and then running it in gunicorn with a single asyncio 
> thread/event loop (-w 1).
>  
>  
>  
> {code:java}
>     response_data = await handler(request_data)
>   File 
> "/home/pi/.local/lib/python3.7/site-packages/statefun/request_reply.py", line 
> 152, in __call__
>     return ic.complete()
>   File 
> "/home/pi/.local/lib/python3.7/site-packages/statefun/request_reply.py", line 
> 57, in complete
>     self.add_mutations(context, invocation_result)
>   File 
> "/home/pi/.local/lib/python3.7/site-packages/statefun/request_reply.py", line 
> 82, in add_mutations
>     for name, handle in context.states.items():
> AttributeError: 'NoneType' object has no attribute 'states'
> {code}
>  
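The per-request workaround described above can be sketched as follows. This is a minimal, self-contained illustration; `InvocationContext` here is a hypothetical stand-in for the SDK class, not its real implementation:

```python
import asyncio

class InvocationContext:
    # Hypothetical stand-in for the SDK's invocation context.
    def __init__(self):
        self.states = None

    def setup(self, request):
        self.states = {"request": request}

    def complete(self):
        result = dict(self.states)
        # A second coroutine sharing this same object would now see None
        # and crash, which is exactly the reported bug.
        self.states = None
        return result

class PerRequestHandler:
    # Workaround sketch: build a fresh context inside __call__ so
    # concurrently running requests never share mutable state.
    async def __call__(self, request):
        ic = InvocationContext()   # per request, not per handler
        ic.setup(request)
        await asyncio.sleep(0)     # yield point, mimicking the await in the report
        return ic.complete()

async def main():
    handler = PerRequestHandler()
    # Run many invocations concurrently; with a shared context this
    # interleaving would trigger the AttributeError shown above.
    return await asyncio.gather(*(handler(i) for i in range(10)))

results = asyncio.run(main())
```

Because each coroutine owns its own context, the setup/await/complete sequence no longer needs to be atomic.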





[GitHub] [flink] flinkbot edited a comment on pull request #13549: [FLINK-18660] Bump flink-shaded to 1.12; Bump netty to 4.1.49

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13549:
URL: https://github.com/apache/flink/pull/13549#issuecomment-704349970


   
   ## CI report:
   
   * aa06dc0c462dcba26088f707a33e3d724df53023 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7245)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-15578) Implement exactly-once JDBC sink

2020-10-06 Thread Kenzyme Le (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209035#comment-17209035
 ] 

Kenzyme Le commented on FLINK-15578:


Thank you for the update [~roman_khachatryan]. I'll look into how I can 
integrate GenericWriteAheadSink into my code base.

Meanwhile, if you or [~chesnay] have any other suggestions or recommendations, 
I'm all ears!

> Implement exactly-once JDBC sink
> 
>
> Key: FLINK-15578
> URL: https://issues.apache.org/jira/browse/FLINK-15578
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / JDBC
>Reporter: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As per discussion in the dev mailing list, there are two options:
>  # Write-ahead log
>  # Two-phase commit (XA)
> the latter being preferable.
>  
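The two-phase commit approach mentioned above can be sketched in miniature: records are staged in a transaction, the transaction is "prepared" when a checkpoint is taken, and it becomes visible only once the checkpoint completes. The names below are illustrative, not the actual connector API:

```python
class TwoPhaseCommitSink:
    """Toy two-phase-commit sink. Writes are staged in an open transaction,
    'prepared' at checkpoint time (phase 1), and made visible only on
    commit (phase 2)."""

    def __init__(self):
        self.committed = []   # visible, durable output
        self.current = []     # open transaction
        self.prepared = {}    # checkpoint_id -> prepared transaction

    def write(self, record):
        self.current.append(record)

    def pre_commit(self, checkpoint_id):
        # Phase 1: persist the transaction so it can be committed or
        # aborted after a failure, then start a fresh one.
        self.prepared[checkpoint_id] = self.current
        self.current = []

    def commit(self, checkpoint_id):
        # Phase 2: runs once the checkpoint is globally complete.
        self.committed.extend(self.prepared.pop(checkpoint_id))

    def abort(self, checkpoint_id):
        # Roll back a prepared transaction after a failed checkpoint.
        self.prepared.pop(checkpoint_id, None)

sink = TwoPhaseCommitSink()
sink.write("a"); sink.write("b")
sink.pre_commit(1)
sink.write("c")    # belongs to the next transaction
sink.commit(1)     # only "a" and "b" become visible
```

With XA, "prepare" and "commit" would be delegated to the database's transaction manager, which is what makes the sink exactly-once end to end.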





[GitHub] [flink] GJL commented on a change in pull request #6464: [FLINK-9936][mesos] Mesos resource manager unable to connect to master after failover

2020-10-06 Thread GitBox


GJL commented on a change in pull request #6464:
URL: https://github.com/apache/flink/pull/6464#discussion_r500526730



##
File path: 
flink-mesos/src/main/java/org/apache/flink/mesos/runtime/clusterframework/MesosResourceManager.java
##
@@ -794,7 +828,7 @@ public void run() {
 
@Override
public void disconnected(SchedulerDriver driver) {
-   runAsync(new Runnable() {
+   runAsyncWithoutFencing(new Runnable() {

Review comment:
   No, I don't remember anymore. Does this line cause problems? To 
understand this line, it could be helpful to change it to `runAsync` and run 
`flink-jepsen`. I am open to discussing it in a call.









[jira] [Updated] (FLINK-19474) Implement a state backend that holds a single key at a time

2020-10-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-19474:
---
Labels: pull-request-available  (was: )

> Implement a state backend that holds a single key at a time
> ---
>
> Key: FLINK-19474
> URL: https://issues.apache.org/jira/browse/FLINK-19474
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>






[GitHub] [flink] flinkbot edited a comment on pull request #13550: [FLINK-19474] Implement a state backend that holds a single key at a time

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13550:
URL: https://github.com/apache/flink/pull/13550#issuecomment-704399024


   
   ## CI report:
   
   * 5e658efd970cd23de2637876e59becdbf00a8921 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7250)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #13529: [FLINK-19473] Implement multi inputs sorting DataInput

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13529:
URL: https://github.com/apache/flink/pull/13529#issuecomment-702193043


   
   ## CI report:
   
   * 055e55a0e4e55637a5896a173939e00e6991abe4 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7240)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-19497) Implement mutator methods for FlinkCounterWrapper

2020-10-06 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209011#comment-17209011
 ] 

Chesnay Schepler commented on FLINK-19497:
--

I would be hesitant, since it is conceivable that it changes behavior if there 
is code out there incrementing the counter from the Dropwizard side without the 
user being aware of it.
It's also not really a bug; the wrappers weren't really intended to be used in 
a general-purpose fashion and mostly exist for our reporter use case. Another 
reminder to only make things public if you really need to :/

Fortunately, it should be straightforward to temporarily use a copy of the 
modified class in your project.

> Implement mutator methods for FlinkCounterWrapper
> -
>
> Key: FLINK-19497
> URL: https://issues.apache.org/jira/browse/FLINK-19497
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics
>Affects Versions: 1.10.2, 1.11.2
>Reporter: Richard Moorhead
>Assignee: Richard Moorhead
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>
> Looking at the dropwizard wrapper classes in flink-metrics-dropwizard, it 
> appears that all of them have mutator methods defined with the exception of 
> FlinkCounterWrapper. We have a use case wherein we mutate counters from a 
> dropwizard context but wish the underlying metrics in the flink registry to 
> be updated.
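The use case above, mutating counters from a Dropwizard context while keeping the underlying Flink metric in sync, amounts to a wrapper whose mutators delegate to the wrapped counter. A minimal sketch follows; the class and method names are illustrative, not Flink's or Dropwizard's actual API:

```python
class Counter:
    """Hypothetical stand-in for the Flink registry's counter."""
    def __init__(self):
        self._count = 0

    def inc(self, n=1):
        self._count += n

    def dec(self, n=1):
        self._count -= n

    def get_count(self):
        return self._count

class CounterWrapper:
    """Wrapper exposing Dropwizard-style mutators that delegate to the
    wrapped counter, so updates from either side stay in sync."""
    def __init__(self, delegate):
        self._delegate = delegate

    def inc(self, n=1):
        self._delegate.inc(n)

    def dec(self, n=1):
        self._delegate.dec(n)

    def get_count(self):
        return self._delegate.get_count()

flink_counter = Counter()
wrapper = CounterWrapper(flink_counter)
wrapper.inc(3)   # mutation through the wrapper...
wrapper.dec()    # ...is reflected in the underlying counter
```

Without the delegating mutators, increments on the wrapper side would be silently lost, which is the gap this issue fixes.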





[GitHub] [flink] flinkbot edited a comment on pull request #13547: [FLINK-14406][runtime] Exposes managed memory usage through the REST API

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13547:
URL: https://github.com/apache/flink/pull/13547#issuecomment-704265372


   
   ## CI report:
   
   * 20c70c0fec1ceb77e60e138bb334bb48f5bcadcf Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7238)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-19497) Implement mutator methods for FlinkCounterWrapper

2020-10-06 Thread Richard Moorhead (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208975#comment-17208975
 ] 

Richard Moorhead commented on FLINK-19497:
--

Is it reasonable to request this be added to a patch release?

> Implement mutator methods for FlinkCounterWrapper
> -
>
> Key: FLINK-19497
> URL: https://issues.apache.org/jira/browse/FLINK-19497
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics
>Affects Versions: 1.10.2, 1.11.2
>Reporter: Richard Moorhead
>Assignee: Richard Moorhead
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>
> Looking at the dropwizard wrapper classes in flink-metrics-dropwizard, it 
> appears that all of them have mutator methods defined with the exception of 
> FlinkCounterWrapper. We have a use case wherein we mutate counters from a 
> dropwizard context but wish the underlying metrics in the flink registry to 
> be updated.





[GitHub] [flink] asfgit merged pull request #13548: [FLINK-19441][network] Avoid loading of ResultPartition wrapper class for consumable notifications when possible.

2020-10-06 Thread GitBox


asfgit merged pull request #13548:
URL: https://github.com/apache/flink/pull/13548


   







[jira] [Created] (FLINK-19516) PerJobMiniClusterFactoryTest. testJobClient()

2020-10-06 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-19516:


 Summary: PerJobMiniClusterFactoryTest. testJobClient()
 Key: FLINK-19516
 URL: https://issues.apache.org/jira/browse/FLINK-19516
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.12.0
Reporter: Stephan Ewen


*Log:*
https://dev.azure.com/sewen0794/19b23adf-d190-4fb4-ae6e-2e92b08923a3/_apis/build/builds/151/logs/137

*Exception:*
{code}
[ERROR] 
testJobClient(org.apache.flink.client.program.PerJobMiniClusterFactoryTest)  
Time elapsed: 0.392 s  <<< FAILURE!
java.lang.AssertionError: 

Expected: is 
 but: was 
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
at 
org.apache.flink.client.program.PerJobMiniClusterFactoryTest.assertThatMiniClusterIsShutdown(PerJobMiniClusterFactoryTest.java:161)
at 
org.apache.flink.client.program.PerJobMiniClusterFactoryTest.testJobClient(PerJobMiniClusterFactoryTest.java:93)
{code}





[GitHub] [flink] flinkbot edited a comment on pull request #13548: [FLINK-19441][network] Avoid loading of ResultPartition wrapper class for consumable notifications when possible.

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13548:
URL: https://github.com/apache/flink/pull/13548#issuecomment-704335711


   
   ## CI report:
   
   * 3e996d9860a1b1f5431a2c3abee8f1c41b7bbd00 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7243)
 
   * d848698ca39ad0d2e6cca340ff183b5aeb00fdf9 UNKNOWN
   * 16fb79efc9e2e70e0ee3f99d109d11b812f308f7 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7249)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #13550: [FLINK-19474] Implement a state backend that holds a single key at a time

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13550:
URL: https://github.com/apache/flink/pull/13550#issuecomment-704399024


   
   ## CI report:
   
   * 5e658efd970cd23de2637876e59becdbf00a8921 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7250)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #13540: [FLINK-19344] Fix DispatcherResourceCleanupTest race condition

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13540:
URL: https://github.com/apache/flink/pull/13540#issuecomment-703484500


   
   ## CI report:
   
   * 6650579aa244c53d76237be84b861224c451b4ba Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7199)
 
   * 8842d1927613cdcd09f4e6c671c9cb1dc671269e Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7247)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] tillrohrmann commented on a change in pull request #6464: [FLINK-9936][mesos] Mesos resource manager unable to connect to master after failover

2020-10-06 Thread GitBox


tillrohrmann commented on a change in pull request #6464:
URL: https://github.com/apache/flink/pull/6464#discussion_r500456079



##
File path: 
flink-mesos/src/main/java/org/apache/flink/mesos/runtime/clusterframework/MesosResourceManager.java
##
@@ -794,7 +828,7 @@ public void run() {
 
@Override
public void disconnected(SchedulerDriver driver) {
-   runAsync(new Runnable() {
+   runAsyncWithoutFencing(new Runnable() {

Review comment:
   @GJL do you still remember why this message needs to be run w/o fencing?









[GitHub] [flink] flinkbot commented on pull request #13550: [FLINK-19474] Implement a state backend that holds a single key at a time

2020-10-06 Thread GitBox


flinkbot commented on pull request #13550:
URL: https://github.com/apache/flink/pull/13550#issuecomment-704399024


   
   ## CI report:
   
   * 5e658efd970cd23de2637876e59becdbf00a8921 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #13529: [FLINK-19473] Implement multi inputs sorting DataInput

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13529:
URL: https://github.com/apache/flink/pull/13529#issuecomment-702193043


   
   ## CI report:
   
   * d91430e1c66451806b07692731574192c09eca89 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7237)
 
   * 055e55a0e4e55637a5896a173939e00e6991abe4 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7240)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] dawidwys opened a new pull request #13550: [FLINK-19474] Implement a state backend that holds a single key at a time

2020-10-06 Thread GitBox


dawidwys opened a new pull request #13550:
URL: https://github.com/apache/flink/pull/13550


   ## What is the purpose of the change
   
   This commit introduces a SingleKeyStateBackend. This state backend is a
   simplified version of a state backend that can be used in a BATCH
   runtime mode. It requires the input to be sorted, as it only ever
   remembers the current key. If the key changes, the current state is
   discarded. Moreover, this state backend does not support checkpointing.
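The behavior described above, keeping state only for the currently processed key and discarding it on a key change, can be sketched as follows. This is a conceptual illustration under assumed names, not the actual backend implementation:

```python
class SingleKeyState:
    """Sketch of a single-key state backend for sorted (BATCH-mode) input:
    only the state for the current key is held; switching keys discards it."""

    def __init__(self):
        self._key = None
        self._state = {}

    def set_current_key(self, key):
        if key != self._key:
            # Input is sorted by key, so the previous key will never be
            # seen again and its state can be dropped for good.
            self._state = {}
            self._key = key

    def put(self, name, value):
        self._state[name] = value

    def get(self, name):
        return self._state.get(name)

backend = SingleKeyState()
backend.set_current_key("a")
backend.put("count", 1)
before_switch = backend.get("count")
backend.set_current_key("b")   # state for "a" is discarded here
```

Holding one key's state at a time keeps memory bounded regardless of key cardinality, which is what makes this viable for batch execution.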
   
   
   ## Verifying this change
   
   This change added tests:
   * 
org.apache.flink.streaming.api.operators.sorted.state.SingleKeyStateBackendVerificationTest
   * 
org.apache.flink.streaming.api.operators.sorted.state.SingleKeyStateBackendTest
   * 
org.apache.flink.streaming.api.operators.sorted.state.SingleKeyKeyGroupedInternalPriorityQueueTest
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (**yes** / no 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / 
don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
**JavaDocs** / not documented)
   







[GitHub] [flink] flinkbot edited a comment on pull request #13544: [FLINK-19309][coordination] Add TaskExecutorManager

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13544:
URL: https://github.com/apache/flink/pull/13544#issuecomment-704178057











[GitHub] [flink] flinkbot edited a comment on pull request #13351: [FLINK-18990][task] Read channel state sequentially

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13351:
URL: https://github.com/apache/flink/pull/13351#issuecomment-688741492


   
   ## CI report:
   
   * 3e38dfe739fa34ed301dc2c0303086568f138d57 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7213)
 
   * 5f4afd19fde91ffe545666daffd8953b126de7bf Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7246)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot commented on pull request #13550: [FLINK-19474] Implement a state backend that holds a single key at a time

2020-10-06 Thread GitBox


flinkbot commented on pull request #13550:
URL: https://github.com/apache/flink/pull/13550#issuecomment-704387404


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 5e658efd970cd23de2637876e59becdbf00a8921 (Tue Oct 06 
16:10:54 UTC 2020)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   







[GitHub] [flink] zentol commented on a change in pull request #13544: [FLINK-19309][coordination] Add TaskExecutorManager

2020-10-06 Thread GitBox


zentol commented on a change in pull request #13544:
URL: https://github.com/apache/flink/pull/13544#discussion_r500420727



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/TaskExecutorManager.java
##
@@ -0,0 +1,440 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.resourcemanager.slotmanager;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.api.common.time.Time;
+import org.apache.flink.runtime.clusterframework.types.ResourceProfile;
+import org.apache.flink.runtime.clusterframework.types.SlotID;
+import org.apache.flink.runtime.concurrent.ScheduledExecutor;
+import org.apache.flink.runtime.instance.InstanceID;
+import org.apache.flink.runtime.resourcemanager.WorkerResourceSpec;
+import 
org.apache.flink.runtime.resourcemanager.registration.TaskExecutorConnection;
+import org.apache.flink.runtime.slots.ResourceRequirement;
+import org.apache.flink.runtime.taskexecutor.SlotReport;
+import org.apache.flink.runtime.taskexecutor.SlotStatus;
+import org.apache.flink.util.FlinkException;
+import org.apache.flink.util.MathUtils;
+import org.apache.flink.util.Preconditions;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+import java.util.concurrent.Executor;
+import java.util.concurrent.ScheduledFuture;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+import java.util.stream.StreamSupport;
+
+/**
+ * SlotManager component for all task executor related things.
+ *
+ * Dev note: This component only exists to keep the code out of the slot 
manager.
+ * It covers many aspects that aren't really the responsibility of the slot 
manager, and should be refactored to live
+ * outside the slot manager and split into multiple parts.
+ */
+class TaskExecutorManager implements AutoCloseable {
+   private static final Logger LOG = 
LoggerFactory.getLogger(TaskExecutorManager.class);
+
+   private final ResourceProfile defaultSlotResourceProfile;
+
+   /** The default resource spec of workers to request. */
+   private final WorkerResourceSpec defaultWorkerResourceSpec;

Review comment:
    They are used for allocating resources. It wouldn't necessarily need them, 
as they could be injected by the `ResourceActions`, but I did not want to do 
such a refactoring.









[GitHub] [flink] zentol edited a comment on pull request #13544: [FLINK-19309][coordination] Add TaskExecutorManager

2020-10-06 Thread GitBox


zentol edited a comment on pull request #13544:
URL: https://github.com/apache/flink/pull/13544#issuecomment-704383090











[GitHub] [flink] zentol commented on pull request #13544: [FLINK-19309][coordination] Add TaskExecutorManager

2020-10-06 Thread GitBox


zentol commented on pull request #13544:
URL: https://github.com/apache/flink/pull/13544#issuecomment-704383090


   The TEM contains literally all logic that is specific to task executors. 
That was the entire thought process.
   
   Should the SlotManager (SM) handle pending slots? Maybe. But I could also 
envision a worker allocator that queries the SM for required resources and 
allocates workers as necessary, fulfilling all current conditions that exist. 
That would then mean the SM only ever works with actually existing slots via a 
single code path (the registration of a TM / its slots).
   The SM does not necessarily need to know about pending slots; it already 
doesn't know about them in case of external systems starting task executors 
anyway. The only difference in behavior is how many notifications the JM 
receives, which shouldn't be a problem due to the proposed stabilization 
protocol.
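
The alternative design envisioned in this comment — a separate allocator that polls the SlotManager for unfulfilled requirements — could look roughly like this. This is a hypothetical Python sketch; all names are invented and do not correspond to Flink's actual Java classes:

```python
class SlotManager:
    """Hypothetical stand-in: tracks how many slots are required vs. registered."""

    def __init__(self):
        self.required = 0
        self.registered = 0

    def missing_resources(self):
        # Resources the SM needs but that no existing task executor provides.
        return max(0, self.required - self.registered)


class WorkerAllocator:
    """Polls the SlotManager for unfulfilled requirements and requests new
    workers, so the SlotManager itself never has to track pending slots."""

    def __init__(self, slot_manager, request_worker):
        self.slot_manager = slot_manager
        self.request_worker = request_worker  # callback that starts a task executor
        self.pending_workers = 0  # workers requested but not yet registered

    def reconcile(self):
        missing = self.slot_manager.missing_resources() - self.pending_workers
        for _ in range(missing):
            self.request_worker()
            self.pending_workers += 1
```

With this split, the SM only ever sees slots that actually exist; pending capacity lives entirely in the allocator.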







[jira] [Commented] (FLINK-17259) Have scala 2.12 support

2020-10-06 Thread Seth Wiesman (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208857#comment-17208857
 ] 

Seth Wiesman commented on FLINK-17259:
--

StateFun 2.2 dropped scala 2.11 support in favor of 2.12. Closing this.

> Have scala 2.12 support
> ---
>
> Key: FLINK-17259
> URL: https://issues.apache.org/jira/browse/FLINK-17259
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Affects Versions: statefun-2.0.0
>Reporter: João Boto
>Priority: Major
>
> In statefun-flink the scala.binary.version is defined as 2.11,
> which forces the use of scala 2.11.
>  
> Should the default be 2.12, or should there be an option to choose the scala version?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-17259) Have scala 2.12 support

2020-10-06 Thread Seth Wiesman (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Seth Wiesman closed FLINK-17259.

Resolution: Duplicate

> Have scala 2.12 support
> ---
>
> Key: FLINK-17259
> URL: https://issues.apache.org/jira/browse/FLINK-17259
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Affects Versions: statefun-2.0.0
>Reporter: João Boto
>Priority: Major
>
> In statefun-flink the scala.binary.version is defined as 2.11,
> which forces the use of scala 2.11.
>  
> Should the default be 2.12, or should there be an option to choose the scala version?





[GitHub] [flink] flinkbot edited a comment on pull request #13548: [FLINK-19441][network] Avoid loading of ResultPartition wrapper class for consumable notifications when possible.

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13548:
URL: https://github.com/apache/flink/pull/13548#issuecomment-704335711


   
   ## CI report:
   
   * 3e996d9860a1b1f5431a2c3abee8f1c41b7bbd00 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7243)
 
   * d848698ca39ad0d2e6cca340ff183b5aeb00fdf9 UNKNOWN
   * 16fb79efc9e2e70e0ee3f99d109d11b812f308f7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #13549: [FLINK-18660] Bump flink-shaded to 1.12; Bump netty to 4.1.49

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13549:
URL: https://github.com/apache/flink/pull/13549#issuecomment-704349970


   
   ## CI report:
   
   * aa06dc0c462dcba26088f707a33e3d724df53023 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7245)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #13540: [FLINK-19344] Fix DispatcherResourceCleanupTest race condition

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13540:
URL: https://github.com/apache/flink/pull/13540#issuecomment-703484500


   
   ## CI report:
   
   * 6650579aa244c53d76237be84b861224c451b4ba Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7199)
 
   * 8842d1927613cdcd09f4e6c671c9cb1dc671269e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #13351: [FLINK-18990][task] Read channel state sequentially

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13351:
URL: https://github.com/apache/flink/pull/13351#issuecomment-688741492


   
   ## CI report:
   
   * 3e38dfe739fa34ed301dc2c0303086568f138d57 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7213)
 
   * 5f4afd19fde91ffe545666daffd8953b126de7bf UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-19481) Add support for a flink native GCS FileSystem

2020-10-06 Thread Ben Augarten (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208844#comment-17208844
 ] 

Ben Augarten commented on FLINK-19481:
--

[~yunta] yes, that's right. Checkpointing on GCS via the hadoop filesystem 
currently works well, as far as I'm aware.

> Add support for a flink native GCS FileSystem
> -
>
> Key: FLINK-19481
> URL: https://issues.apache.org/jira/browse/FLINK-19481
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem, FileSystems
>Affects Versions: 1.12.0
>Reporter: Ben Augarten
>Priority: Major
> Fix For: 1.12.0
>
>
> Currently, GCS is supported but only by using the hadoop connector[1]
>  
> The objective of this improvement is to add support for checkpointing to 
> Google Cloud Storage with the Flink File System.
>  
> This would allow the `gs://` scheme to be used for savepointing and 
> checkpointing. Long term, it would be nice if we could use the GCS FileSystem 
> as a source and sink in flink jobs as well. 
>  
> Long term, I hope that implementing a flink native GCS FileSystem will 
> simplify usage of GCS because the hadoop FileSystem ends up bringing in many 
> unshaded dependencies.
>  
> [1] 
> [https://github.com/GoogleCloudDataproc/hadoop-connectors|https://github.com/GoogleCloudDataproc/hadoop-connectors]





[GitHub] [flink] StephanEwen commented on pull request #13548: [FLINK-19441][network] Avoid loading of ResultPartition wrapper class for consumable notifications when possible.

2020-10-06 Thread GitBox


StephanEwen commented on pull request #13548:
URL: https://github.com/apache/flink/pull/13548#issuecomment-704351300


   Adjusted the code to make the role of the utility class more clear.







[GitHub] [flink] flinkbot commented on pull request #13549: [FLINK-18660] Bump flink-shaded to 1.12; Bump netty to 4.1.49

2020-10-06 Thread GitBox


flinkbot commented on pull request #13549:
URL: https://github.com/apache/flink/pull/13549#issuecomment-704349970


   
   ## CI report:
   
   * aa06dc0c462dcba26088f707a33e3d724df53023 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #13548: [FLINK-19441][network] Avoid loading of ResultPartition wrapper class for consumable notifications when possible.

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13548:
URL: https://github.com/apache/flink/pull/13548#issuecomment-704335711


   
   ## CI report:
   
   * 3e996d9860a1b1f5431a2c3abee8f1c41b7bbd00 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7243)
 
   * d848698ca39ad0d2e6cca340ff183b5aeb00fdf9 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Updated] (FLINK-19515) Async RequestReply handler concurrency bug

2020-10-06 Thread Frans King (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frans King updated FLINK-19515:
---
Description: 
Async RequestReply handler implemented in 
https://issues.apache.org/jira/browse/FLINK-18518 has a concurrency problem.

 

Lines 151 to 152 of 
[https://github.com/apache/flink-statefun/blob/master/statefun-python-sdk/statefun/request_reply.py]

The coro is awaiting and may yield.  Another coro may continue that was 
yielding and call ic.complete() which sets the ic.context to None

 

In short:

 
{code:java}
ic.setup(request_bytes)
await self.handle_invocation(ic)
return ic.complete()
 
{code}
Needs to happen atomically.

 

I worked around this by creating an AsyncRequestReplyHandler for each request.

 

It should be possible to re-produce this by putting an await asyncio.sleep(5) 
in the greeter example and then run in gunicorn with a single asyncio 
thread/event loop (-w 1).  

 

 

 
{code:java}
    response_data = await handler(request_data)
  File "/home/pi/.local/lib/python3.7/site-packages/statefun/request_reply.py", 
line 152, in __call__
    return ic.complete()
  File "/home/pi/.local/lib/python3.7/site-packages/statefun/request_reply.py", 
line 57, in complete
    self.add_mutations(context, invocation_result)
  File "/home/pi/.local/lib/python3.7/site-packages/statefun/request_reply.py", 
line 82, in add_mutations
    for name, handle in context.states.items():
AttributeError: 'NoneType' object has no attribute 'states'
{code}
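
The race described above can be reproduced with a simplified, self-contained analog of the handler. This is not the statefun SDK code: the class below is an invented stand-in that uses a plain dict instead of the SDK's context object, so the shared-state failure surfaces as a TypeError rather than the AttributeError shown in the traceback above:

```python
import asyncio

class InvocationContext:
    """Simplified stand-in for the SDK's shared invocation context."""

    def __init__(self):
        self.context = None

    def setup(self):
        self.context = {"states": {}}

    def complete(self):
        # Fails if another coroutine already completed and cleared the context.
        states = self.context["states"]
        self.context = None
        return states

ic = InvocationContext()  # one shared instance, as with a single shared handler

async def handle(delay):
    ic.setup()
    await asyncio.sleep(delay)   # the handler may yield here...
    return ic.complete()         # ...and find the context cleared by another coro

async def main():
    # Two overlapping invocations: the faster one finishes first and sets
    # ic.context to None while the slower one is still awaiting.
    return await asyncio.gather(handle(0.05), handle(0.01),
                                return_exceptions=True)

results = asyncio.run(main())
```

Making setup/handle/complete effectively atomic per request — for example by giving each request its own handler instance, as the reporter's workaround does — removes the shared mutable state.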
 


> Async RequestReply handler concurrency bug
> --
>
> Key: FLINK-19515
> URL: https://issues.apache.org/jira/browse/FLINK-19515
> Project: Flink
>  Issue Type: Bug
>Affects Versions: statefun-2.2.0
>Reporter: Frans King
>Priority: Minor
>
> Async RequestReply handler implemented in 
> https://issues.apache.org/jira/browse/FLINK-18518 has a concurrency problem.
>  
> Lines 151 to 152 of 
> [https://github.com/apache/flink-statefun/blob/master/statefun-python-sdk/statefun/request_reply.py]
> The coro is awaiting and may yield.  Another coro may continue that was 
> yielding and call ic.complete() which sets the ic.context to None
>  
> In short:
>  
> {code:java}
> ic.setup(request_bytes)
> await self.handle_invocation(ic)
> return ic.complete()
>  
> {code}
> Needs to happen atomically.
>  
> I worked around this by creating an AsyncRequestReplyHandler for each request.
>  
> It should be possible to re-produce this by putting an await asyncio.sleep(5) 
> in the greeter example and then run in gunicorn with a single asyncio 
> thread/event loop (-w 1).  
>  
>  
>  
> {code:java}
>     response_data = await handler(request_data)
>   File 
> "/home/pi/.local/lib/python3.7/site-packages/statefun/request_reply.py", line 
> 152, in __call__
>     return ic.complete()
>   File 
> "/home/pi/.local/lib/python3.7/site-packages/statefun/request_reply.py", line 
> 57, in complete
>     self.add_mutations(context, invocation_result)
>   File 
> "/home/pi/.local/lib/python3.7/site-packages/statefun/request_reply.py", line 
> 82, in add_mutations
>     for name, handle in context.states.items():
> AttributeError: 'NoneType' object has no attribute 'states'
> {code}
>  





[jira] [Comment Edited] (FLINK-19441) Performance regression on 24.09.2020

2020-10-06 Thread Stephan Ewen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208828#comment-17208828
 ] 

Stephan Ewen edited comment on FLINK-19441 at 10/6/20, 3:19 PM:


 I could not find anything very suspicious in the code. In general, the code 
does look less complex now than before, does not have any new locks or volatile 
variable accesses, and seems to have the same number of virtual method calls:
 * RecordWriter.emit(T record) (bi-morphic between 
{{ChannelSelectorRecordWriter}} and {{BroadcastRecordWriter}})
 * The record serialization (to byte buffer) is unchanged
 * Because the record-to-buffer logic is in the {{ResultPartition}} now, there 
is one method on ResultPartition that is called per record, rather than per 
buffer. That method is not virtual, though; there is only one active 
implementation.

One thing we can investigate is the 
{{ConsumableNotifyingResultPartitionWriterDecorator}}. The class is always 
loaded, but never instantiated (only a static method is used). That means in 
CHA the JIT will find two implementations of the last method mentioned above, 
but all call sites invoke the same one, so any profiling logic should be able 
to cut this out again.

Still, it is trivial to change the code so that the 
{{ConsumableNotifyingResultPartitionWriterDecorator}} code is never loaded.
 Here is the PR for that: [https://github.com/apache/flink/pull/13548]
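
The idea behind that PR — keep the notifying wrapper entirely off the common path so it is never even touched unless notifications are required — can be illustrated with a hypothetical sketch (Python, invented names; on the JVM the benefit is about class loading and the JIT's class-hierarchy analysis, which Python does not model):

```python
class ResultPartition:
    """Stand-in for the partition on the hot per-record path."""

    def emit(self, record):
        return record


class NotifyingDecorator:
    """Wrapper that fires a consumable notification around emits. On the JVM,
    keeping a class like this off the common path (never instantiated, ideally
    never loaded) leaves the partition's emit call monomorphic for the JIT."""

    def __init__(self, delegate, notify):
        self.delegate = delegate
        self.notify = notify

    def emit(self, record):
        self.notify(record)
        return self.delegate.emit(record)


def create_partition(notify=None):
    partition = ResultPartition()
    if notify is None:
        return partition  # common path: the decorator is never involved
    return NotifyingDecorator(partition, notify)
```

The factory only references the decorator on the notification-enabled path, so the common case pays nothing for the wrapper.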



> Performance regression on 24.09.2020
> 
>
> Key: FLINK-19441
> URL: https://issues.apache.org/jira/browse/FLINK-19441
> Project: Flink
>  Issue Type: Bug
>Reporter: Arvid Heise
>Assignee: Stephan Ewen
>Priority: Blocker
>  Labels: pull-request-available
>
> A couple of benchmarks are showing a small performance regression on 
> 24.09.2020:
> http://codespeed.dak8s.net:8000/timeline/?ben=globalWindow=2
> http://codespeed.dak8s.net:8000/timeline/?ben=tupleKeyBy=2 (?)





[jira] [Commented] (FLINK-18518) Add Async RequestReply handler for the Python SDK

2020-10-06 Thread Frans King (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208836#comment-17208836
 ] 

Frans King commented on FLINK-18518:


I think there might be a concurrency bug with the implementation of 
AsyncRequestReplyHandler - https://issues.apache.org/jira/browse/FLINK-19515

 

ic.setup()  <- sets ic.context to some value

await

ic.complete()  <- sets ic.context to None

As a result, ic.context can be None depending on how the coros yield/awaken, 
resulting in:

line 57, in complete
    self.add_mutations(context, invocation_result)
  File "/home/pi/.local/lib/python3.7/site-packages/statefun/request_reply.py", 
line 82, in add_mutations
    for name, handle in context.states.items():
AttributeError: 'NoneType' object has no attribute 'states'

 

 

> Add Async RequestReply handler for the Python SDK
> -
>
> Key: FLINK-18518
> URL: https://issues.apache.org/jira/browse/FLINK-18518
> Project: Flink
>  Issue Type: New Feature
>  Components: Stateful Functions
>Affects Versions: statefun-2.1.0
>Reporter: Igal Shilman
>Assignee: Igal Shilman
>Priority: Major
>  Labels: beginner-friendly, pull-request-available
> Fix For: statefun-2.2.0
>
>
> I/O bound stateful functions can benefit from the built-in async/io support 
> in Python, but the 
> RequestReply handler is not an async-io compatible.  See 
> [this|https://stackoverflow.com/questions/62640283/flink-stateful-functions-async-calls-with-the-python-sdk]
>  question on stackoverflow.
>  
> Having an asyncio compatible handler will open the door to the usage of 
> aiohttp for example:
>  
> {code:java}
> import aiohttp
> import asyncio
> ...
> async def fetch(session, url):
> async with session.get(url) as response:
> return await response.text()
> @function.bind("example/hello")
> async def hello(context, message):
> async with aiohttp.ClientSession() as session:
> html = await fetch(session, 'http://python.org')
> context.pack_and_reply(SomeProtobufMessage(html))
> from aiohttp import webhandler 
> handler = AsyncRequestReplyHandler(functions)
> async def handle(request):
> req = await request.read()
> res = await handler(req)
> return web.Response(body=res, content_type="application/octet-stream'")
> app = web.Application()
> app.add_routes([web.post('/statefun', handle)])
> if __name__ == '__main__':
> web.run_app(app, port=5000)
>  {code}
>  





[jira] [Updated] (FLINK-19507) Native support for Apache Pulsar

2020-10-06 Thread John Morrow (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Morrow updated FLINK-19507:

Description: 
Flink StateFun has a sink/source IO module which can be used to plug in 
existing, or custom, Flink connectors that are not already integrated into a 
dedicated I/O module 
([https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.2/io-module/flink-connectors.html|https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.2/io-module/flink-connectors.html]).

Currently there are dedicated I/O modules for Apache Kafka and AWS Kinesis. It 
would be nice to add a similar one for Apache Pulsar.

 


> Native support for Apache Pulsar
> 
>
> Key: FLINK-19507
> URL: https://issues.apache.org/jira/browse/FLINK-19507
> Project: Flink
>  Issue Type: Wish
>  Components: Stateful Functions
>Affects Versions: statefun-2.2.0
>Reporter: John Morrow
>Priority: Major
>
> Flink StateFun has a sink/source IO module which can be used to plug in 
> existing, or custom, Flink connectors that are not already integrated into a 
> dedicated I/O module 
> ([https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.2/io-module/flink-connectors.html|https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.2/io-module/flink-connectors.html]).
> Currently there are dedicated I/O modules for Apache Kafka and AWS Kinesis. 
> It would be nice to add a similar one for Apache Pulsar.
>  





[GitHub] [flink] flinkbot commented on pull request #13549: [FLINK-18660] Bump flink-shaded to 1.12; Bump netty to 4.1.49

2020-10-06 Thread GitBox


flinkbot commented on pull request #13549:
URL: https://github.com/apache/flink/pull/13549#issuecomment-704342067


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit aa06dc0c462dcba26088f707a33e3d724df53023 (Tue Oct 06 
15:15:36 UTC 2020)
   
   **Warnings:**
* **1 pom.xml file was touched**: Check for build and licensing issues.
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rmetzger opened a new pull request #13549: [FLINK-18660] Bump flink-shaded to 1.12; Bump netty to 4.1.49

2020-10-06 Thread GitBox


rmetzger opened a new pull request #13549:
URL: https://github.com/apache/flink/pull/13549


   Bump netty to 4.1.49 for some security fixes.
   This also entails bumping netty-tcnative to 2.30.0







[jira] [Created] (FLINK-19515) Async RequestReply handler concurrency bug

2020-10-06 Thread Frans King (Jira)
Frans King created FLINK-19515:
--

 Summary: Async RequestReply handler concurrency bug
 Key: FLINK-19515
 URL: https://issues.apache.org/jira/browse/FLINK-19515
 Project: Flink
  Issue Type: Bug
Affects Versions: statefun-2.2.0
Reporter: Frans King


Async RequestReply handler implemented in 
https://issues.apache.org/jira/browse/FLINK-18518 has a concurrency problem.

 

Lines 151 to 152 of 
[https://github.com/apache/flink-statefun/blob/master/statefun-python-sdk/statefun/request_reply.py]

The coroutine is awaiting and may yield. Another coroutine that was yielding 
may then resume and call ic.complete(), which sets ic.context to None.

 

In short:

 
{code:python}
ic.setup(request_bytes)
await self.handle_invocation(ic)
return ic.complete()
 
{code}
Needs to happen atomically.

 

I worked around this by creating an AsyncRequestReplyHandler for each request.
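The race can be made concrete with a small asyncio sketch (hypothetical code, not the actual SDK): {{SharedHandler}} mimics a handler that keeps per-request state on one shared instance, and an asyncio.Lock serializes the setup/handle/complete section so it behaves atomically; creating a fresh handler per request, as done in the workaround above, sidesteps the shared state entirely.

```python
import asyncio


class SharedHandler:
    """Mimics a handler whose per-request state lives on one shared instance."""

    def __init__(self):
        self.context = None          # plays the role of ic.context
        self._lock = asyncio.Lock()

    async def call_unsafe(self, request):
        self.context = {"request": request}  # ic.setup(request_bytes)
        await asyncio.sleep(0)               # handler awaits -> yields to event loop
        result = self.context["request"]     # may read another request's context
        self.context = None                  # ic.complete() clears the shared state
        return result

    async def call_safe(self, request):
        # Holding a lock across setup/handle/complete makes the section
        # atomic with respect to other requests on the same event loop.
        async with self._lock:
            self.context = {"request": request}
            await asyncio.sleep(0)
            result = self.context["request"]
            self.context = None
            return result


async def demo():
    handler = SharedHandler()
    unsafe = await asyncio.gather(
        *(handler.call_unsafe(i) for i in range(3)), return_exceptions=True)
    safe = await asyncio.gather(*(handler.call_safe(i) for i in range(3)))
    return unsafe, safe


unsafe_results, safe_results = asyncio.run(demo())
```

With several concurrent requests the unsafe variant fails with a NoneType error analogous to the trace in this report, while the locked variant returns each request's own value.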

 

It should be possible to reproduce this by putting an await asyncio.sleep(5) 
in the greeter example and then running it under gunicorn with a single asyncio 
thread/event loop.

 

 

 
{code:python}
    response_data = await handler(request_data)
  File "/home/pi/.local/lib/python3.7/site-packages/statefun/request_reply.py", 
line 152, in __call__
    return ic.complete()
  File "/home/pi/.local/lib/python3.7/site-packages/statefun/request_reply.py", 
line 57, in complete
    self.add_mutations(context, invocation_result)
  File "/home/pi/.local/lib/python3.7/site-packages/statefun/request_reply.py", 
line 82, in add_mutations
    for name, handle in context.states.items():
AttributeError: 'NoneType' object has no attribute 'states'
{code}
 





[GitHub] [flink] fsk119 commented on pull request #13449: [FLINK-19282][table sql/planner]Supports watermark push down with Wat…

2020-10-06 Thread GitBox


fsk119 commented on pull request #13449:
URL: https://github.com/apache/flink/pull/13449#issuecomment-704336548


   Thanks for your reply. Looking forward to your suggestions! 







[GitHub] [flink] flinkbot commented on pull request #13548: [FLINK-19441][network] Avoid loading of ResultPartition wrapper class for consumable notifications when possible.

2020-10-06 Thread GitBox


flinkbot commented on pull request #13548:
URL: https://github.com/apache/flink/pull/13548#issuecomment-704335711


   
   ## CI report:
   
   * 3e996d9860a1b1f5431a2c3abee8f1c41b7bbd00 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-19501) Missing state in enum type in rest_api_v1.snapshot

2020-10-06 Thread goutham (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208833#comment-17208833
 ] 

goutham commented on FLINK-19501:
-

[~chesnay] sure I can work on it. 

> Missing state in enum type in rest_api_v1.snapshot
> --
>
> Key: FLINK-19501
> URL: https://issues.apache.org/jira/browse/FLINK-19501
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation, Runtime / REST
>Affects Versions: 1.11.2
>Reporter: goutham
>Priority: Minor
>  Labels: pull-request-available
>
> The INITIALIZING state is missing from one of the enums in the REST API snapshot document 
>  
> "enum" : [ "INITIALIZING", "CREATED", "RUNNING", "FAILING", "FAILED", 
> "CANCELLING", "CANCELED", "FINISHED", "RESTARTING", "SUSPENDED", 
> "RECONCILING" ]





[jira] [Created] (FLINK-19514) ZooKeeperLeaderElectionITCase.testJobExecutionOnClusterWithLeaderChange times out

2020-10-06 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-19514:


 Summary: 
ZooKeeperLeaderElectionITCase.testJobExecutionOnClusterWithLeaderChange times 
out
 Key: FLINK-19514
 URL: https://issues.apache.org/jira/browse/FLINK-19514
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.12.0
Reporter: Stephan Ewen


Full logs:
https://dev.azure.com/sewen0794/19b23adf-d190-4fb4-ae6e-2e92b08923a3/_apis/build/builds/148/logs/115

Exception:
{code}
[ERROR] 
testJobExecutionOnClusterWithLeaderChange(org.apache.flink.test.runtime.leaderelection.ZooKeeperLeaderElectionITCase)
  Time elapsed: 301.093 s  <<< ERROR!
java.util.concurrent.TimeoutException: Condition was not met in given timeout.
at 
org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:132)
at 
org.apache.flink.test.runtime.leaderelection.ZooKeeperLeaderElectionITCase.getNextLeadingDispatcherGateway(ZooKeeperLeaderElectionITCase.java:140)
at 
org.apache.flink.test.runtime.leaderelection.ZooKeeperLeaderElectionITCase.testJobExecutionOnClusterWithLeaderChange(ZooKeeperLeaderElectionITCase.java:122)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)

{code}





[GitHub] [flink] flinkbot commented on pull request #13548: [FLINK-19441][network] Avoid loading of ResultPartition wrapper class for consumable notifications when possible.

2020-10-06 Thread GitBox


flinkbot commented on pull request #13548:
URL: https://github.com/apache/flink/pull/13548#issuecomment-704328667


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 3e996d9860a1b1f5431a2c3abee8f1c41b7bbd00 (Tue Oct 06 
14:58:11 UTC 2020)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   







[jira] [Commented] (FLINK-19441) Performance regression on 24.09.2020

2020-10-06 Thread Stephan Ewen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208828#comment-17208828
 ] 

Stephan Ewen commented on FLINK-19441:
--

I could not find anything very suspicious in the code. In general, the code 
looks less complex now than before, does not have any new locks or volatile 
variable accesses, and seems to have the same number of virtual method calls:

  - RecordWriter.emit(T record) (bi-morphic between 
{{ChannelSelectorRecordWriter}} and {{BroadcastRecordWriter}}).
  - The record serialization (to byte buffer) is unchanged.
  - Because the record-to-buffer logic is in the {{ResultPartition}} now, one 
method on ResultPartition is called per record, rather than per buffer. That 
method is not virtual, though; there is only one active implementation.

One thing we can investigate is the 
{{ConsumableNotifyingResultPartitionWriterDecorator}}. The class is always 
loaded, but never instantiated (only a static method is used). That means that 
during class hierarchy analysis (CHA) the JIT will find two implementations of 
the last method mentioned above, but all call sites invoke the same one, so any 
profiling logic should be able to cut this out again.

Still, it is trivial to change the code so that the 
{{ConsumableNotifyingResultPartitionWriterDecorator}} code is never loaded.
Here is the PR for that: https://github.com/apache/flink/pull/13548

> Performance regression on 24.09.2020
> 
>
> Key: FLINK-19441
> URL: https://issues.apache.org/jira/browse/FLINK-19441
> Project: Flink
>  Issue Type: Bug
>Reporter: Arvid Heise
>Assignee: Stephan Ewen
>Priority: Blocker
>  Labels: pull-request-available
>
> A couple of benchmarks are showing a small performance regression on 
> 24.09.2020:
> http://codespeed.dak8s.net:8000/timeline/?ben=globalWindow=2
> http://codespeed.dak8s.net:8000/timeline/?ben=tupleKeyBy=2 (?)





[GitHub] [flink] StephanEwen opened a new pull request #13548: [FLINK-19441][network] Avoid loading of ResultPartition wrapper class for consumable notifications when possible.

2020-10-06 Thread GitBox


StephanEwen opened a new pull request #13548:
URL: https://github.com/apache/flink/pull/13548


   ## What is the purpose of the change
   
   A small code optimization attempting to reduce the virtual method call 
overhead on the per-record methods in `ResultPartition`.
   
   It does that by making the class wrapping the `ResultPartition` (formerly 
`ConsumableNotifyingResultPartitionWriterDecorator`) a different class than the 
one with the factory method. That way, if the factory method never needs a 
wrapper class (under normal configuration, it never needs one) the wrapping 
class is never loaded, and the JIT can find out (through CHA) that only one 
implementation for the per-record methods on `ResultPartition` exists.
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): **no**
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: **no**
 - The serializers: **no**
 - The runtime per-record code paths (performance sensitive): **yes**
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: **no**
 - The S3 file system connector: **no**
   
   ## Documentation
   
 - Does this pull request introduce a new feature? **no**
 - If yes, how is the feature documented? **not applicable**
   







[jira] [Updated] (FLINK-19441) Performance regression on 24.09.2020

2020-10-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-19441:
---
Labels: pull-request-available  (was: )

> Performance regression on 24.09.2020
> 
>
> Key: FLINK-19441
> URL: https://issues.apache.org/jira/browse/FLINK-19441
> Project: Flink
>  Issue Type: Bug
>Reporter: Arvid Heise
>Assignee: Stephan Ewen
>Priority: Blocker
>  Labels: pull-request-available
>
> A couple of benchmarks are showing a small performance regression on 
> 24.09.2020:
> http://codespeed.dak8s.net:8000/timeline/?ben=globalWindow=2
> http://codespeed.dak8s.net:8000/timeline/?ben=tupleKeyBy=2 (?)





[GitHub] [flink] flinkbot edited a comment on pull request #13545: [FLINK-19486] Make "Unexpected State Handle" Exception more helpful

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13545:
URL: https://github.com/apache/flink/pull/13545#issuecomment-704186207


   
   ## CI report:
   
   * b66350a46c67dd801c8cb92ad0a33e2ad2e2d759 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7234)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #13544: [FLINK-19309][coordination] Add TaskExecutorManager

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13544:
URL: https://github.com/apache/flink/pull/13544#issuecomment-704178057


   
   ## CI report:
   
   * a66e85992ca9513c7e54c74f1f098f6773595f10 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7233)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] XComp commented on pull request #13547: [FLINK-14406][runtime] Exposes managed memory usage through the REST API

2020-10-06 Thread GitBox


XComp commented on pull request #13547:
URL: https://github.com/apache/flink/pull/13547#issuecomment-704310910


   Another thing that's up for discussion: The current implementation does 
provide redundant information in the REST API.
   ```
   {
 "id": "192.168.178.34:58646-64d0d8",
 "path": "akka.tcp://flink@192.168.178.34:58646/user/rpc/taskmanager_0",
 "dataPort": 58648,
 "jmxPort": -1,
 "timeSinceLastHeartbeat": 1601994279165,
 "slotsNumber": 1,
 "freeSlots": 0,
 "totalResource": {
   "cpuCores": 0,
   "taskHeapMemory": 0,
   "taskOffHeapMemory": 0,
   "managedMemory": 0,
   "networkMemory": 0,
   "extendedResources": {}
 },
 "freeResource": {
   "cpuCores": 0,
   "taskHeapMemory": 0,
   "taskOffHeapMemory": 0,
   "managedMemory": 0,
   "networkMemory": 0,
   "extendedResources": {}
 },
 "hardware": {
   "cpuCores": 12,
   "physicalMemory": 17179869184,
   "freeMemory": 536805376,
   "managedMemory": 536870920  <-- redundant
 },
 "memoryConfiguration": {
   "frameworkHeap": 134217728,
   "taskHeap": 402653174,
   "frameworkOffHeap": 134217728,
   "taskOffHeap": 0,
   "networkMemory": 134217730,
   "managedMemory": 536870920,  <-- redundant
   "jvmMetaspace": 268435456,
   "jvmOverhead": 1073741824,
   "totalFlinkMemory": null,
   "totalProcessMemory": 1811939328
 },
 "metrics": {
   "heapUsed": 119710848,
   "heapCommitted": 536805376,
   "heapMax": 536805376,
   "nonHeapUsed": 70984004,
   "nonHeapCommitted": 455180084,
   "nonHeapMax": -1,
   "directCount": 4115,
   "directUsed": 134750413,
   "directMax": 134750412,
   "mappedCount": 0,
   "mappedUsed": 0,
   "mappedMax": 0,
   "memorySegmentsAvailable": 4092,
   "memorySegmentsTotal": 4096,
   "nettyShuffleMemorySegmentsAvailable": 4092,
   "nettyShuffleMemorySegmentsUsed": 4,
   "nettyShuffleMemorySegmentsTotal": 4096,
   "nettyShuffleMemoryAvailable": 134086656,
   "nettyShuffleMemoryUsed": 131072,
   "nettyShuffleMemoryTotal": 134217728,
   "managedMemoryAvailable": 536870920,
   "managedMemoryUsed": 0,
   "managedMemoryTotal": 536870920, <-- redundant
   "garbageCollectors": [
 {
   "name": "scavenge",
   "count": 2,
   "time": 21
 },
 {
   "name": "global",
   "count": 0,
   "time": 0
 }
   ]
 }
   }
   ```
   
   The example above shows that `metrics.managedMemoryTotal`, 
`memoryConfiguration.managedMemory`, and `hardware.managedMemory` are showing 
the same value. At least, `hardware.managedMemory` seems to be out of place. 
IMHO, we could remove it as part of this PR. Not sure about the other two, 
though, since it's kind of standardized.







[GitHub] [flink] AHeise commented on a change in pull request #13545: [FLINK-19486] Make "Unexpected State Handle" Exception more helpful

2020-10-06 Thread GitBox


AHeise commented on a change in pull request #13545:
URL: https://github.com/apache/flink/pull/13545#discussion_r500325899



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/state/UnexpectedStateHandleException.java
##
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.state;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Joiner;
+
+/**
+ * Signals that an operation did not get the type of {@link StateObject} that 
was expected. This can
+ * mostly happen when a different {@link StateBackend} from the one that was 
used for taking a
+ * checkpoint/savepoint is used when restoring.
+ */
+public class UnexpectedStateHandleException extends IllegalStateException {

Review comment:
   If we could find a good place for such a method, I'd prefer a method. 
But if not (like adding a new Util class), leave as is. The only place that I 
could think of is to add a static method to `RestoreOperation`.

##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/state/UnexpectedStateHandleExceptionTest.java
##
@@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.state;
+
+import org.junit.Test;
+
+import static org.hamcrest.CoreMatchers.containsString;
+import static org.junit.Assert.assertThat;
+
+/**
+ * Tests for {@link UnexpectedStateHandleException}.
+ */
+public class UnexpectedStateHandleExceptionTest {

Review comment:
   Hm, I'm not sure what the value of the current tests is. If I add a 
linebreak to the message formatting at some point, I'd need to touch both 
tests.
   
   But I also don't insist on the ITCase: Since you didn't change anything on 
the exception logic (just using a subclass), I'm fine to leave as is.









[GitHub] [flink] flinkbot edited a comment on pull request #13529: [FLINK-19473] Implement multi inputs sorting DataInput

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13529:
URL: https://github.com/apache/flink/pull/13529#issuecomment-702193043


   
   ## CI report:
   
   * eb1640d14d984365e27d9c0a58bbb2d9e4de6d03 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7231)
 
   * d91430e1c66451806b07692731574192c09eca89 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7237)
 
   * 055e55a0e4e55637a5896a173939e00e6991abe4 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7240)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] tillrohrmann commented on a change in pull request #13544: [FLINK-19309][coordination] Add TaskExecutorManager

2020-10-06 Thread GitBox


tillrohrmann commented on a change in pull request #13544:
URL: https://github.com/apache/flink/pull/13544#discussion_r500299558



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/TaskExecutorManager.java
##
@@ -0,0 +1,440 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.resourcemanager.slotmanager;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.api.common.time.Time;
+import org.apache.flink.runtime.clusterframework.types.ResourceProfile;
+import org.apache.flink.runtime.clusterframework.types.SlotID;
+import org.apache.flink.runtime.concurrent.ScheduledExecutor;
+import org.apache.flink.runtime.instance.InstanceID;
+import org.apache.flink.runtime.resourcemanager.WorkerResourceSpec;
+import 
org.apache.flink.runtime.resourcemanager.registration.TaskExecutorConnection;
+import org.apache.flink.runtime.slots.ResourceRequirement;
+import org.apache.flink.runtime.taskexecutor.SlotReport;
+import org.apache.flink.runtime.taskexecutor.SlotStatus;
+import org.apache.flink.util.FlinkException;
+import org.apache.flink.util.MathUtils;
+import org.apache.flink.util.Preconditions;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+import java.util.concurrent.Executor;
+import java.util.concurrent.ScheduledFuture;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+import java.util.stream.StreamSupport;
+
+/**
+ * SlotManager component for all task executor related things.
+ *
+ * Dev note: This component only exists to keep the code out of the slot 
manager.
+ * It covers many aspects that aren't really the responsibility of the slot 
manager, and should be refactored to live
+ * outside the slot manager and split into multiple parts.
+ */
+class TaskExecutorManager implements AutoCloseable {
+   private static final Logger LOG = 
LoggerFactory.getLogger(TaskExecutorManager.class);
+
+   private final ResourceProfile defaultSlotResourceProfile;
+
+   /** The default resource spec of workers to request. */
+   private final WorkerResourceSpec defaultWorkerResourceSpec;
+
+   private final int numSlotsPerWorker;
+
+   /** Defines the max limitation of the total number of slots. */
+   private final int maxSlotNum;
+
+   /** Release task executor only when each produced result partition is 
either consumed or failed. */
+   private final boolean waitResultConsumedBeforeRelease;
+
+   /** Defines the number of redundant taskmanagers. */
+   private final int redundantTaskManagerNum;
+
+   /** Timeout after which an unused TaskManager is released. */
+   private final Time taskManagerTimeout;
+
+   /** Callbacks for resource (de-)allocations. */
+   private final ResourceActions resourceActions;
+
+   /** All currently registered task managers. */
+   private final Map 
taskManagerRegistrations = new HashMap<>();
+
+   private final Map 
pendingSlots = new HashMap<>();

Review comment:
   I don't fully understand why the TEM is responsible for the pending 
slots. This looks more like the responsibility of the `SlotManager` to me.

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/TaskExecutorManager.java
##
@@ -0,0 +1,440 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ 

[jira] [Closed] (FLINK-19308) Add SlotTracker

2020-10-06 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-19308.

Resolution: Fixed

master: d196b396c685e0c7de207b283acf34de078fdf43

> Add SlotTracker
> ---
>
> Key: FLINK-19308
> URL: https://issues.apache.org/jira/browse/FLINK-19308
> Project: Flink
>  Issue Type: Task
>  Components: Runtime / Coordination
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>
> Move the slot bookkeeping into a separate data-structure.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zentol merged pull request #13508: [FLINK-19308][coordination] Add SlotTracker

2020-10-06 Thread GitBox


zentol merged pull request #13508:
URL: https://github.com/apache/flink/pull/13508


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol merged pull request #13464: [FLINK-19307][coordination] Add ResourceTracker

2020-10-06 Thread GitBox


zentol merged pull request #13464:
URL: https://github.com/apache/flink/pull/13464


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-19307) Add ResourceTracker

2020-10-06 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-19307:
-
Summary: Add ResourceTracker  (was: Add RequirementsTracker)

> Add ResourceTracker
> ---
>
> Key: FLINK-19307
> URL: https://issues.apache.org/jira/browse/FLINK-19307
> Project: Flink
>  Issue Type: Task
>  Components: Runtime / Coordination
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>
> Add a component that tracks the requirements for a job and missing/acquired 
> resources.
> Data is kept up to date via notifications about new/update resource 
> requirements and acquired/lost resources.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-19307) Add RequirementsTracker

2020-10-06 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-19307.

Resolution: Fixed

master: 7f07b62723a1b49ab5403b3b2797cb87686fbbd6

> Add RequirementsTracker
> ---
>
> Key: FLINK-19307
> URL: https://issues.apache.org/jira/browse/FLINK-19307
> Project: Flink
>  Issue Type: Task
>  Components: Runtime / Coordination
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>
> Add a component that tracks the requirements for a job and missing/acquired 
> resources.
> Data is kept up to date via notifications about new/update resource 
> requirements and acquired/lost resources.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-18830) JoinCoGroupFunction and FlatJoinCoGroupFunction work incorrectly for outer join when one side of coGroup is empty

2020-10-06 Thread liupengcheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208780#comment-17208780
 ] 

liupengcheng commented on FLINK-18830:
--

[~aljoscha] [~jark] Thanks for the clarification; yes, I think we can make it 
the internal implementation then.

> JoinCoGroupFunction and FlatJoinCoGroupFunction work incorrectly for outer 
> join when one side of coGroup is empty
> -
>
> Key: FLINK-18830
> URL: https://issues.apache.org/jira/browse/FLINK-18830
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 1.11.1
>Reporter: liupengcheng
>Priority: Major
>
> Currently, the {{JoinCoGroupFunction}} and {{FlatJoinCoGroupFunction}} in 
> JoinedStreams do not respect the join type; they are implemented as a join 
> within a two-level nested loop. However, this is incorrect for an outer join 
> when one side of the coGroup is empty.
> {code}
>   public void coGroup(Iterable<T1> first, Iterable<T2> second, 
> Collector<T> out) throws Exception {
>   for (T1 val1: first) {
>   for (T2 val2: second) {
>   wrappedFunction.join(val1, val2, out);
>   }
>   }
>   }
> {code}
> The above code is the current implementation. Suppose the first input is 
> non-empty and the second input is an empty iterator; then the join 
> function (`wrappedFunction`) is never called, so no data is emitted for a 
> left outer join.
> So I propose to take the join type into account and handle this case: e.g., 
> for a left outer join, we can emit a record with the right side set to null 
> if the right side is empty or no match can be found on the right side.
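To make the proposed fix concrete, here is a hedged, self-contained sketch of a left-outer-aware coGroup body that pads the missing right side with null. The class and method names are illustrative only, not Flink's actual `JoinCoGroupFunction` implementation:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the fix proposed above: a coGroup body that is
// aware of a LEFT outer join. Names and shapes are illustrative only.
public class LeftOuterCoGroupSketch {

    // Pads the missing right side with "null" when the right input is empty,
    // so left rows are not silently dropped as in the inner-join nested loop.
    static List<String> leftOuterCoGroup(Iterable<String> first, Iterable<String> second) {
        List<String> out = new ArrayList<>();
        for (String left : first) {
            boolean matched = false;
            for (String right : second) {
                out.add(left + "," + right);
                matched = true;
            }
            if (!matched) {
                // the LEFT outer join case the plain two-level loop misses
                out.add(left + ",null");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Right side empty: a plain nested loop would emit nothing.
        System.out.println(leftOuterCoGroup(Arrays.asList("a", "b"), Collections.emptyList()));
        // prints [a,null, b,null]
    }
}
```

A full fix would additionally branch on the configured join type (inner/left/right/full) rather than hard-coding the left-outer behavior.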



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #13529: [FLINK-19473] Implement multi inputs sorting DataInput

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13529:
URL: https://github.com/apache/flink/pull/13529#issuecomment-702193043


   
   ## CI report:
   
   * eb1640d14d984365e27d9c0a58bbb2d9e4de6d03 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7231)
 
   * d91430e1c66451806b07692731574192c09eca89 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7237)
 
   * 055e55a0e4e55637a5896a173939e00e6991abe4 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13547: [FLINK-14406][runtime] Exposes managed memory usage through the REST API

2020-10-06 Thread GitBox


flinkbot commented on pull request #13547:
URL: https://github.com/apache/flink/pull/13547#issuecomment-704265372


   
   ## CI report:
   
   * 20c70c0fec1ceb77e60e138bb334bb48f5bcadcf UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13547: [FLINK-14406][runtime] Exposes managed memory usage through the REST API

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13547:
URL: https://github.com/apache/flink/pull/13547#issuecomment-704265372


   
   ## CI report:
   
   * 20c70c0fec1ceb77e60e138bb334bb48f5bcadcf Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7238)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13508: [FLINK-19308][coordination] Add SlotTracker

2020-10-06 Thread GitBox


flinkbot edited a comment on pull request #13508:
URL: https://github.com/apache/flink/pull/13508#issuecomment-700676990


   
   ## CI report:
   
   * 7bad74a7405f33d6ed8fd9ba7e9741be89406658 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7229)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-19027) UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnParallelRemoteChannel failed because of test timeout

2020-10-06 Thread Arvid Heise (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arvid Heise closed FLINK-19027.
---
Fix Version/s: (was: 1.11.3)
   Resolution: Fixed

> UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnParallelRemoteChannel
>  failed because of test timeout
> 
>
> Key: FLINK-19027
> URL: https://issues.apache.org/jira/browse/FLINK-19027
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.12.0, 1.11.2
>Reporter: Dian Fu
>Assignee: Arvid Heise
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.12.0
>
>
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5789=logs=119bbba7-f5e3-5e08-e72d-09f1529665de=ec103906-d047-5b8a-680e-05fc000dfca9]
> {code}
> 2020-08-22T21:13:05.5315459Z [ERROR] 
> shouldPerformUnalignedCheckpointOnParallelRemoteChannel(org.apache.flink.test.checkpointing.UnalignedCheckpointITCase)
>   Time elapsed: 300.075 s  <<< ERROR!
> 2020-08-22T21:13:05.5316451Z org.junit.runners.model.TestTimedOutException: 
> test timed out after 300 seconds
> 2020-08-22T21:13:05.5317432Z  at sun.misc.Unsafe.park(Native Method)
> 2020-08-22T21:13:05.5317799Z  at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 2020-08-22T21:13:05.5318247Z  at 
> java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
> 2020-08-22T21:13:05.5318885Z  at 
> java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
> 2020-08-22T21:13:05.5327035Z  at 
> java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
> 2020-08-22T21:13:05.5328114Z  at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> 2020-08-22T21:13:05.5328869Z  at 
> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1719)
> 2020-08-22T21:13:05.5329482Z  at 
> org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:74)
> 2020-08-22T21:13:05.5330138Z  at 
> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1699)
> 2020-08-22T21:13:05.5330771Z  at 
> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1681)
> 2020-08-22T21:13:05.5331351Z  at 
> org.apache.flink.test.checkpointing.UnalignedCheckpointITCase.execute(UnalignedCheckpointITCase.java:158)
> 2020-08-22T21:13:05.5332015Z  at 
> org.apache.flink.test.checkpointing.UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnParallelRemoteChannel(UnalignedCheckpointITCase.java:140)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-19027) UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnParallelRemoteChannel failed because of test timeout

2020-10-06 Thread Arvid Heise (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arvid Heise updated FLINK-19027:

Affects Version/s: (was: 1.11.2)

> UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnParallelRemoteChannel
>  failed because of test timeout
> 
>
> Key: FLINK-19027
> URL: https://issues.apache.org/jira/browse/FLINK-19027
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.12.0
>Reporter: Dian Fu
>Assignee: Arvid Heise
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.12.0
>
>
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5789=logs=119bbba7-f5e3-5e08-e72d-09f1529665de=ec103906-d047-5b8a-680e-05fc000dfca9]
> {code}
> 2020-08-22T21:13:05.5315459Z [ERROR] 
> shouldPerformUnalignedCheckpointOnParallelRemoteChannel(org.apache.flink.test.checkpointing.UnalignedCheckpointITCase)
>   Time elapsed: 300.075 s  <<< ERROR!
> 2020-08-22T21:13:05.5316451Z org.junit.runners.model.TestTimedOutException: 
> test timed out after 300 seconds
> 2020-08-22T21:13:05.5317432Z  at sun.misc.Unsafe.park(Native Method)
> 2020-08-22T21:13:05.5317799Z  at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 2020-08-22T21:13:05.5318247Z  at 
> java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
> 2020-08-22T21:13:05.5318885Z  at 
> java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
> 2020-08-22T21:13:05.5327035Z  at 
> java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
> 2020-08-22T21:13:05.5328114Z  at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> 2020-08-22T21:13:05.5328869Z  at 
> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1719)
> 2020-08-22T21:13:05.5329482Z  at 
> org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:74)
> 2020-08-22T21:13:05.5330138Z  at 
> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1699)
> 2020-08-22T21:13:05.5330771Z  at 
> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1681)
> 2020-08-22T21:13:05.5331351Z  at 
> org.apache.flink.test.checkpointing.UnalignedCheckpointITCase.execute(UnalignedCheckpointITCase.java:158)
> 2020-08-22T21:13:05.5332015Z  at 
> org.apache.flink.test.checkpointing.UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnParallelRemoteChannel(UnalignedCheckpointITCase.java:140)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-19027) UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnParallelRemoteChannel failed because of test timeout

2020-10-06 Thread Arvid Heise (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208735#comment-17208735
 ] 

Arvid Heise commented on FLINK-19027:
-

I don't see any failures on release-1.11, probably because it's using the tight 
memory management that master has for the test. Closing.

> UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnParallelRemoteChannel
>  failed because of test timeout
> 
>
> Key: FLINK-19027
> URL: https://issues.apache.org/jira/browse/FLINK-19027
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.12.0, 1.11.2
>Reporter: Dian Fu
>Assignee: Arvid Heise
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.12.0, 1.11.3
>
>
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5789=logs=119bbba7-f5e3-5e08-e72d-09f1529665de=ec103906-d047-5b8a-680e-05fc000dfca9]
> {code}
> 2020-08-22T21:13:05.5315459Z [ERROR] 
> shouldPerformUnalignedCheckpointOnParallelRemoteChannel(org.apache.flink.test.checkpointing.UnalignedCheckpointITCase)
>   Time elapsed: 300.075 s  <<< ERROR!
> 2020-08-22T21:13:05.5316451Z org.junit.runners.model.TestTimedOutException: 
> test timed out after 300 seconds
> 2020-08-22T21:13:05.5317432Z  at sun.misc.Unsafe.park(Native Method)
> 2020-08-22T21:13:05.5317799Z  at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 2020-08-22T21:13:05.5318247Z  at 
> java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
> 2020-08-22T21:13:05.5318885Z  at 
> java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
> 2020-08-22T21:13:05.5327035Z  at 
> java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
> 2020-08-22T21:13:05.5328114Z  at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> 2020-08-22T21:13:05.5328869Z  at 
> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1719)
> 2020-08-22T21:13:05.5329482Z  at 
> org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:74)
> 2020-08-22T21:13:05.5330138Z  at 
> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1699)
> 2020-08-22T21:13:05.5330771Z  at 
> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1681)
> 2020-08-22T21:13:05.5331351Z  at 
> org.apache.flink.test.checkpointing.UnalignedCheckpointITCase.execute(UnalignedCheckpointITCase.java:158)
> 2020-08-22T21:13:05.5332015Z  at 
> org.apache.flink.test.checkpointing.UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnParallelRemoteChannel(UnalignedCheckpointITCase.java:140)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] kezhuw commented on a change in pull request #13124: [FLINK-18815][test-stability] Close safety net guarded closeable iff it is still registered

2020-10-06 Thread GitBox


kezhuw commented on a change in pull request #13124:
URL: https://github.com/apache/flink/pull/13124#discussion_r500268405



##
File path: 
flink-core/src/main/java/org/apache/flink/util/AbstractCloseableRegistry.java
##
@@ -163,9 +163,9 @@ protected final void addCloseableInternal(Closeable 
closeable, T metaData) {
/**
 * Removes a mapping from the registry map, respecting locking.
 */
-   protected final void removeCloseableInternal(Closeable closeable) {
+   protected final boolean removeCloseableInternal(Closeable closeable, T 
object) {
synchronized (getSynchronizationLock()) {
-   closeableToRef.remove(closeable);
+   return closeableToRef.remove(closeable, object);

Review comment:
   @kezhuw For self-review: the paranoid concern does not hold, since Java 
offers only one way to run code when an unreachable object is recycled, 
namely overriding `Object.finalize`.
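For reference, the patched call relies on the two-argument `Map.remove(key, value)` (conditional remove, Java 8+), which removes the entry only if the key is currently mapped to exactly that value and returns whether a removal happened. A minimal sketch with illustrative names:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrates the two-argument Map.remove(key, value) that the patched
// removeCloseableInternal relies on: the entry is removed only if the key
// is currently mapped to exactly that value, and the call reports whether
// a removal actually happened.
public class ConditionalRemoveDemo {
    public static void main(String[] args) {
        Map<String, Integer> closeableToRef = new HashMap<>();
        closeableToRef.put("closeable-1", 1);

        // Wrong value: nothing is removed, returns false.
        boolean removedStale = closeableToRef.remove("closeable-1", 2);
        // Matching value: entry is removed, returns true.
        boolean removedLive = closeableToRef.remove("closeable-1", 1);

        System.out.println(removedStale + " " + removedLive); // prints false true
    }
}
```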





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-19489) SplitFetcherTest.testNotifiesWhenGoingIdleConcurrent gets stuck

2020-10-06 Thread Kezhu Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208723#comment-17208723
 ] 

Kezhu Wang commented on FLINK-19489:


Here is my hypothesized execution flow for this issue; please point out 
anything that is wrong.

 1. In testing thread, inside {{fetcher.runOnce}}, after putting one element to 
empty {{FutureCompletingBlockingQueue}}, the queue is available for pop.
 2. In drainer thread, {{QueueDrainerThread.go}} pops that element from the 
queue, and resets {{FutureCompletingBlockingQueue.currentFuture}} to new empty 
future.
 3. In drainer thread, {{QueueDrainerThread.go}} runs into next iteration, 
since {{FutureCompletingBlockingQueue.currentFuture}} is empty, it enters 
{{CompletableFuture.waitingGet}}.
 4. In testing thread, inside {{fetcher.runOnce}}, split fetcher calls 
{{FutureCompletingBlockingQueue.notifyAvailable}} due to idleness checking. 
{{FutureCompletingBlockingQueue.notifyAvailable}} will complete the future 
that step-3 is waiting on. {{QueueDrainerThread}} is unparked, but does not 
get a chance to run.
 5. In testing thread, {{assertTrue(queue.getAvailabilityFuture().isDone())}} 
passes.
 6. In testing thread, {{QueueDrainerThread.shutdown}} interrupts 
{{QueueDrainerThread}}.
 7. In drainer thread, {{QueueDrainerThread.waitingGet.q.block}} gets its 
chance to run, and finds that it is interrupted. But that interrupt status 
will be lost due to the {{ASSIGNMENT}} statement in my previous comment.
 8. In drainer thread, {{QueueDrainerThread.go}}, 
{{FutureCompletingBlockingQueue.poll}} finds no element and resets 
{{FutureCompletingBlockingQueue.currentFuture}} to new empty future. 
{{FutureCompletingBlockingQueue.take}} falls into {{CompletableFuture.get}}. 
Since the interrupt status was lost in step-7, {{CompletableFuture.get}} will 
block forever.
 9. In testing thread, {{QueueDrainerThread.shutdown}} blocks forever 
joining on the never-terminating drainer thread.
 10. {{SplitFetcherTest.testNotifiesWhenGoingIdleConcurrent}} gets stuck.

If my analysis is correct, 
{{SplitFetcherTest.testNotifiesOlderFutureWhenGoingIdleConcurrent}} should 
face this issue too; these two cases are almost the same except for the 
assertion statement. But I didn't find more failed cases in the [pipeline 
report|https://dev.azure.com/apache-flink/apache-flink/_pipeline/analytics/stageawareoutcome?definitionId=1].

Besides this, I have created two online repls 
([openjdk8|https://repl.it/@kezhuw/openjdk8-completablefuture-interruptedexception#Main.java]
 and 
[openjdk11|https://repl.it/@kezhuw/openjdk11-completablefuture-interruptedexception#Main.java])
 for easy evaluation. The two repls have identical code. The openjdk8 version 
will probably print {{Future get thread lost interrupt status after 
future.get}} if there is no {{Thread.sleep}} between 
{{futureGetThread.interrupt()}} and {{future.complete(null)}}, while the 
openjdk11 version will probably print the same message and exit with error 
code 1. I also encountered cases where openjdk11 printed {{Future get thread 
lost interrupt status after future.get}}; I think this is caused by a 
spurious wakeup from {{LockSupport.park}} before {{futureGetThread.interrupt}}.

Again, please point out anything that is wrong. I am glad to hear feedback.
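The hazard described in step 7 is a general one: code that swallows `InterruptedException` must restore the interrupt flag, or a later blocking call waits forever. A minimal, self-contained sketch of the defensive pattern (hypothetical names, not the SplitFetcher code):

```java
// Sketch: a worker that swallows InterruptedException but restores the
// interrupt flag, so a later blocking call (or a shutdown check) still
// observes the pending interrupt instead of blocking forever.
public class InterruptStatusDemo {

    static boolean interruptFlagAfterRestore() throws InterruptedException {
        final boolean[] observed = new boolean[1];
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(10_000); // stands in for a blocking queue poll
            } catch (InterruptedException e) {
                // sleep() clears the flag when it throws; restore it so the
                // interrupt is not lost, unlike the scenario in step 7.
                Thread.currentThread().interrupt();
            }
            observed[0] = Thread.currentThread().isInterrupted();
        });
        worker.start();
        worker.interrupt();
        worker.join(); // join() also gives visibility of the worker's write
        return observed[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("interrupt flag preserved: " + interruptFlagAfterRestore());
        // prints: interrupt flag preserved: true
    }
}
```

The JDK8 {{CompletableFuture.waitingGet}} issue discussed above is precisely a place where this restoration does not happen when the future completes concurrently with the interrupt.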

> SplitFetcherTest.testNotifiesWhenGoingIdleConcurrent gets stuck
> ---
>
> Key: FLINK-19489
> URL: https://issues.apache.org/jira/browse/FLINK-19489
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Common
>Affects Versions: 1.12.0
>Reporter: Dian Fu
>Priority: Major
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=7158=logs=298e20ef-7951-5965-0e79-ea664ddc435e=b4cd3436-dbe8-556d-3bca-42f92c3cbf2f
> {code}
> 020-10-01T21:55:34.9982203Z "main" #1 prio=5 os_prio=0 cpu=1048.80ms 
> elapsed=921.99s tid=0x7f8c00015800 nid=0xf6e in Object.wait()  
> [0x7f8c06648000]
> 2020-10-01T21:55:34.9982807Zjava.lang.Thread.State: WAITING (on object 
> monitor)
> 2020-10-01T21:55:34.9983177Z  at 
> java.lang.Object.wait(java.base@11.0.7/Native Method)
> 2020-10-01T21:55:34.9983871Z  - waiting on <0x8e0be190> (a 
> org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherTest$QueueDrainerThread)
> 2020-10-01T21:55:34.9984581Z  at 
> java.lang.Thread.join(java.base@11.0.7/Thread.java:1305)
> 2020-10-01T21:55:34.9985433Z  - waiting to re-lock in wait() 
> <0x8e0be190> (a 
> org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherTest$QueueDrainerThread)
> 2020-10-01T21:55:34.9985998Z  at 
> org.apache.flink.core.testutils.CheckedThread.trySync(CheckedThread.java:112)
> 2020-10-01T21:55:34.9986511Z  at 
> 

[GitHub] [flink] tillrohrmann commented on a change in pull request #13508: [FLINK-19308][coordination] Add SlotTracker

2020-10-06 Thread GitBox


tillrohrmann commented on a change in pull request #13508:
URL: https://github.com/apache/flink/pull/13508#discussion_r500285786



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/dispatcher/DispatcherResourceCleanupTest.java
##
@@ -343,6 +344,7 @@ public void testRunningJobsRegistryCleanup() throws 
Exception {
 * before a new job with the same {@link JobID} is started.
 */
@Test
+   @Ignore

Review comment:
   Thanks for the pointer.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-19472) Implement one input sorting DataInput

2020-10-06 Thread Dawid Wysakowicz (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz closed FLINK-19472.

Resolution: Fixed

Implemented via:
* bd1e1f306d9e8e94421a4a0587182a2e0283adf0 ... 
fc0e28d14f4e51ff87997592d9bb220d6e7197eb

> Implement one input sorting DataInput
> -
>
> Key: FLINK-19472
> URL: https://issues.apache.org/jira/browse/FLINK-19472
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Task
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] XComp commented on pull request #13547: [FLINK-14406][runtime] Exposes managed memory usage through the REST API

2020-10-06 Thread GitBox


XComp commented on pull request #13547:
URL: https://github.com/apache/flink/pull/13547#issuecomment-704258348


   @azagrebin Feel free to review the changes.
   
   I am not 100% happy with the naming of the metrics (specifically, 
`Status.ManagedMemory.Used` vs `Status.ManagedMemory.UsedMemory` vs 
`Status.Flink.ManagedMemory.Used` vs ...?), but I wanted to put something up 
to start a discussion on it.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tillrohrmann commented on a change in pull request #13311: [FLINK-18721] Migrate YarnResourceManager to the new YarnResourceManagerDriver

2020-10-06 Thread GitBox


tillrohrmann commented on a change in pull request #13311:
URL: https://github.com/apache/flink/pull/13311#discussion_r500102360



##
File path: 
flink-yarn/src/main/java/org/apache/flink/yarn/configuration/YarnResourceManagerConfiguration.java
##
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.yarn.configuration;
+
+import org.apache.flink.util.Preconditions;
+import org.apache.flink.yarn.YarnConfigKeys;
+
+import org.apache.hadoop.yarn.api.ApplicationConstants;
+
+import javax.annotation.Nullable;
+
+import java.util.Map;
+
+/**
+ * Configuration specific to {@link org.apache.flink.yarn.YarnResourceManager}.
+ */
+public class YarnResourceManagerConfiguration {
+   @Nullable private final String webInterfaceUrl;
+   private final String rpcAddress;
+   private final String yarnFiles;
+   private final String flinkClasspath;
+   private final String clientShipFiles;
+   private final String flinkDistJar;
+   private final String currentDir;
+   @Nullable private final String remoteKeytabPath;
+   @Nullable private final String localKeytabPath;
+   @Nullable private final String keytabPrinciple;
+   @Nullable private final String krb5Path;
+   @Nullable private final String yarnSiteXMLPath;
+
+   public YarnResourceManagerConfiguration(
+   Map env,
+   @Nullable String webInterfaceUrl,
+   String rpcAddress) {

Review comment:
   nit: I would change the order of these two parameters.

##
File path: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnNodeManagerClientFactory.java
##
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.yarn;
+
+import org.apache.hadoop.yarn.client.api.async.NMClientAsync;
+
+/**
+ * Factory interface for {@link NMClientAsync}.
+ */
+public interface YarnNodeManagerClientFactory {
+   NMClientAsync createNodeManagerClient(NMClientAsync.CallbackHandler 
callbackHandler);

Review comment:
   nit: JavaDocs are missing

##
File path: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManagerDriver.java
##
@@ -0,0 +1,609 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.yarn;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.configuration.GlobalConfiguration;
+import org.apache.flink.configuration.TaskManagerOptions;
+import org.apache.flink.configuration.TaskManagerOptionsInternal;
+import 

[GitHub] [flink] aljoscha commented on a change in pull request #13545: [FLINK-19486] Make "Unexpected State Handle" Exception more helpful

2020-10-06 Thread GitBox


aljoscha commented on a change in pull request #13545:
URL: https://github.com/apache/flink/pull/13545#discussion_r500277349



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/state/UnexpectedStateHandleException.java
##
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.state;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Joiner;
+
+/**
+ * Signals that an operation did not get the type of {@link StateObject} that 
was expected. This can
+ * mostly happen when a different {@link StateBackend} from the one that was 
used for taking a
+ * checkpoint/savepoint is used when restoring.
+ */
+public class UnexpectedStateHandleException extends IllegalStateException {

Review comment:
   I'd be happy to change it to a small method if you think it's more maintainable. I'm also not sure I like the additional exception.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] dawidwys closed pull request #13521: [FLINK-19472] Implement a one input sorting DataInput

2020-10-06 Thread GitBox


dawidwys closed pull request #13521:
URL: https://github.com/apache/flink/pull/13521


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13547: [FLINK-14406][runtime] Exposes managed memory usage through the REST API

2020-10-06 Thread GitBox


flinkbot commented on pull request #13547:
URL: https://github.com/apache/flink/pull/13547#issuecomment-704258387


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 392ed7a9626c675ec9f0a69ae85030004d98aef9 (Tue Oct 06 
13:11:01 UTC 2020)
   
✅no warnings
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.
   ## Bot commands
   The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] aljoscha commented on a change in pull request #13545: [FLINK-19486] Make "Unexpected State Handle" Exception more helpful

2020-10-06 Thread GitBox


aljoscha commented on a change in pull request #13545:
URL: https://github.com/apache/flink/pull/13545#discussion_r500276832



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/state/UnexpectedStateHandleExceptionTest.java
##
@@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.state;
+
+import org.junit.Test;
+
+import static org.hamcrest.CoreMatchers.containsString;
+import static org.junit.Assert.assertThat;
+
+/**
+ * Tests for {@link UnexpectedStateHandleException}.
+ */
+public class UnexpectedStateHandleExceptionTest {

Review comment:
   I added these tests to check whether the somewhat longer string actually 
renders correctly. And I did find a formatting problem. 
   
   I don't want to add a more complicated ITCase now but someone could 
definitely do it. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] aljoscha commented on a change in pull request #13545: [FLINK-19486] Make "Unexpected State Handle" Exception more helpful

2020-10-06 Thread GitBox


aljoscha commented on a change in pull request #13545:
URL: https://github.com/apache/flink/pull/13545#discussion_r500276055



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/state/UnexpectedStateHandleException.java
##
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.state;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Joiner;
+
+/**
+ * Signals that an operation did not get the type of {@link StateObject} that 
was expected. This can
+ * mostly happen when a different {@link StateBackend} from the one that was 
used for taking a
+ * checkpoint/savepoint is used when restoring.
+ */
+public class UnexpectedStateHandleException extends IllegalStateException {
+
+	@SuppressWarnings("unchecked")
+	public UnexpectedStateHandleException(
+			Class<? extends StateObject> expectedStateHandleClass,
+			Class<? extends StateObject> actualStateHandleClass) {
+		this(new Class[] {expectedStateHandleClass}, actualStateHandleClass);
+	}
+
+	public UnexpectedStateHandleException(
+			Class<? extends StateObject>[] expectedStateHandleClasses,
+			Class<? extends StateObject> actualStateHandleClass) {

Review comment:
   I thought about that as well but decided to go with the order that JUnit 
`assertEquals()` uses. What do you think?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] XComp opened a new pull request #13547: [FLINK-14406][runtime] Exposes managed memory usage through the REST API

2020-10-06 Thread GitBox


XComp opened a new pull request #13547:
URL: https://github.com/apache/flink/pull/13547


   ## What is the purpose of the change
   
   This change is meant to expose the managed memory usage through the REST API 
to make it available in the web UI.
   
   
   ## Brief change log
   
   - Added new metrics for used and total managed memory
   - The used managed memory is determined by accessing the MemoryManagers of 
all active slots of the TaskSlotTable
   - Added new metrics to TaskManagerDetailsHandler
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   - the TaskManagerDetailsHandlerTest was extended accordingly
   - A test was added for retrieving all active slots in TaskSlotTableImplTest
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? yes
 - If yes, how is the feature documented? docs
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tillrohrmann commented on a change in pull request #13540: [FLINK-19344] Fix DispatcherResourceCleanupTest race condition

2020-10-06 Thread GitBox


tillrohrmann commented on a change in pull request #13540:
URL: https://github.com/apache/flink/pull/13540#discussion_r500275816



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/dispatcher/DispatcherResourceCleanupTest.java
##
@@ -350,6 +350,10 @@ public void testJobSubmissionUnderSameJobId() throws Exception {
 		final TestingJobManagerRunner testingJobManagerRunner = jobManagerRunnerFactory.takeCreatedJobManagerRunner();
 		testingJobManagerRunner.completeResultFutureExceptionally(new JobNotFinishedException(jobId));
 
+		// wait until termination JobManagerRunner closeAsync has been called.
+		// this is necessary to avoid race conditions with completion of the 1st job and the submission of the 2nd job (DuplicateJobSubmissionException).
+		testingJobManagerRunner.getCloseAsyncCalledFuture().get();
+

Review comment:
   Thanks for the clarification @rmetzger. I believe that your analysis is 
correct.
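The race fix in the diff above follows a common testing pattern: the stub completes a `CompletableFuture` when `closeAsync` is invoked, and the test blocks on that future before moving on. A minimal standalone sketch of the pattern (class and method names are illustrative, not Flink's `TestingJobManagerRunner`):

```java
import java.util.concurrent.CompletableFuture;

// Minimal sketch of the "wait until closeAsync was called" pattern from the
// diff above; names are illustrative, not Flink's actual test utilities.
public class CloseSignalDemo {
    private final CompletableFuture<Void> closeAsyncCalled = new CompletableFuture<>();

    // Production-side method: signal observers, then perform the real close work.
    public CompletableFuture<Void> closeAsync() {
        closeAsyncCalled.complete(null); // record that close was requested
        return CompletableFuture.completedFuture(null);
    }

    // Test-side hook: lets a test block until closeAsync has been invoked.
    public CompletableFuture<Void> getCloseAsyncCalledFuture() {
        return closeAsyncCalled;
    }

    public static void main(String[] args) throws Exception {
        CloseSignalDemo runner = new CloseSignalDemo();
        runner.closeAsync();
        // Without this get(), a test could race ahead of the close and observe
        // stale state (in Flink's case, a DuplicateJobSubmissionException).
        runner.getCloseAsyncCalledFuture().get();
        System.out.println("close observed");
    }
}
```

The future acts as a one-shot latch: completing it is idempotent and cheap, and `get()` returns immediately once the signal has fired, so the test adds no fixed sleep.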





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-19513) Update docs/build_docs.sh to support the new parameters for regenerating the docs

2020-10-06 Thread Matthias (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias updated FLINK-19513:
-
Description: 
I came across Jekyll's regeneration feature not working anymore. After doing 
some research I found [this 
thread|https://github.com/jekyll/jekyll/issues/2926] stating that 
{{--force_polling}} should work. A test confirmed that the regeneration takes 
place with this parameter enabled.

The {{--watch}} parameter seems to be obsolete.

  was:
I came across Jekyll's regeneration feature not working anymore. After doing 
some research I found [this 
thread|https://github.com/jekyll/jekyll/issues/2926] stating that 
`--force_polling` should work. A test confirmed that the regeneration takes 
place with this parameter enabled.

The `--watch` parameter seem to be obsolete.


> Update docs/build_docs.sh to support the new parameters for regenerating the 
> docs
> -
>
> Key: FLINK-19513
> URL: https://issues.apache.org/jira/browse/FLINK-19513
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Matthias
>Priority: Minor
>  Labels: starter
>
> I came across Jekyll's regeneration feature not working anymore. After doing 
> some research I found [this 
> thread|https://github.com/jekyll/jekyll/issues/2926] stating that 
> {{--force_polling}} should work. A test confirmed that the regeneration takes 
> place with this parameter enabled.
> The {{--watch}} parameter seems to be obsolete.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

