[jira] [Comment Edited] (FLINK-16286) Support select from nothing
[ https://issues.apache.org/jira/browse/FLINK-16286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045223#comment-17045223 ] Dawid Wysakowicz edited comment on FLINK-16286 at 2/26/20 7:57 AM: --- Yes, I have the same question. This should be supported. I checked with the SQL client on master (not the newest, though) and a query like {{select 1, 'ABC';}} just works. [~zjffdu] Did you try it and have problems with it? was (Author: dawidwys): Yes, I have the same question. This should be supported. I checked with the SQL client on master (not the newest, though) and a query like {{select 1, 'ABC';}} just works. Did you try it and have problems with it? > Support select from nothing > --- > > Key: FLINK-16286 > URL: https://issues.apache.org/jira/browse/FLINK-16286 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.10.0 >Reporter: Jeff Zhang >Priority: Major > > It would be nice to support selecting from nothing in Flink SQL, e.g. > {code:java} > select "name", 1+1, Date() {code} > This is especially useful when a user wants to try a UDF provided by others > without creating a new table of fake data. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-16272) Move Stateful Functions end-to-end test framework classes to common module
[ https://issues.apache.org/jira/browse/FLINK-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai updated FLINK-16272: Fix Version/s: statefun-1.1 > Move Stateful Functions end-to-end test framework classes to common module > -- > > Key: FLINK-16272 > URL: https://issues.apache.org/jira/browse/FLINK-16272 > Project: Flink > Issue Type: Task > Components: Stateful Functions, Test Infrastructure >Affects Versions: statefun-1.1 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai >Priority: Major > Labels: pull-request-available > Fix For: statefun-1.1 > > Time Spent: 20m > Remaining Estimate: 0h > > There are some end-to-end test utilities that other end-to-end Stateful > Function tests can benefit from, such as: > * {{StatefulFunctionsAppContainers}} > * {{KafkaIOVerifier}} > * {{KafkaProtobufSerDe}} > We should move this to a new common module under > {{statefun-end-to-end-tests}}.
[jira] [Closed] (FLINK-16272) Move Stateful Functions end-to-end test framework classes to common module
[ https://issues.apache.org/jira/browse/FLINK-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai closed FLINK-16272. --- Resolution: Fixed Merged to master via 78117b8d96c0ad9b27bf971d3a472c750d84a1d6
[jira] [Commented] (FLINK-16286) Support select from nothing
[ https://issues.apache.org/jira/browse/FLINK-16286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045223#comment-17045223 ] Dawid Wysakowicz commented on FLINK-16286: -- Yes, I have the same question. This should be supported. I checked with the SQL client on master (not the newest, though) and a query like {{select 1, 'ABC';}} just works. Did you try it and have problems with it?
[jira] [Closed] (FLINK-16244) Add Asynchronous operations to state lazily.
[ https://issues.apache.org/jira/browse/FLINK-16244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai closed FLINK-16244. --- Fix Version/s: statefun-1.1 Resolution: Fixed Merged to master via f367b40e9096e89a7afe04c93b067bb7b3f25a1e > Add Asynchronous operations to state lazily. > > > Key: FLINK-16244 > URL: https://issues.apache.org/jira/browse/FLINK-16244 > Project: Flink > Issue Type: Task > Components: Stateful Functions >Reporter: Igal Shilman >Assignee: Igal Shilman >Priority: Major > Labels: pull-request-available > Fix For: statefun-1.1 > > Time Spent: 20m > Remaining Estimate: 0h > > Currently, AsyncSink eagerly adds a registered async operation to state. > An alternative approach would be to keep the async operations in an in-memory > map and only > write them to the underlying map on snapshotState(). > The rationale behind this approach is the assumption that most async > operations complete between two consecutive checkpoints, and therefore adding > and removing them from the underlying state backend (RocksDB by default) is > wasteful. > An implementation outline suggestion: > 1. Add a LazyAsyncOperations class that keeps both an in-memory map > and a MapStateHandle; > this map would support add(), remove() and also flush(). > 2. Use that class in AsyncSink and in AsyncMessageDecorator. > 3. Call flush() from FunctionGroupOperator#snapshotState(). > Note that special care should be taken in flush(), as the current key needs > to be set > in the keyedStateBackend.
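The outline above can be sketched as follows. This is an illustrative Python re-implementation, not Stateful Functions code: the backing state is modeled as a plain dict standing in for the MapState handle, and the class/method names follow the ticket's suggestion.

```python
class LazyAsyncOperations:
    """Buffers async-operation registrations in memory and flushes
    them to the underlying (slow) state map only on checkpoints."""

    def __init__(self, backing_state):
        self._backing = backing_state   # stand-in for the RocksDB-backed map
        self._pending = {}              # adds since the last checkpoint
        self._removed = set()           # removes since the last checkpoint

    def add(self, op_id, op):
        self._removed.discard(op_id)
        self._pending[op_id] = op

    def remove(self, op_id):
        if self._pending.pop(op_id, None) is None:
            # not buffered, so it must already live in the backing state
            self._removed.add(op_id)

    def flush(self):
        """Called from snapshotState(): applies the net effect of all
        adds/removes since the last checkpoint to the backing state."""
        for op_id in self._removed:
            self._backing.pop(op_id, None)
        self._backing.update(self._pending)
        self._pending.clear()
        self._removed.clear()
```

Note how an operation that is added and completes between two checkpoints never touches the backing map at all, which is exactly the saving the ticket is after.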
[jira] [Commented] (FLINK-8093) flink job fail because of kafka producer create fail of "javax.management.InstanceAlreadyExistsException"
[ https://issues.apache.org/jira/browse/FLINK-8093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045220#comment-17045220 ] Nico Kruber commented on FLINK-8093: Sorry [~bdine] for not seeing your message. I have assigned this ticket to you in case you are still up for it; please give me a short note whether you still want to work on this or not. > flink job fail because of kafka producer create fail of > "javax.management.InstanceAlreadyExistsException" > - > > Key: FLINK-8093 > URL: https://issues.apache.org/jira/browse/FLINK-8093 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.3.2 > Environment: flink 1.3.2, kafka 0.9.1 >Reporter: dongtingting >Assignee: Bastien DINE >Priority: Critical > > One taskmanager has multiple task slots. One task failed because creating a > KafkaProducer failed; the reason is > "javax.management.InstanceAlreadyExistsException: > kafka.producer:type=producer-metrics,client-id=producer-3". The detailed trace > is: > 2017-11-04 19:41:23,281 INFO org.apache.flink.runtime.taskmanager.Task > - Source: Custom Source -> Filter -> Map -> Filter -> Sink: > dp_client_**_log (7/80) (99551f3f892232d7df5eb9060fa9940c) switched from > RUNNING to FAILED.
> org.apache.kafka.common.KafkaException: Failed to construct kafka producer > at > org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:321) > at > org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:181) > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerBase.getKafkaProducer(FlinkKafkaProducerBase.java:202) > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerBase.open(FlinkKafkaProducerBase.java:212) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:111) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:375) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:252) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.kafka.common.KafkaException: Error registering mbean > kafka.producer:type=producer-metrics,client-id=producer-3 > at > org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:159) > at > org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:77) > at > org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:288) > at org.apache.kafka.common.metrics.Metrics.addMetric(Metrics.java:255) > at org.apache.kafka.common.metrics.Metrics.addMetric(Metrics.java:239) > at > org.apache.kafka.clients.producer.internals.RecordAccumulator.registerMetrics(RecordAccumulator.java:137) > at > org.apache.kafka.clients.producer.internals.RecordAccumulator.(RecordAccumulator.java:111) > at > org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:261) > ... 
9 more > Caused by: javax.management.InstanceAlreadyExistsException: > kafka.producer:type=producer-metrics,client-id=producer-3 > at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324) > at > com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522) > at > org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:157) > ... 16 more > I suspect that tasks in different task slots of one taskmanager use different > classloaders, and the client id may be the same within one process, so > creating the KafkaProducer fails in one taskmanager. > Has anybody encountered the same problem?
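For illustration, the collision above can be mimicked with a toy registry. This is a sketch only: the registry mimics the JMX MBean server's duplicate-name behavior, and the unique-suffix helper shown is a common workaround for duplicate metric names, not something taken from this ticket.

```python
import itertools

class MBeanRegistry:
    """Toy stand-in for the JMX MBean server: registering the same
    object name twice fails, like InstanceAlreadyExistsException."""
    def __init__(self):
        self._beans = {}

    def register(self, name):
        if name in self._beans:
            raise RuntimeError("InstanceAlreadyExists: " + name)
        self._beans[name] = object()

_ids = itertools.count()

def make_client_id(prefix="producer"):
    # Hypothetical helper: append a process-unique suffix so two
    # producers in the same process never collide on the metrics name.
    return "%s-%d" % (prefix, next(_ids))

registry = MBeanRegistry()
registry.register("kafka.producer:client-id=" + make_client_id())
registry.register("kafka.producer:client-id=" + make_client_id())  # distinct id, no collision
```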
[GitHub] [flink-statefun] tzulitai closed pull request #34: [FLINK-16272] Move e2e test utilities to a statefun-e2e-tests-common module
tzulitai closed pull request #34: [FLINK-16272] Move e2e test utilities to a statefun-e2e-tests-common module URL: https://github.com/apache/flink-statefun/pull/34 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink-statefun] tzulitai closed pull request #33: [FLINK-16244] Flush async operations lazily
tzulitai closed pull request #33: [FLINK-16244] Flush async operations lazily URL: https://github.com/apache/flink-statefun/pull/33
[jira] [Assigned] (FLINK-8093) flink job fail because of kafka producer create fail of "javax.management.InstanceAlreadyExistsException"
[ https://issues.apache.org/jira/browse/FLINK-8093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nico Kruber reassigned FLINK-8093: -- Assignee: Bastien DINE
[GitHub] [flink] AHeise commented on a change in pull request #11177: [FLINK-16219][runtime] Made AsyncWaitOperator chainable to non-sources.
AHeise commented on a change in pull request #11177: [FLINK-16219][runtime] Made AsyncWaitOperator chainable to non-sources. URL: https://github.com/apache/flink/pull/11177#discussion_r384319001 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/ChainingStrategy.java ## @@ -38,16 +38,46 @@ * To optimize performance, it is generally a good practice to allow maximal * chaining and increase operator parallelism. */ - ALWAYS, + ALWAYS { + @Override + public boolean canBeChainedTo(StreamOperatorFactory headOperator) { + return true; + } + }, /** * The operator will not be chained to the preceding or succeeding operators. */ - NEVER, + NEVER { + @Override + public boolean canBeChainedTo(StreamOperatorFactory headOperator) { + return false; + } + }, /** * The operator will not be chained to the predecessor, but successors may chain to this * operator. */ - HEAD + HEAD { + @Override + public boolean canBeChainedTo(StreamOperatorFactory headOperator) { + return false; + } + }, + + /** +* Operators will be eagerly chained whenever possible, except after legacy sources. +* +* Operators that will not work properly when processInput is called from another thread must use this strategy +* instead of {@link #ALWAYS}. +*/ + HEAD_AFTER_LEGACY_SOURCE { + @Override + public boolean canBeChainedTo(StreamOperatorFactory headOperator) { + return !headOperator.isStreamSource(); + } + }; + + public abstract boolean canBeChainedTo(StreamOperatorFactory headOperator); Review comment: Is yielding the root cause of not being chainable to legacy sources? If so, then this is a valid way. I'm concerned, though, that we also disallow chaining of operators that just need access to the mailboxExecutor for submit.
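The per-constant overrides in the diff can be condensed into a small sketch. This is a Python re-implementation for illustration only; Flink's actual method takes a StreamOperatorFactory, modeled here as a boolean "is the head a legacy stream source" flag.

```python
from enum import Enum

class ChainingStrategy(Enum):
    ALWAYS = "always"
    NEVER = "never"
    HEAD = "head"
    HEAD_AFTER_LEGACY_SOURCE = "head_after_legacy_source"

    def can_be_chained_to(self, head_is_legacy_source: bool) -> bool:
        # Mirrors the per-constant overrides in the diff: ALWAYS chains
        # everywhere, NEVER and HEAD refuse to be chained to any
        # predecessor, and the new strategy refuses only legacy sources.
        if self is ChainingStrategy.ALWAYS:
            return True
        if self is ChainingStrategy.HEAD_AFTER_LEGACY_SOURCE:
            return not head_is_legacy_source
        return False
```

This makes the new semantics easy to see: HEAD_AFTER_LEGACY_SOURCE behaves like ALWAYS everywhere except directly after a legacy source, where it behaves like HEAD.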
[jira] [Commented] (FLINK-16282) Wrong exception using DESCRIBE SQL command
[ https://issues.apache.org/jira/browse/FLINK-16282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045216#comment-17045216 ] Nico Kruber commented on FLINK-16282: - Yes, this ticket is about the error message from trying to use {{DESCRIBE}} > Wrong exception using DESCRIBE SQL command > -- > > Key: FLINK-16282 > URL: https://issues.apache.org/jira/browse/FLINK-16282 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.10.0 >Reporter: Nico Kruber >Priority: Major > > When trying to describe a table like this > {code:java} > Table facttable = tEnv.sqlQuery("DESCRIBE fact_table"); > {code} > currently, you get a strange exception which should rather be a "not > supported" exception > {code} > Exception in thread "main" org.apache.flink.table.api.ValidationException: > SQL validation failed. From line 1, column 10 to line 1, column 19: Column > 'fact_table' not found in any table > at > org.apache.flink.table.calcite.FlinkPlannerImpl.validateInternal(FlinkPlannerImpl.scala:130) > at > org.apache.flink.table.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:105) > at > org.apache.flink.table.sqlexec.SqlToOperationConverter.convert(SqlToOperationConverter.java:124) > at org.apache.flink.table.planner.ParserImpl.parse(ParserImpl.java:66) > at > org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:464) > at com.ververica.LateralTableJoin.main(LateralTableJoin.java:92) > Caused by: org.apache.calcite.runtime.CalciteContextException: From line 1, > column 10 to line 1, column 19: Column 'fact_table' not found in any table > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > 
java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490) > at > org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463) > at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:834) > at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:819) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:4841) > at > org.apache.calcite.sql.validate.DelegatingScope.fullyQualify(DelegatingScope.java:259) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateIdentifier(SqlValidatorImpl.java:2943) > at > org.apache.calcite.sql.SqlIdentifier.validateExpr(SqlIdentifier.java:297) > at org.apache.calcite.sql.SqlOperator.validateCall(SqlOperator.java:407) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateCall(SqlValidatorImpl.java:5304) > at org.apache.calcite.sql.SqlCall.validate(SqlCall.java:116) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:943) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:650) > at > org.apache.flink.table.calcite.FlinkPlannerImpl.validateInternal(FlinkPlannerImpl.scala:126) > ... 5 more > Caused by: org.apache.calcite.sql.validate.SqlValidatorException: Column > 'fact_table' not found in any table > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490) > at > org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463) > at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:572) > ... 
17 more > {code}
[GitHub] [flink] flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591220639 ## CI report: * 28cbb454934766dd36723d24c2e490282778d83d Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/15051) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5599) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem
flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem URL: https://github.com/apache/flink/pull/10890#issuecomment-575711704 ## CI report: * 1818f86b4da88d37a6d1f0c4a7b4b1f4fd41f961 Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/150593392) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] AHeise commented on a change in pull request #11177: [FLINK-16219][runtime] Made AsyncWaitOperator chainable to non-sources.
AHeise commented on a change in pull request #11177: [FLINK-16219][runtime] Made AsyncWaitOperator chainable to non-sources. URL: https://github.com/apache/flink/pull/11177#discussion_r384320045 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGenerator.java ## @@ -602,9 +602,8 @@ public static boolean isChainable(StreamEdge edge, StreamGraph streamGraph) { && outOperator != null && headOperator != null && upStreamVertex.isSameSlotSharingGroup(downStreamVertex) - && outOperator.getChainingStrategy() == ChainingStrategy.ALWAYS - && (headOperator.getChainingStrategy() == ChainingStrategy.HEAD || - headOperator.getChainingStrategy() == ChainingStrategy.ALWAYS) + && outOperator.getChainingStrategy().canBeChainedTo(headOperator) + && headOperator.getChainingStrategy() != ChainingStrategy.NEVER Review comment: A different question, imho. First: does the upstream support chaining at all? Second: is this specific downstream chainable to the upstream?
[GitHub] [flink] AHeise commented on a change in pull request #11177: [FLINK-16219][runtime] Made AsyncWaitOperator chainable to non-sources.
AHeise commented on a change in pull request #11177: [FLINK-16219][runtime] Made AsyncWaitOperator chainable to non-sources. URL: https://github.com/apache/flink/pull/11177#discussion_r384319679 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/ChainingStrategy.java ## @@ -38,16 +38,46 @@ * To optimize performance, it is generally a good practice to allow maximal * chaining and increase operator parallelism. */ - ALWAYS, + ALWAYS { + @Override + public boolean canBeChainedTo(StreamOperatorFactory headOperator) { Review comment: Renamed vars in a separate hotfix. For the remaining naming, I think it's the other way around: if I construct head to tail, I'd ask, can this node be chained to the head?
[GitHub] [flink] flinkbot edited a comment on issue #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode
flinkbot edited a comment on issue #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode URL: https://github.com/apache/flink/pull/7368#issuecomment-567435407 ## CI report: * a69f034cd1be52b9d8e2553fef1263f22dc4fb66 Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/150404032) * 1d751512b76dfaa57ec1f8f0519c36dd60e47cff UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink-statefun] tzulitai commented on issue #34: [FLINK-16272] Move e2e test utilities to a statefun-e2e-tests-common module
tzulitai commented on issue #34: [FLINK-16272] Move e2e test utilities to a statefun-e2e-tests-common module URL: https://github.com/apache/flink-statefun/pull/34#issuecomment-591285761 Thanks! Merging ...
[jira] [Commented] (FLINK-16144) Add client.timeout setting and use that for CLI operations
[ https://issues.apache.org/jira/browse/FLINK-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045214#comment-17045214 ] Khaireddine Rezgui commented on FLINK-16144: Hi [~sewen], can you assign this task to me :) ? Thanks. > Add client.timeout setting and use that for CLI operations > -- > > Key: FLINK-16144 > URL: https://issues.apache.org/jira/browse/FLINK-16144 > Project: Flink > Issue Type: Improvement > Components: Command Line Client >Reporter: Aljoscha Krettek >Priority: Major > Fix For: 1.11.0 > > > Currently, the CLI uses the {{akka.client.timeout}} setting. This has > historical reasons but can be very confusing for users. We should introduce a > new setting {{client.timeout}} that is used for the client, with a fallback > to the previous {{akka.client.timeout}}.
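The fallback described in the ticket can be sketched as follows. The two key names come from the ticket; the lookup helper and the default value "60 s" are illustrative assumptions, not Flink's actual ConfigOption API.

```python
def client_timeout(conf: dict, default: str = "60 s") -> str:
    """Resolve the client timeout: prefer the new 'client.timeout'
    key, fall back to the deprecated 'akka.client.timeout'."""
    if "client.timeout" in conf:
        return conf["client.timeout"]
    return conf.get("akka.client.timeout", default)
```

With this resolution order, existing setups that only configure {{akka.client.timeout}} keep working, while the new key wins whenever both are present.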
[GitHub] [flink] AHeise commented on a change in pull request #11177: [FLINK-16219][runtime] Made AsyncWaitOperator chainable to non-sources.
AHeise commented on a change in pull request #11177: [FLINK-16219][runtime] Made AsyncWaitOperator chainable to non-sources. URL: https://github.com/apache/flink/pull/11177#discussion_r384318641 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/ChainingStrategy.java ## + public abstract boolean canBeChainedTo(StreamOperatorFactory headOperator); Review comment: It was not. Good catch.
[jira] [Commented] (FLINK-16267) Flink uses more memory than taskmanager.memory.process.size in Kubernetes
[ https://issues.apache.org/jira/browse/FLINK-16267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045213#comment-17045213 ] ChangZhuo Chen (陳昌倬) commented on FLINK-16267: -- Hi [~liyu], * We have set `state.backend.rocksdb.memory.managed: false` to see if the problem remains. * For the memory setting, we set up a dashboard to monitor JVM heap usage (Prometheus metric name: flink_taskmanager_Status_JVM_Memory_Heap_Used), and we found that the memory usage is low compared to the k8s resources configuration, so: ** In 1.9.1, we increased `taskmanager.heap.size` from `1024m` (default from template) to `2000m` to see if throughput could be improved. ** In 1.10.0, we increased `taskmanager.memory.process.size` to match the k8s resource configuration (4G), and started to lower the value due to `OOMKilled`. Hi [~yunta], * We have set `state.backend.rocksdb.metrics.block-cache-usage: true` and will scrape the statistics afterwards. * We have 1 BroadcastProcessFunction, and 1 KeyedProcessFunction with the following states: ** BroadcastProcessFunction *** 1 ListState ** KeyedProcessFunction *** 1 ValueState *** 3 MapState * For 1 MapState in the KeyedProcessFunction, we iterate over it often as part of our business logic. * As for `How many RocksDB instances per slot?`, we will update when we get the answer from the logs. > Flink uses more memory than taskmanager.memory.process.size in Kubernetes > - > > Key: FLINK-16267 > URL: https://issues.apache.org/jira/browse/FLINK-16267 > Project: Flink > Issue Type: Bug > Components: Runtime / Task >Affects Versions: 1.10.0 >Reporter: ChangZhuo Chen (陳昌倬) >Priority: Major > Attachments: flink-conf_1.10.0.yaml, flink-conf_1.9.1.yaml, > oomkilled_taskmanager.log > > > This issue is from > [https://stackoverflow.com/questions/60336764/flink-uses-more-memory-than-taskmanager-memory-process-size-in-kubernetes] > h1. 
Description > * In Flink 1.10.0, we try to use `taskmanager.memory.process.size` to limit > the resources used by the taskmanagers to ensure they are not killed by Kubernetes. > However, we still get lots of taskmanagers `OOMKilled`. The setup is in the > following section. > * The taskmanager log is in attachment [^oomkilled_taskmanager.log]. > h2. Kubernetes > * The Kubernetes setup is the same as described in > [https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/kubernetes.html]. > * The following is the resource configuration for the taskmanager deployment in > Kubernetes: > {{resources:}} > {{ requests:}} > {{ cpu: 1000m}} > {{ memory: 4096Mi}} > {{ limits:}} > {{ cpu: 1000m}} > {{ memory: 4096Mi}} > h2. Flink Docker > * The Flink Docker image is built from the following Dockerfile. > {{FROM flink:1.10-scala_2.11}} > RUN mkdir -p /opt/flink/plugins/s3 && > ln -s /opt/flink/opt/flink-s3-fs-presto-1.10.0.jar /opt/flink/plugins/s3/ > {{RUN ln -s /opt/flink/opt/flink-metrics-prometheus-1.10.0.jar > /opt/flink/lib/}} > h2. Flink Configuration > * The following are all memory-related configurations in `flink-conf.yaml` > in 1.10.0: > {{jobmanager.heap.size: 820m}} > {{taskmanager.memory.jvm-metaspace.size: 128m}} > {{taskmanager.memory.process.size: 4096m}} > * We use RocksDB and we don't set `state.backend.rocksdb.memory.managed` in > `flink-conf.yaml`. > ** Use S3 as checkpoint storage. > * The code uses the DataStream API > ** input/output are both Kafka. > h2. Project Dependencies > * The following are our dependencies. 
> {{val flinkVersion = "1.10.0"}}{{libraryDependencies += > "com.squareup.okhttp3" % "okhttp" % "4.2.2"}} > {{libraryDependencies += "com.typesafe" % "config" % "1.4.0"}} > {{libraryDependencies += "joda-time" % "joda-time" % "2.10.5"}} > {{libraryDependencies += "org.apache.flink" %% "flink-connector-kafka" % > flinkVersion}} > {{libraryDependencies += "org.apache.flink" % "flink-metrics-dropwizard" % > flinkVersion}} > {{libraryDependencies += "org.apache.flink" %% "flink-scala" % flinkVersion > % "provided"}} > {{libraryDependencies += "org.apache.flink" %% "flink-statebackend-rocksdb" > % flinkVersion % "provided"}} > {{libraryDependencies += "org.apache.flink" %% "flink-streaming-scala" % > flinkVersion % "provided"}} > {{libraryDependencies += "org.json4s" %% "json4s-jackson" % "3.6.7"}} > {{libraryDependencies += "org.log4s" %% "log4s" % "1.8.2"}} > {{libraryDependencies += "org.rogach" %% "scallop" % "3.3.1"}} > h2. Previous Flink 1.9.1 Configuration > * The configuration we used in Flink 1.9.1 is the following. It did not > lead to `OOMKilled`. > h3. Kubernetes > {{resources:}} > {{ requests:}} > {{ cpu: 1200m}} > {{ memory: 2G}} > {{ limits:}} > {{ cpu: 1500m}} > {{ memory: 2G}} > h3. Flink 1.9.1 > {{jobmanager.heap.size:
[GitHub] [flink] flinkbot edited a comment on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message
flinkbot edited a comment on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message URL: https://github.com/apache/flink/pull/11219#issuecomment-591257565 ## CI report: * 75bddc5732f99c88a183b683660a85202ac8c5cb Travis: [SUCCESS](https://travis-ci.com/flink-ci/flink/builds/150597282) Azure: [SUCCESS](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5598) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[jira] [Commented] (FLINK-16286) Support select from nothing
[ https://issues.apache.org/jira/browse/FLINK-16286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045203#comment-17045203 ] Kurt Young commented on FLINK-16286: Isn't this already supported? > Support select from nothing > --- > > Key: FLINK-16286 > URL: https://issues.apache.org/jira/browse/FLINK-16286 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.10.0 >Reporter: Jeff Zhang >Priority: Major > > It would be nice to support select from nothing in Flink SQL, e.g. > {code:java} > select "name", 1+1, Date() {code} > This is especially useful when a user wants to try a UDF provided by others > without creating a new table for fake data.
[GitHub] [flink] zhijiangW commented on issue #11155: [FLINK-14818][benchmark] Fix receiving InputGate setup of StreamNetworkBenchmarkEnvironment.
zhijiangW commented on issue #11155: [FLINK-14818][benchmark] Fix receiving InputGate setup of StreamNetworkBenchmarkEnvironment. URL: https://github.com/apache/flink/pull/11155#issuecomment-591277621 @wsry I also updated the benchmark results with the same connection amount in [link](https://github.com/apache/flink/pull/11155#issuecomment-590753152). The conclusion is that all the network throughput cases show a regression except for 100,100ms. I am not sure whether we can explain this corner case properly, but I guess it should not block merging since this is the right way to go.
[GitHub] [flink] flinkbot edited a comment on issue #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
flinkbot edited a comment on issue #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#issuecomment-589965444 ## CI report: * 0cc972cf542776a79767c119b111b3fa2531b9b9 Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/150601347)
[GitHub] [flink] flinkbot edited a comment on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message
flinkbot edited a comment on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message URL: https://github.com/apache/flink/pull/11219#issuecomment-591257565 ## CI report: * 75bddc5732f99c88a183b683660a85202ac8c5cb Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150597282) Azure: [SUCCESS](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5598)
[GitHub] [flink] flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem
flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem URL: https://github.com/apache/flink/pull/10890#issuecomment-575711704 ## CI report: * 1818f86b4da88d37a6d1f0c4a7b4b1f4fd41f961 Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150593392)
[GitHub] [flink-statefun] igalshilman commented on issue #34: [FLINK-16272] Move e2e test utilities to a statefun-e2e-tests-common module
igalshilman commented on issue #34: [FLINK-16272] Move e2e test utilities to a statefun-e2e-tests-common module URL: https://github.com/apache/flink-statefun/pull/34#issuecomment-591276273 Thanks, LGTM!
[GitHub] [flink] PengTaoWW commented on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem
PengTaoWW commented on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem URL: https://github.com/apache/flink/pull/10890#issuecomment-591276048 @flinkbot run travis
[GitHub] [flink] zhijiangW edited a comment on issue #11155: [FLINK-14818][benchmark] Fix receiving InputGate setup of StreamNetworkBenchmarkEnvironment.
zhijiangW edited a comment on issue #11155: [FLINK-14818][benchmark] Fix receiving InputGate setup of StreamNetworkBenchmarkEnvironment. URL: https://github.com/apache/flink/pull/11155#issuecomment-590753152 Also attaching the comparison results executed on the benchmark machine:

Baseline

| Benchmark | Score | Parameters |
| -- | -- | -- |
| DataSkewStreamNetworkThroughputBenchmarkExecutor | 51977.13475 | |
| StreamNetworkBroadcastThroughputBenchmarkExecutor | 1164.463375 | |
| StreamNetworkThroughputBenchmarkExecutor | 69553.22069 | 100,100ms |
| StreamNetworkThroughputBenchmarkExecutor | 17430.11747 | 100,100ms,SSL |
| StreamNetworkThroughputBenchmarkExecutor | 51019.51607 | 1000,1ms |
| StreamNetworkThroughputBenchmarkExecutor | 55423.23278 | 1000,100ms |
| StreamNetworkThroughputBenchmarkExecutor | 14924.65899 | 1000,100ms,SSL |

Less gates with less connections

| Benchmark | Score | Parameters |
| -- | -- | -- |
| DataSkewStreamNetworkThroughputBenchmarkExecutor | 33215.6767 | |
| StreamNetworkBroadcastThroughputBenchmarkExecutor | 1189.239694 | |
| StreamNetworkThroughputBenchmarkExecutor | 72450.69576 | 100,100ms |
| StreamNetworkThroughputBenchmarkExecutor | 17018.46785 | 100,100ms,SSL |
| StreamNetworkThroughputBenchmarkExecutor | 45829.91493 | 1000,1ms |
| StreamNetworkThroughputBenchmarkExecutor | 52943.20546 | 1000,100ms |
| StreamNetworkThroughputBenchmarkExecutor | 13762.96631 | 1000,100ms,SSL |

Less gates with same connections

| Benchmark | Score | Parameters |
| -- | -- | -- |
| DataSkewStreamNetworkThroughputBenchmarkExecutor | 32342.12887 | |
| StreamNetworkBroadcastThroughputBenchmarkExecutor | 1136.44912 | |
| StreamNetworkThroughputBenchmarkExecutor | 75151.951622 | 100,100ms |
| StreamNetworkThroughputBenchmarkExecutor | 16878.544466 | 100,100ms,SSL |
| StreamNetworkThroughputBenchmarkExecutor | 47041.479598 | 1000,1ms |
| StreamNetworkThroughputBenchmarkExecutor | 52384.517795 | 1000,100ms |
| StreamNetworkThroughputBenchmarkExecutor | 13824.809088 | 1000,100ms,SSL |
[GitHub] [flink] zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#discussion_r384304115 ## File path: flink-streaming-java/src/test/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGeneratorTest.java ## @@ -785,10 +785,19 @@ public void testSlotSharingOnAllVerticesInSameSlotSharingGroupByDefaultEnabled() final List verticesSorted = jobGraph.getVerticesSortedTopologicallyFromSources(); assertEquals(4, verticesSorted.size()); - final JobVertex source1Vertex = verticesSorted.get(0); - final JobVertex source2Vertex = verticesSorted.get(1); - final JobVertex map1Vertex = verticesSorted.get(2); - final JobVertex map2Vertex = verticesSorted.get(3); + JobVertex source1Vertex = null, source2Vertex = null, map1Vertex = null, map2Vertex= null; + for (int i = 0; i < 4; i++) + { Review comment: The left brace should be in the same line with the for statement. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#discussion_r384305119 ## File path: flink-streaming-java/src/test/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGeneratorTest.java ## @@ -785,10 +785,19 @@ public void testSlotSharingOnAllVerticesInSameSlotSharingGroupByDefaultEnabled() final List verticesSorted = jobGraph.getVerticesSortedTopologicallyFromSources(); assertEquals(4, verticesSorted.size()); - final JobVertex source1Vertex = verticesSorted.get(0); - final JobVertex source2Vertex = verticesSorted.get(1); - final JobVertex map1Vertex = verticesSorted.get(2); - final JobVertex map2Vertex = verticesSorted.get(3); + JobVertex source1Vertex = null, source2Vertex = null, map1Vertex = null, map2Vertex= null; + for (int i = 0; i < 4; i++) + { + JobVertex vertex = verticesSorted.get(i); + if (vertex.getName().equals("Source: source1")) Review comment: I'd prefer to change the matching to `if (vertex.getName().contains("source1"))`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#discussion_r384304446 ## File path: flink-streaming-java/src/test/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGeneratorTest.java ## @@ -785,10 +785,19 @@ public void testSlotSharingOnAllVerticesInSameSlotSharingGroupByDefaultEnabled() final List verticesSorted = jobGraph.getVerticesSortedTopologicallyFromSources(); assertEquals(4, verticesSorted.size()); - final JobVertex source1Vertex = verticesSorted.get(0); - final JobVertex source2Vertex = verticesSorted.get(1); - final JobVertex map1Vertex = verticesSorted.get(2); - final JobVertex map2Vertex = verticesSorted.get(3); + JobVertex source1Vertex = null, source2Vertex = null, map1Vertex = null, map2Vertex= null; + for (int i = 0; i < 4; i++) + { + JobVertex vertex = verticesSorted.get(i); + if (vertex.getName().equals("Source: source1")) Review comment: if body should always be surrounded by braces. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#discussion_r384303864 ## File path: flink-streaming-java/src/test/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGeneratorTest.java ## @@ -785,10 +785,19 @@ public void testSlotSharingOnAllVerticesInSameSlotSharingGroupByDefaultEnabled() final List verticesSorted = jobGraph.getVerticesSortedTopologicallyFromSources(); assertEquals(4, verticesSorted.size()); - final JobVertex source1Vertex = verticesSorted.get(0); - final JobVertex source2Vertex = verticesSorted.get(1); - final JobVertex map1Vertex = verticesSorted.get(2); - final JobVertex map2Vertex = verticesSorted.get(3); + JobVertex source1Vertex = null, source2Vertex = null, map1Vertex = null, map2Vertex= null; Review comment: we should not declare multiple variables in one line. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#discussion_r384305514 ## File path: flink-streaming-java/src/test/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGeneratorTest.java ## @@ -805,10 +814,19 @@ public void testSlotSharingOnAllVerticesInSameSlotSharingGroupByDefaultDisabled( final List verticesSorted = jobGraph.getVerticesSortedTopologicallyFromSources(); assertEquals(4, verticesSorted.size()); - final JobVertex source1Vertex = verticesSorted.get(0); - final JobVertex source2Vertex = verticesSorted.get(1); - final JobVertex map1Vertex = verticesSorted.get(2); - final JobVertex map2Vertex = verticesSorted.get(3); + JobVertex source1Vertex = null, source2Vertex = null, map1Vertex = null, map2Vertex= null; + for (int i = 0; i < 4; i++) Review comment: we could extract this matching block to a common method to avoid duplication. It can return a list of JobVertices in the expected order. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
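A rough sketch of the extraction the reviewer suggests: a helper that returns the vertices in the order of the given name fragments, so the assertions no longer depend on topological sort order. `JobVertex` below is a minimal stand-in for Flink's class, and the helper name is illustrative, not from the actual PR:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for org.apache.flink.runtime.jobgraph.JobVertex (illustration only).
class JobVertex {
    private final String name;

    JobVertex(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

final class VertexLookup {

    // For each name fragment, find the first vertex whose name contains it,
    // returning the matches in the order the fragments were given.
    static List<JobVertex> verticesInExpectedOrder(List<JobVertex> vertices, String... nameFragments) {
        List<JobVertex> ordered = new ArrayList<>();
        for (String fragment : nameFragments) {
            for (JobVertex vertex : vertices) {
                if (vertex.getName().contains(fragment)) {
                    ordered.add(vertex);
                    break;
                }
            }
        }
        return ordered;
    }
}
```

Both tests could then call `verticesInExpectedOrder(verticesSorted, "source1", "source2", "map1", "map2")` and index into the result, regardless of how the topological sort orders the sources.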
[GitHub] [flink] HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of
HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of Python TableFunction could work URL: https://github.com/apache/flink/pull/11130#discussion_r384298074 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/nodes/common/CommonPythonCorrelate.scala ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.planner.plan.nodes.common + +import org.apache.calcite.rel.`type`.RelDataType +import org.apache.calcite.rel.core.JoinRelType +import org.apache.calcite.rex.{RexCall, RexInputRef, RexNode} +import org.apache.flink.api.dag.Transformation +import org.apache.flink.configuration.Configuration +import org.apache.flink.streaming.api.operators.OneInputStreamOperator +import org.apache.flink.streaming.api.transformations.OneInputTransformation +import org.apache.flink.table.dataformat.BaseRow +import org.apache.flink.table.functions.python.PythonFunctionInfo +import org.apache.flink.table.planner.calcite.FlinkTypeFactory +import org.apache.flink.table.planner.plan.nodes.common.CommonPythonCorrelate.PYTHON_TABLE_FUNCTION_OPERATOR_NAME +import org.apache.flink.table.planner.plan.nodes.logical.FlinkLogicalTableFunctionScan +import org.apache.flink.table.runtime.typeutils.BaseRowTypeInfo +import org.apache.flink.table.types.logical.RowType + +import scala.collection.mutable + +trait CommonPythonCorrelate extends CommonPythonBase { + def getPythonTableFunctionOperator( Review comment: protected? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
flinkbot edited a comment on issue #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#issuecomment-589965444 ## CI report: * 7f887f58c7bd62dbc531b5e554555d1f409b3981 Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150152750) * 0cc972cf542776a79767c119b111b3fa2531b9b9 UNKNOWN
[GitHub] [flink] HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of
HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of Python TableFunction could work URL: https://github.com/apache/flink/pull/11130#discussion_r384298384 ## File path: flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/plan/nodes/CommonPythonCorrelate.scala ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.table.plan.nodes + +import org.apache.calcite.rel.core.JoinRelType +import org.apache.calcite.rex.{RexCall, RexInputRef, RexNode} +import org.apache.flink.configuration.Configuration +import org.apache.flink.streaming.api.operators.OneInputStreamOperator +import org.apache.flink.table.functions.python.PythonFunctionInfo +import org.apache.flink.table.plan.nodes.CommonPythonCorrelate.PYTHON_TABLE_FUNCTION_OPERATOR_NAME +import org.apache.flink.table.runtime.types.CRow +import org.apache.flink.table.types.logical.RowType + +import scala.collection.mutable + +trait CommonPythonCorrelate extends CommonPythonBase { + def getPythonTableFunctionOperator( Review comment: private? 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of
HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of Python TableFunction could work URL: https://github.com/apache/flink/pull/11130#discussion_r384298406 ## File path: flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/plan/nodes/CommonPythonCorrelate.scala ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.flink.table.plan.nodes + +import org.apache.calcite.rel.core.JoinRelType +import org.apache.calcite.rex.{RexCall, RexInputRef, RexNode} +import org.apache.flink.configuration.Configuration +import org.apache.flink.streaming.api.operators.OneInputStreamOperator +import org.apache.flink.table.functions.python.PythonFunctionInfo +import org.apache.flink.table.plan.nodes.CommonPythonCorrelate.PYTHON_TABLE_FUNCTION_OPERATOR_NAME +import org.apache.flink.table.runtime.types.CRow +import org.apache.flink.table.types.logical.RowType + +import scala.collection.mutable + +trait CommonPythonCorrelate extends CommonPythonBase { + def getPythonTableFunctionOperator( + config: Configuration, + inputRowType: RowType, + outputRowType: RowType, + pythonFunctionInfo: PythonFunctionInfo, + udtfInputOffsets: Array[Int], + joinType: JoinRelType): OneInputStreamOperator[CRow, CRow] = { +val clazz = loadClass(PYTHON_TABLE_FUNCTION_OPERATOR_NAME) +val ctor = clazz.getConstructor( + classOf[Configuration], + classOf[PythonFunctionInfo], + classOf[RowType], + classOf[RowType], + classOf[Array[Int]], + classOf[JoinRelType]) +ctor.newInstance( + config, + pythonFunctionInfo, + inputRowType, + outputRowType, + udtfInputOffsets, + joinType) + .asInstanceOf[OneInputStreamOperator[CRow, CRow]] + } + + private[flink] def extractPythonTableFunctionInfo( Review comment: private? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591220639 ## CI report: * 3b78c9b691fc40bd24f50103b704bf9e1c6396ea Travis: [SUCCESS](https://travis-ci.com/flink-ci/flink/builds/150586977) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5594) * 28cbb454934766dd36723d24c2e490282778d83d Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/15051) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of
HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of Python TableFunction could work URL: https://github.com/apache/flink/pull/11130#discussion_r384298213 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/nodes/common/CommonPythonCorrelate.scala ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.planner.plan.nodes.common + +import org.apache.calcite.rel.`type`.RelDataType +import org.apache.calcite.rel.core.JoinRelType +import org.apache.calcite.rex.{RexCall, RexInputRef, RexNode} +import org.apache.flink.api.dag.Transformation +import org.apache.flink.configuration.Configuration +import org.apache.flink.streaming.api.operators.OneInputStreamOperator +import org.apache.flink.streaming.api.transformations.OneInputTransformation +import org.apache.flink.table.dataformat.BaseRow +import org.apache.flink.table.functions.python.PythonFunctionInfo +import org.apache.flink.table.planner.calcite.FlinkTypeFactory +import org.apache.flink.table.planner.plan.nodes.common.CommonPythonCorrelate.PYTHON_TABLE_FUNCTION_OPERATOR_NAME +import org.apache.flink.table.planner.plan.nodes.logical.FlinkLogicalTableFunctionScan +import org.apache.flink.table.runtime.typeutils.BaseRowTypeInfo +import org.apache.flink.table.types.logical.RowType + +import scala.collection.mutable + +trait CommonPythonCorrelate extends CommonPythonBase { + def getPythonTableFunctionOperator( + config: Configuration, + inputRowType: BaseRowTypeInfo, + outputRowType: BaseRowTypeInfo, + pythonFunctionInfo: PythonFunctionInfo, + udtfInputOffsets: Array[Int], + joinType: JoinRelType): OneInputStreamOperator[BaseRow, BaseRow] = { +val clazz = loadClass(PYTHON_TABLE_FUNCTION_OPERATOR_NAME) +val ctor = clazz.getConstructor( + classOf[Configuration], + classOf[PythonFunctionInfo], + classOf[RowType], + classOf[RowType], + classOf[Array[Int]], + classOf[JoinRelType]) +ctor.newInstance( + config, + pythonFunctionInfo, + inputRowType.toRowType, + outputRowType.toRowType, + udtfInputOffsets, + joinType) + .asInstanceOf[OneInputStreamOperator[BaseRow, BaseRow]] + } + + private[flink] def extractPythonTableFunctionInfo( Review comment: protected? This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-16286) Support select from nothing
[ https://issues.apache.org/jira/browse/FLINK-16286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated FLINK-16286: --- Component/s: Table SQL / API > Support select from nothing > --- > > Key: FLINK-16286 > URL: https://issues.apache.org/jira/browse/FLINK-16286 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.10.0 >Reporter: Jeff Zhang >Priority: Major > > It would be nice to support select from nothing in Flink SQL, e.g. > {code:java} > select "name", 1+1, Date() {code} > This is especially useful when users want to try a UDF provided by others > without creating a new table for fake data. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-16286) Support select from nothing
Jeff Zhang created FLINK-16286: -- Summary: Support select from nothing Key: FLINK-16286 URL: https://issues.apache.org/jira/browse/FLINK-16286 Project: Flink Issue Type: New Feature Affects Versions: 1.10.0 Reporter: Jeff Zhang It would be nice to support select from nothing in Flink SQL, e.g. {code:java} select "name", 1+1, Date() {code} This is especially useful when users want to try a UDF provided by others without creating a new table for fake data. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-16251) Optimize the cost of function call in ScalarFunctionOpertation
[ https://issues.apache.org/jira/browse/FLINK-16251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu closed FLINK-16251. --- Resolution: Fixed Merged to master via 1ada2997254d08b0baa15484d68b93250098289c > Optimize the cost of function call in ScalarFunctionOpertation > --- > > Key: FLINK-16251 > URL: https://issues.apache.org/jira/browse/FLINK-16251 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: Huang Xingbo >Assignee: Huang Xingbo >Priority: Major > Labels: pull-request-available > Fix For: 1.11.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Currently, there is too much extra function call cost in > ScalarFunctionOpertation. We need to optimize it to improve the performance of > Python UDFs. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] dianfu merged pull request #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation
dianfu merged pull request #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation URL: https://github.com/apache/flink/pull/11203 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591220639 ## CI report: * 3b78c9b691fc40bd24f50103b704bf9e1c6396ea Travis: [SUCCESS](https://travis-ci.com/flink-ci/flink/builds/150586977) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5594) * 28cbb454934766dd36723d24c2e490282778d83d UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] cpugputpu commented on issue #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
cpugputpu commented on issue #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#issuecomment-591267279 Thanks for your valuable comments! I am sorry that my previous fix was not a good one. I have now fixed it without changing StreamGraph.java. @flinkbot run travis This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
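For context on the fix above: `LinkedHashSet` iterates in insertion order, whereas `HashSet` iteration order depends on element hash codes and table capacity, so it can differ across JVM versions and runs. A minimal sketch (the element names are illustrative, not taken from the PR):

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class DeterministicOrder {
    public static void main(String[] args) {
        // LinkedHashSet remembers insertion order, so iteration (and any
        // output derived from it) is reproducible.
        Set<String> edges = new LinkedHashSet<>();
        edges.add("sink-2");
        edges.add("sink-0");
        edges.add("sink-1");
        // With a plain HashSet, this join could come out in any order.
        System.out.println(String.join(",", edges)); // prints sink-2,sink-0,sink-1
    }
}
```

This is why swapping `HashSet` for `LinkedHashSet` is the usual minimal fix for tests or plans that implicitly depend on iteration order.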
[GitHub] [flink] HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of
HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of Python TableFunction could work URL: https://github.com/apache/flink/pull/11130#discussion_r384298384 ## File path: flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/plan/nodes/CommonPythonCorrelate.scala ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.table.plan.nodes + +import org.apache.calcite.rel.core.JoinRelType +import org.apache.calcite.rex.{RexCall, RexInputRef, RexNode} +import org.apache.flink.configuration.Configuration +import org.apache.flink.streaming.api.operators.OneInputStreamOperator +import org.apache.flink.table.functions.python.PythonFunctionInfo +import org.apache.flink.table.plan.nodes.CommonPythonCorrelate.PYTHON_TABLE_FUNCTION_OPERATOR_NAME +import org.apache.flink.table.runtime.types.CRow +import org.apache.flink.table.types.logical.RowType + +import scala.collection.mutable + +trait CommonPythonCorrelate extends CommonPythonBase { + def getPythonTableFunctionOperator( Review comment: private? 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of
HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of Python TableFunction could work URL: https://github.com/apache/flink/pull/11130#discussion_r384298074 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/nodes/common/CommonPythonCorrelate.scala ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.planner.plan.nodes.common + +import org.apache.calcite.rel.`type`.RelDataType +import org.apache.calcite.rel.core.JoinRelType +import org.apache.calcite.rex.{RexCall, RexInputRef, RexNode} +import org.apache.flink.api.dag.Transformation +import org.apache.flink.configuration.Configuration +import org.apache.flink.streaming.api.operators.OneInputStreamOperator +import org.apache.flink.streaming.api.transformations.OneInputTransformation +import org.apache.flink.table.dataformat.BaseRow +import org.apache.flink.table.functions.python.PythonFunctionInfo +import org.apache.flink.table.planner.calcite.FlinkTypeFactory +import org.apache.flink.table.planner.plan.nodes.common.CommonPythonCorrelate.PYTHON_TABLE_FUNCTION_OPERATOR_NAME +import org.apache.flink.table.planner.plan.nodes.logical.FlinkLogicalTableFunctionScan +import org.apache.flink.table.runtime.typeutils.BaseRowTypeInfo +import org.apache.flink.table.types.logical.RowType + +import scala.collection.mutable + +trait CommonPythonCorrelate extends CommonPythonBase { + def getPythonTableFunctionOperator( Review comment: protected? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message
flinkbot edited a comment on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message URL: https://github.com/apache/flink/pull/11219#issuecomment-591257565 ## CI report: * 75bddc5732f99c88a183b683660a85202ac8c5cb Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150597282) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] wuchong commented on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
wuchong commented on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591261168 @JingsongLi , good catch! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot commented on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message
flinkbot commented on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message URL: https://github.com/apache/flink/pull/11219#issuecomment-591257565 ## CI report: * 75bddc5732f99c88a183b683660a85202ac8c5cb UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11218: [FLINK-16285][network] Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument
flinkbot edited a comment on issue #11218: [FLINK-16285][network] Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument URL: https://github.com/apache/flink/pull/11218#issuecomment-591253355 ## CI report: * 7b8f569390e60c187f3282763813b44a43ff9cc9 Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/150596303) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-16274) Add typed builder methods for setting dynamic configuration on StatefulFunctionsAppContainers
[ https://issues.apache.org/jira/browse/FLINK-16274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-16274: --- Labels: pull-request-available (was: ) > Add typed builder methods for setting dynamic configuration on > StatefulFunctionsAppContainers > - > > Key: FLINK-16274 > URL: https://issues.apache.org/jira/browse/FLINK-16274 > Project: Flink > Issue Type: New Feature > Components: Stateful Functions, Test Infrastructure >Affects Versions: statefun-1.1 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai >Priority: Major > Labels: pull-request-available > > Excerpt from: > https://github.com/apache/flink-statefun/pull/32#discussion_r383644382 > Currently, you'd need to provide a complete {{Configuration}} as dynamic > properties when constructing a {{StatefulFunctionsAppContainers}}. > It'll be nicer if this is built like this: > {code} > public StatefulFunctionsAppContainers verificationApp = > new StatefulFunctionsAppContainers("sanity-verification", 2) > .withModuleGlobalConfiguration("kafka-broker", > kafka.getBootstrapServers()) > .withConfiguration(ConfigOption option, configValue) > {code} > And by default the {{StatefulFunctionsAppContainers}} just only has the > configs in the base template {{flink-conf.yaml}}. > This would require lazy construction of the containers on {{beforeTest}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink-statefun] tzulitai opened a new pull request #35: [FLINK-16274] Support inlined configuration in StatefulFunctionsAppContainers
tzulitai opened a new pull request #35: [FLINK-16274] Support inlined configuration in StatefulFunctionsAppContainers URL: https://github.com/apache/flink-statefun/pull/35 This PR is based on #34. The first 4 commits are not relevant. --- ### Goal This PR achieves the following things: - Change the `StatefulFunctionsAppContainers` to a more "builder-ish" approach, in which the user provides settings for how the containers are built, and only in the test's `before()` do we build the containers and application images. This allows a more fluent API to inline configuration settings. - Add inline configuration setting methods, like so: ``` @Rule public StatefulFunctionsAppContainers verificationApp = new StatefulFunctionsAppContainers("app-name", numWorkers) .withModuleGlobalConfiguration("key", "value"); .withConfiguration(ConfigOption option, T value); ``` This also eliminates the need for users of `StatefulFunctionsAppContainers` to maintain a `flink-conf.yaml` in their classpath, as they can now build the configuration fully programmatically with the new inline configuration methods. --- ### Verifying This only changes tests. Run `mvn clean verify -Prun-e2e-tests` to verify the changes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
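The lazy, fluent builder style described in the PR above can be sketched as follows. Every name here is a hypothetical stand-in, not the actual `StatefulFunctionsAppContainers` API: setters only record configuration, and the expensive resource is materialized once, on demand (as the PR does in `before()`):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AppContainersSketch {
    private final String appName;
    private final int numWorkers;
    // Insertion-ordered so the built description is deterministic.
    private final Map<String, String> config = new LinkedHashMap<>();

    public AppContainersSketch(String appName, int numWorkers) {
        this.appName = appName;
        this.numWorkers = numWorkers;
    }

    // Fluent setter: records the value and returns this for chaining.
    // Nothing is built yet, so settings can be added in any order.
    public AppContainersSketch withConfiguration(String key, String value) {
        config.put(key, value);
        return this;
    }

    // Called lazily, e.g. from a JUnit rule's before(): only now is the
    // "container" description actually materialized from the recorded state.
    public String build() {
        return appName + ":" + numWorkers + ":" + config;
    }

    public static void main(String[] args) {
        String built = new AppContainersSketch("sanity-verification", 2)
            .withConfiguration("kafka-broker", "broker:9092")
            .build();
        System.out.println(built); // prints sanity-verification:2:{kafka-broker=broker:9092}
    }
}
```

Deferring construction until `build()` is what makes a fluent configuration API possible on a JUnit rule: the rule object must exist at field-initialization time, before all settings are known.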
[GitHub] [flink] hwanju commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
hwanju commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build URL: https://github.com/apache/flink/pull/11196#issuecomment-591253938 Although I am also not sure what the ramifications of excluding `SdkMBeanRegistrySupport` would be for other dependent classes, the no-op replacement seems safe and clever. Related to the `FlinkKinesisProducer#close()` solution, would it be viable to call `AwsSdkMetrics#unregisterMetricAdminMbean()` inside `BlobLibraryCacheManager#releaseClassLoader` to get that unregistered once all the tasks of the job are closed? But that seems like too specific a call. At any rate, we cannot be sure that removing this alone resolves the leak, right? (if I am reading FLINK-16142 correctly). AFAIR, KPL also has a statically launched daemon thread (the file age manager), which is not shut down on task close and holds on to classes loaded by the user class loader, leading to a leak. In that case, I am not sure how just removing this MBean registry would make things better. We have seen OOM kills with unlimited metaspace, but it seems to hit metaspace OOM before eating up all the memory as of 1.10 (which we have not run yet). Curious whether you have measured the number of class loaders before/after the fix. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot commented on issue #11218: [FLINK-16285][network] Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument
flinkbot commented on issue #11218: [FLINK-16285][network] Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument URL: https://github.com/apache/flink/pull/11218#issuecomment-591253355 ## CI report: * 7b8f569390e60c187f3282763813b44a43ff9cc9 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot commented on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message
flinkbot commented on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message URL: https://github.com/apache/flink/pull/11219#issuecomment-591252807 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 75bddc5732f99c88a183b683660a85202ac8c5cb (Wed Feb 26 05:51:09 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-16257) Remove useless ResultPartitionID from AddCredit message
[ https://issues.apache.org/jira/browse/FLINK-16257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-16257: --- Labels: pull-request-available (was: ) > Remove useless ResultPartitionID from AddCredit message > --- > > Key: FLINK-16257 > URL: https://issues.apache.org/jira/browse/FLINK-16257 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Reporter: Zhijiang >Assignee: Zhijiang >Priority: Minor > Labels: pull-request-available > Fix For: 1.11.0 > > > The ResultPartitionID in the AddCredit message is never used on the upstream > side, so we can remove it to clean up the code. Doing so has two further > benefits: > # Reduce the total message size from 52 bytes to 20 bytes. > # Decouple the dependency on `InputChannel#getPartitionId`. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] zhijiangW opened a new pull request #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message
zhijiangW opened a new pull request #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message URL: https://github.com/apache/flink/pull/11219 ## What is the purpose of the change The `ResultPartitionID` in the AddCredit message is never used on the upstream side, so we can remove it to clean up the code. Doing so has two further benefits: 1. Reduce the total message size from 52 bytes to 20 bytes. 2. Decouple the dependency on `InputChannel#getPartitionId`. ## Brief change log - *Remove `ResultPartitionID` from the `AddCredit` message* ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
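The 52-byte-to-20-byte figure in the PR above is easy to sanity-check: an InputChannelID is two longs (16 bytes), the credit is an int (4 bytes), and a ResultPartitionID wraps two further 16-byte IDs (32 bytes). A minimal sketch of that arithmetic (the field layout here is an assumption for illustration, not Flink's actual wire format):

```java
public class AddCreditSize {
    // Assumed sizes, for illustration only: each ID is a pair of longs.
    static final int ID_BYTES = 2 * Long.BYTES;        // 16
    static final int CREDIT_BYTES = Integer.BYTES;     // 4

    // Old AddCredit: ResultPartitionID (two 16-byte IDs) + InputChannelID + credit.
    static int oldMessageSize() {
        return 2 * ID_BYTES + ID_BYTES + CREDIT_BYTES; // 32 + 16 + 4 = 52
    }

    // New AddCredit: InputChannelID + credit only.
    static int newMessageSize() {
        return ID_BYTES + CREDIT_BYTES;                // 16 + 4 = 20
    }

    public static void main(String[] args) {
        System.out.println(oldMessageSize() + " -> " + newMessageSize()); // 52 -> 20
    }
}
```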
[jira] [Updated] (FLINK-16272) Move Stateful Functions end-to-end test framework classes to common module
[ https://issues.apache.org/jira/browse/FLINK-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-16272: --- Labels: pull-request-available (was: ) > Move Stateful Functions end-to-end test framework classes to common module > -- > > Key: FLINK-16272 > URL: https://issues.apache.org/jira/browse/FLINK-16272 > Project: Flink > Issue Type: Task > Components: Stateful Functions, Test Infrastructure >Affects Versions: statefun-1.1 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai >Priority: Major > Labels: pull-request-available > > There are some end-to-end test utilities that other end-to-end Stateful > Function tests can benefit from, such as: > * {{StatefulFunctionsAppContainers}} > * {{KafkaIOVerifier}} > * {{KafkaProtobufSerDe}} > We should move this to a new common module under > {{statefun-end-to-end-tests}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink-statefun] tzulitai opened a new pull request #34: [FLINK-16272] Move e2e test utilities to a statefun-e2e-tests-common module
tzulitai opened a new pull request #34: [FLINK-16272] Move e2e test utilities to a statefun-e2e-tests-common module URL: https://github.com/apache/flink-statefun/pull/34 This PR achieves 2 simple things: - Move `StatefulFunctionsAppContainers` / `KafkaIOVerifier` / `ProtobufSerDe` to a common `statefun-e2e-tests-common` module so that new e2e tests can use them. - Rename all e2e related classes / packages / modules to `*e2e*`. End-to-end test modules living under `statefun-e2e-tests` should now be named `statefun-*-e2e`, and test classes should be named `*E2E.java`. --- ### Verifying Run `mvn clean verify -Prun-e2e-tests` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-16285) Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument
[ https://issues.apache.org/jira/browse/FLINK-16285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijiang updated FLINK-16285: - Fix Version/s: 1.11.0 > Refactor SingleInputGate#setInputChannel to remove > IntermediateResultPartitionID argument > - > > Key: FLINK-16285 > URL: https://issues.apache.org/jira/browse/FLINK-16285 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Reporter: Zhijiang >Assignee: Zhijiang >Priority: Minor > Labels: pull-request-available > Fix For: 1.11.0 > > Time Spent: 10m > Remaining Estimate: 0h > > The IntermediateResultPartitionID can be obtained directly from the > respective InputChannel, so we can remove it from the arguments of > SingleInputGate#setInputChannel to clean up the code. > This also helps simplify the unit tests and avoids passing an > IntermediateResultPartitionID that is inconsistent with the internal > ResultPartitionID that the respective InputChannel maintains. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-14807) Add Table#collect api for fetching data to client
[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045155#comment-17045155 ] Kurt Young commented on FLINK-14807: A side comment: if we want this API to fully deal with fault tolerance and consistency, it's equivalent to implementing an end-to-end exactly-once sink. > Add Table#collect api for fetching data to client > - > > Key: FLINK-14807 > URL: https://issues.apache.org/jira/browse/FLINK-14807 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.9.1 >Reporter: Jeff Zhang >Priority: Major > Labels: usability > Fix For: 1.11.0 > > Attachments: table-collect.png > > > Currently, it is very inconvenient for users to fetch the data of a Flink job unless > they specify a sink explicitly and then fetch data from this sink via its API (e.g. > write to an HDFS sink, then read the data from HDFS). However, most of the time users > just want to get the data and do whatever processing they want. So it is very > necessary for Flink to provide a Table#collect API for this purpose. > > Other APIs such as Table#head and Table#print would also be helpful. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on issue #11218: [FLINK-16285][network] Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument
flinkbot commented on issue #11218: [FLINK-16285][network] Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument URL: https://github.com/apache/flink/pull/11218#issuecomment-591248775 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit bb563801b8d0724c79847ed62c8a7add2043c48e (Wed Feb 26 05:39:44 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-16285) Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument
[ https://issues.apache.org/jira/browse/FLINK-16285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-16285: --- Labels: pull-request-available (was: ) > Refactor SingleInputGate#setInputChannel to remove > IntermediateResultPartitionID argument > - > > Key: FLINK-16285 > URL: https://issues.apache.org/jira/browse/FLINK-16285 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Reporter: Zhijiang >Assignee: Zhijiang >Priority: Minor > Labels: pull-request-available > > The IntermediateResultPartitionID can be obtained directly from the > respective InputChannel, so we can remove it from the arguments of > SingleInputGate#setInputChannel to clean up the code. > This also helps simplify the unit tests and avoids passing an > IntermediateResultPartitionID that is inconsistent with the internal > ResultPartitionID that the respective InputChannel maintains. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] zhijiangW opened a new pull request #11218: [FLINK-16285][network] Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument
zhijiangW opened a new pull request #11218: [FLINK-16285][network] Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument URL: https://github.com/apache/flink/pull/11218 ## What is the purpose of the change *The `IntermediateResultPartitionID` can be obtained directly from the respective `InputChannel`, so we can remove it from the arguments of `SingleInputGate#setInputChannel` to clean up the code.* *This also helps simplify the unit tests and avoids passing an `IntermediateResultPartitionID` that is inconsistent with the internal `ResultPartitionID` that the respective `InputChannel` maintains.* ## Brief change log - *Remove the `IntermediateResultPartitionID` argument from `SingleInputGate#setInputChannel`* - *Adjust the related unit tests* ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
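The idea behind this refactoring can be illustrated with a hedged sketch: instead of accepting a separate partition-ID argument that may disagree with the channel, the gate derives the map key from the channel itself, so the two can never be inconsistent. The types below are illustrative placeholders, not Flink's real classes:

```java
import java.util.HashMap;
import java.util.Map;

public class GateSketch {
    // Placeholder stand-ins for Flink's ID and channel types.
    static final class PartitionId {
        final String id;
        PartitionId(String id) { this.id = id; }
    }

    static final class Channel {
        private final PartitionId partitionId;
        Channel(PartitionId partitionId) { this.partitionId = partitionId; }
        PartitionId getPartitionId() { return partitionId; }
    }

    final Map<String, Channel> channels = new HashMap<>();

    // After the refactor there is no separate ID argument: the map key is
    // derived from the channel itself, removing a source of inconsistency.
    void setInputChannel(Channel channel) {
        channels.put(channel.getPartitionId().id, channel);
    }

    public static void main(String[] args) {
        GateSketch gate = new GateSketch();
        gate.setInputChannel(new Channel(new PartitionId("partition-1")));
        System.out.println(gate.channels.containsKey("partition-1")); // true
    }
}
```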
[GitHub] [flink] flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem
flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem URL: https://github.com/apache/flink/pull/10890#issuecomment-575711704 ## CI report: * 1818f86b4da88d37a6d1f0c4a7b4b1f4fd41f961 Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/150593392) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (FLINK-16285) Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument
Zhijiang created FLINK-16285: Summary: Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument Key: FLINK-16285 URL: https://issues.apache.org/jira/browse/FLINK-16285 Project: Flink Issue Type: Improvement Components: Runtime / Network Reporter: Zhijiang Assignee: Zhijiang The IntermediateResultPartitionID can be obtained directly from the respective InputChannel, so we can remove it from the arguments of SingleInputGate#setInputChannel to clean up the code. This also helps simplify the unit tests and avoids passing an IntermediateResultPartitionID that is inconsistent with the internal ResultPartitionID that the respective InputChannel maintains. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-16284) Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument
Zhijiang created FLINK-16284: Summary: Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument Key: FLINK-16284 URL: https://issues.apache.org/jira/browse/FLINK-16284 Project: Flink Issue Type: Improvement Components: Runtime / Network Reporter: Zhijiang Assignee: Zhijiang The IntermediateResultPartitionID can be obtained directly from the respective InputChannel, so we can remove it from the arguments of SingleInputGate#setInputChannel to clean up the code. This also helps simplify the unit tests and avoids passing an IntermediateResultPartitionID that is inconsistent with the internal ResultPartitionID that the respective InputChannel maintains. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation
flinkbot edited a comment on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation URL: https://github.com/apache/flink/pull/11203#issuecomment-590636739 ## CI report: * cad308b6fe40defa5b6b33a74aeb8bc6e99a6759 UNKNOWN * 10a803dd074ee8be7ee6e15f93eeb417e2398392 Travis: [SUCCESS](https://travis-ci.com/flink-ci/flink/builds/150588163) Azure: [SUCCESS](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5595) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem
flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem URL: https://github.com/apache/flink/pull/10890#issuecomment-575711704 ## CI report: * 7c3daddc8a7a168912b63b528f719c967a2b76b8 Travis: [SUCCESS](https://travis-ci.com/flink-ci/flink/builds/150448826) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5559) * 1818f86b4da88d37a6d1f0c4a7b4b1f4fd41f961 Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150593392) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem
flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem URL: https://github.com/apache/flink/pull/10890#issuecomment-575711704 ## CI report: * 7c3daddc8a7a168912b63b528f719c967a2b76b8 Travis: [SUCCESS](https://travis-ci.com/flink-ci/flink/builds/150448826) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5559) * 1818f86b4da88d37a6d1f0c4a7b4b1f4fd41f961 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591220639 ## CI report: * 3b78c9b691fc40bd24f50103b704bf9e1c6396ea Travis: [SUCCESS](https://travis-ci.com/flink-ci/flink/builds/150586977) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5594) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Comment Edited] (FLINK-16069) Creation of TaskDeploymentDescriptor can block main thread for long time
[ https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045135#comment-17045135 ] Zhu Zhu edited comment on FLINK-16069 at 2/26/20 4:43 AM: -- Making the building of deployment descriptors more static does not seem to be easy. The critical path is {{TaskDeploymentDescriptorFactory#createInputGateDeploymentDescriptors() -> getConsumedPartitionShuffleDescriptors()}}, which generates the {{InputGateDeploymentDescriptor}} for a task. This relies on {{ShuffleDescriptor}}s, which are created from the producer tasks' {{ResultPartitionDeploymentDescriptor}}s. A {{ResultPartitionDeploymentDescriptor}}, however, is created lazily, i.e. only after its producer has acquired a slot to determine the location. Besides that, even if we could create the {{InputGateDeploymentDescriptor}} in a static way and update it with the {{ResultPartitionDeploymentDescriptor}} when scheduling, it seems we would still not avoid the 8000x8000 complexity of computing the {{ShuffleDescriptor}}s. I'm thinking about whether we can add a shortcut for the ALL-to-ALL pattern by caching generated {{ShuffleDescriptor}}s, e.g. in the case A(parallelism=8000) -> B(parallelism=8000), each {{InputGateDeploymentDescriptor}} of the B instances should contain the same 8000 {{ShuffleDescriptor}}s, so there is no need to generate those 8000 ShuffleDescriptors 8000 times. was (Author: zhuzh): Making building deployment descriptor more static seems not to be an easy work. The critical path is the {{TaskDeploymentDescriptorFactory#createInputGateDeploymentDescriptors() -> getConsumedPartitionShuffleDescriptors()}} which generates {{InputGateDeploymentDescriptor}}s for a task. This relies on {{ShuffleDescriptor}}s which are created from the producer tasks's {{ResultPartitionDeploymentDescriptor}}s. {{ResultPartitionDeploymentDescriptor}}, however, is created lazily, i.e. only after its producer has acquired the slot to determine the location. 
Besides that, even if we can create the {{InputGateDeploymentDescriptor}} in a static way and update it with the {{ResultPartitionDeploymentDescriptor}} when scheduling, seems we do not avoid the 8000x8000 complexity to compute the {{ShuffleDescriptor}}s. I'm thinking whether we can add a shortcut for the ALL-to-ALL pattern by caching generated {{ShuffleDescriptor}}s. e.g. In the case that A(parallelism=8000) -> B(parallelism=8000), each {{InputGateDeploymentDescriptor}} of B instances should contain the same 8000 {{ShuffleDescriptor}}s. So it is not needed to generate these 8000 shuffleDescriptors for 8000 times. > Creation of TaskDeploymentDescriptor can block main thread for long time > > > Key: FLINK-16069 > URL: https://issues.apache.org/jira/browse/FLINK-16069 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Reporter: huweihua >Priority: Major > > Deploying tasks can take a long time when we submit a high-parallelism job, > and Execution#deploy runs in the main thread, so it blocks the JobMaster from > processing other Akka messages, such as heartbeats. The creation of the > TaskDeploymentDescriptor takes most of the time. We can move the creation into a future. > For example, for a job [source(8000)->sink(8000)], moving all 16000 tasks from > SCHEDULED to DEPLOYING took more than 1 minute. This caused the TaskManager > heartbeat to time out and the job never succeeded. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-16069) Creation of TaskDeploymentDescriptor can block main thread for long time
[ https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045135#comment-17045135 ] Zhu Zhu commented on FLINK-16069: - Making the building of deployment descriptors more static does not seem to be easy. The critical path is {{TaskDeploymentDescriptorFactory#createInputGateDeploymentDescriptors() -> getConsumedPartitionShuffleDescriptors()}}, which generates the {{InputGateDeploymentDescriptor}}s for a task. This relies on {{ShuffleDescriptor}}s, which are created from the producer tasks' {{ResultPartitionDeploymentDescriptor}}s. A {{ResultPartitionDeploymentDescriptor}}, however, is created lazily, i.e. only after its producer has acquired a slot to determine the location. Besides that, even if we could create the {{InputGateDeploymentDescriptor}} in a static way and update it with the {{ResultPartitionDeploymentDescriptor}} when scheduling, it seems we would still not avoid the 8000x8000 complexity of computing the {{ShuffleDescriptor}}s. I'm thinking about whether we can add a shortcut for the ALL-to-ALL pattern by caching generated {{ShuffleDescriptor}}s, e.g. in the case A(parallelism=8000) -> B(parallelism=8000), each {{InputGateDeploymentDescriptor}} of the B instances should contain the same 8000 {{ShuffleDescriptor}}s, so there is no need to generate those 8000 ShuffleDescriptors 8000 times. > Creation of TaskDeploymentDescriptor can block main thread for long time > > > Key: FLINK-16069 > URL: https://issues.apache.org/jira/browse/FLINK-16069 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Reporter: huweihua >Priority: Major > > Deploying tasks can take a long time when we submit a high-parallelism job, > and Execution#deploy runs in the main thread, so it blocks the JobMaster from > processing other Akka messages, such as heartbeats. The creation of the > TaskDeploymentDescriptor takes most of the time. We can move the creation into a future. 
> For example, for a job [source(8000)->sink(8000)], moving all 16000 tasks from > SCHEDULED to DEPLOYING took more than 1 minute. This caused the TaskManager > heartbeat to time out and the job never succeeded. -- This message was sent by Atlassian Jira (v8.3.4#803005)
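The caching shortcut discussed in this thread can be sketched as a per-result memo: in an ALL-to-ALL edge every consumer references the same set of producer partitions, so the descriptor array can be computed once and shared, turning an O(consumers x producers) cost into O(producers). A hedged sketch with placeholder types (not Flink's actual classes):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class ShuffleDescriptorCache {
    private final Map<String, Object[]> cache = new HashMap<>();

    // For an ALL-to-ALL IntermediateResult, compute the descriptor array once
    // and hand the same instance to every consumer's gate descriptor.
    Object[] getOrCompute(String resultId, Supplier<Object[]> compute) {
        return cache.computeIfAbsent(resultId, key -> compute.get());
    }

    public static void main(String[] args) {
        ShuffleDescriptorCache cache = new ShuffleDescriptorCache();
        AtomicInteger computations = new AtomicInteger();
        // 8000 consumers ask for the descriptors of the same result...
        for (int consumer = 0; consumer < 8000; consumer++) {
            cache.getOrCompute("result-A", () -> {
                computations.incrementAndGet();
                return new Object[8000]; // one slot per producer partition
            });
        }
        // ...but the expensive computation runs only once.
        System.out.println(computations.get()); // 1
    }
}
```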
[GitHub] [flink] bowenli86 edited a comment on issue #11175: [FLINK-16197][hive] Failed to query partitioned table when partition …
bowenli86 edited a comment on issue #11175: [FLINK-16197][hive] Failed to query partitioned table when partition … URL: https://github.com/apache/flink/pull/11175#issuecomment-591232361 I have a slightly different opinion on this. Though it might mitigate the problem for users, we are indeed trying to hack our way around invalid inputs passed to our APIs. E.g. Flink's file source will fail if the target is missing, rather than mitigating for users. Maybe @KurtYoung or @JingsongLi can give some opinions and help to merge if they think this is a proper fix? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] bowenli86 commented on issue #11175: [FLINK-16197][hive] Failed to query partitioned table when partition …
bowenli86 commented on issue #11175: [FLINK-16197][hive] Failed to query partitioned table when partition … URL: https://github.com/apache/flink/pull/11175#issuecomment-591232361 I have a slightly different opinion on this. Though it might mitigate the problem for users, we are indeed trying to hack our way around invalid inputs passed to our APIs. E.g. Flink's file source will fail if the target is missing, rather than mitigating for users. I'm generally neutral on this PR. Maybe @KurtYoung or @JingsongLi can give some opinions and help to merge if they think this is a proper fix? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation
flinkbot edited a comment on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation URL: https://github.com/apache/flink/pull/11203#issuecomment-590636739 ## CI report: * 66f60932087c6c77fff749c6670f64951db3a40f Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/150406289) * cad308b6fe40defa5b6b33a74aeb8bc6e99a6759 UNKNOWN * 10a803dd074ee8be7ee6e15f93eeb417e2398392 Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150588163) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591220639 ## CI report: * 3b78c9b691fc40bd24f50103b704bf9e1c6396ea Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150586977) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5594) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation
flinkbot edited a comment on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation URL: https://github.com/apache/flink/pull/11203#issuecomment-590636739 ## CI report: * 66f60932087c6c77fff749c6670f64951db3a40f Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/150406289) * cad308b6fe40defa5b6b33a74aeb8bc6e99a6759 UNKNOWN * 10a803dd074ee8be7ee6e15f93eeb417e2398392 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Comment Edited] (FLINK-14807) Add Table#collect api for fetching data to client
[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033373#comment-17033373 ] Caizhi Weng edited comment on FLINK-14807 at 2/26/20 4:01 AM: -- Hi dear Flink community, I'm digging this up because I have some thoughts on implementing this. I'll post my thoughts below and I'd love to hear your opinions. From my understanding, this problem has two key points. One is where to store the (possibly) never-ending results, and the other is that task managers cannot directly communicate with clients under some environments like k8s or yarn. For the never-ending results, back pressure will work to limit the size of data in the whole cluster. For the communication between task managers and clients, the job manager must be the man in the middle, as clients and task managers are guaranteed to directly communicate with the job manager. So I come up with the following design. The new sink will only have a parallelism of 1. The class names and API names below are just placeholders. !table-collect.png|width=600! # When the sink is initialized in a task manager, it creates a socket server for providing the query results. The IP and port of the socket server are given to the job master by the existing GlobalAggregateManager class. # When the client wants to fetch a portion of the query result, it contacts the JobManager. # The JobManager receives the RPC call and asks the socket server for results with a maximum size. # The socket server returns some results to the JobMaster. # The JobMaster forwards the results to the client. Some questions for the design above: * TM memory is protected by back pressure. Do we have to introduce a new config option to set the maximum memory usage of the sink? >> Yes * What if the client disconnects / does not connect? >> The job will not finish as the sink isn't finished. 
The sink blocks on the invoke method if its memory is full, or blocks on the close method if not all results have been read by the client. We can also add a timeout check to the sink and let the sink fail if the client goes away for a certain period of time. * How to deal with retract / upsert streams? >> The return type will be Tuple2 where the first boolean value indicates whether this row is an appending row or a retracting row. * Is the 1st step necessary? >> Yes, because the port of the socket server is unknown before it is created. Some problems to discuss: * Is the whole design necessary? Why don't we use accumulators to store and send results? >> Accumulators cannot deal with large results. But apart from this, accumulators seem to be OK. We can limit the maximum number of rows provided by Table#collect and use accumulators to send results to the JobMaster with each TM heartbeat. After collecting enough results the client can cancel the job. The biggest problem is that for streaming jobs we might have to wait until the next heartbeat (which is 10s by default) to get the results and decide whether to cancel the job. * What if the job restarts? >> This is a question about what kind of API we want to provide, so it is actually a question about motivation. Do we really need a (nearly) production-ready API, or an API just for demo and testing? ** If we can tolerate an at-least-once API this does not seem to be a problem. We can attach the index and the version (increased with each job restart) to each result when sending it back to the client and let the client deal with all the versioning. ** If we want an exactly-once API I'm afraid this is very difficult. For batch jobs we have to use external storage, and for streaming jobs we might have to force users to use checkpoints. Back pressure from the sink may also affect checkpoints. 
was (Author: tsreaper): Hi dear Flink community, I'm digging this up because I have some thoughts on implementing this. I'll post my thoughts below and I'd love to hear your opinions. >From my understanding, this problem has two key points. One is that where to >store the (possibly) never-ending results, and the other is that task managers >can not directly communicate with clients under some environments like k8s or >yarn. For the never-ending results, back pressuring will work to limit the >size of data in the whole cluster. For the communication between task managers >and clients, job manager must be the man in the middle as clients and task >managers are guaranteed to directly communicate with job manager. So I come up with the following design. The new sink will only have 1 parallelism. The class names and API names below are just placeholders. !table-collect.png|width=600! # When the sink is initialized in task managers, it will create a socket server for providing the query results. The IP and port of the socket server will be given to the
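The retract/upsert handling described in the comment above (each emitted element pairs a boolean with a row, where the boolean marks the row as an append or a retraction) can be illustrated with a minimal, self-contained sketch. This is plain Java with no Flink dependencies; the class and method names are illustrative, not Flink API:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal illustration of the changelog encoding discussed above:
// each element is a (boolean, row) pair, where true means "append this
// row" and false means "retract a previously emitted row".
public class ChangelogFold {

    static final class Change {
        final boolean isAppend;
        final String row;

        Change(boolean isAppend, String row) {
            this.isAppend = isAppend;
            this.row = row;
        }
    }

    // Materialize the current result from a changelog: appends add a row,
    // retractions remove one matching occurrence.
    static List<String> materialize(List<Change> changes) {
        List<String> result = new ArrayList<>();
        for (Change c : changes) {
            if (c.isAppend) {
                result.add(c.row);
            } else {
                result.remove(c.row); // removes the first matching occurrence
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Change> log = new ArrayList<>();
        log.add(new Change(true, "a=1"));
        log.add(new Change(true, "a=2"));
        log.add(new Change(false, "a=1")); // retraction of the earlier row
        System.out.println(materialize(log)); // prints [a=2]
    }
}
```

A client consuming such a retract stream would apply essentially this fold to materialize the current table from the changelog it receives.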
[jira] [Commented] (FLINK-16267) Flink uses more memory than taskmanager.memory.process.size in Kubernetes
[ https://issues.apache.org/jira/browse/FLINK-16267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045126#comment-17045126 ] Yun Tang commented on FLINK-16267: -- [~czchen] From the taskmanager log I cannot see how much memory is reserved on the RocksDB side; it should be about 1423MB, and you can search your taskmanager log for lines like "Obtained shared RocksDB cache of size xxx bytes" to confirm. First of all, let us enable the block cache usage metrics on the RocksDB side by setting [state.backend.rocksdb.metrics.block-cache-usage|https://ci.apache.org/projects/flink/flink-docs-master/ops/config.html#state-backend-rocksdb-metrics-block-cache-usage] to true to see the actual RocksDB memory usage. You can add that to the memory metrics {{Status.JVM.Memory.Heap.Used}}, {{Status.JVM.Memory.NonHeap.Used}}, {{Status.JVM.Memory.Direct.MemoryUsed}} and {{Status.JVM.Memory.Mapped.MemoryUsed}} to see how much memory is used in total per task manager. This aims to detect whether RocksDB causes the OOMKilled events. Moreover, please also provide some information related to RocksDB: # How many states do you use per operator? # Do you use RocksDB map state and often iterate over it? # How many RocksDB instances are there per slot? This can be found by counting how many lines of "{{Obtained shared RocksDB cache of size}}" are printed within one slot. > Flink uses more memory than taskmanager.memory.process.size in Kubernetes > - > > Key: FLINK-16267 > URL: https://issues.apache.org/jira/browse/FLINK-16267 > Project: Flink > Issue Type: Bug > Components: Runtime / Task >Affects Versions: 1.10.0 >Reporter: ChangZhuo Chen (陳昌倬) >Priority: Major > Attachments: flink-conf_1.10.0.yaml, flink-conf_1.9.1.yaml, > oomkilled_taskmanager.log > > > This issue is from > [https://stackoverflow.com/questions/60336764/flink-uses-more-memory-than-taskmanager-memory-process-size-in-kubernetes] > h1. 
Description > * In Flink 1.10.0, we try to use `taskmanager.memory.process.size` to limit > the resource used by taskmanager to ensure they are not killed by Kubernetes. > However, we still get lots of taskmanager `OOMKilled`. The setup is in the > following section. > * The taskmanager log is in attachment [^oomkilled_taskmanager.log]. > h2. Kubernete > * The Kubernetes setup is the same as described in > [https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/kubernetes.html]. > * The following is resource configuration for taskmanager deployment in > Kubernetes: > {{resources:}} > {{ requests:}} > {{ cpu: 1000m}} > {{ memory: 4096Mi}} > {{ limits:}} > {{ cpu: 1000m}} > {{ memory: 4096Mi}} > h2. Flink Docker > * The Flink docker is built by the following Docker file. > {{FROM flink:1.10-scala_2.11}} > RUN mkdir -p /opt/flink/plugins/s3 && > ln -s /opt/flink/opt/flink-s3-fs-presto-1.10.0.jar /opt/flink/plugins/s3/ > {{RUN ln -s /opt/flink/opt/flink-metrics-prometheus-1.10.0.jar > /opt/flink/lib/}} > h2. Flink Configuration > * The following are all memory related configurations in `flink-conf.yaml` > in 1.10.0: > {{jobmanager.heap.size: 820m}} > {{taskmanager.memory.jvm-metaspace.size: 128m}} > {{taskmanager.memory.process.size: 4096m}} > * We use RocksDB and we don't set `state.backend.rocksdb.memory.managed` in > `flink-conf.yaml`. > ** Use S3 as checkpoint storage. > * The code uses DateStream API > ** input/output are both Kafka. > h2. Project Dependencies > * The following is our dependencies. 
> {{val flinkVersion = "1.10.0"}}{{libraryDependencies += > "com.squareup.okhttp3" % "okhttp" % "4.2.2"}} > {{libraryDependencies += "com.typesafe" % "config" % "1.4.0"}} > {{libraryDependencies += "joda-time" % "joda-time" % "2.10.5"}} > {{libraryDependencies += "org.apache.flink" %% "flink-connector-kafka" % > flinkVersion}} > {{libraryDependencies += "org.apache.flink" % "flink-metrics-dropwizard" % > flinkVersion}} > {{libraryDependencies += "org.apache.flink" %% "flink-scala" % flinkVersion > % "provided"}} > {{libraryDependencies += "org.apache.flink" %% "flink-statebackend-rocksdb" > % flinkVersion % "provided"}} > {{libraryDependencies += "org.apache.flink" %% "flink-streaming-scala" % > flinkVersion % "provided"}} > {{libraryDependencies += "org.json4s" %% "json4s-jackson" % "3.6.7"}} > {{libraryDependencies += "org.log4s" %% "log4s" % "1.8.2"}} > {{libraryDependencies += "org.rogach" %% "scallop" % "3.3.1"}} > h2. Previous Flink 1.9.1 Configuration > * The configuration we used in Flink 1.9.1 are the following. It does not > have `OOMKilled`. > h3. Kubernetes > {{resources:}} > {{ requests:}} > {{ cpu: 1200m}} > {{ memory: 2G}} > {{ limits:}} > {{ cpu: 1500m}} > {{ memory: 2G}} > h3. Flink 1.9.1 >
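The debugging setup suggested in the comment above can be sketched as a small flink-conf.yaml fragment. Option names follow the Flink 1.10 configuration reference; the values are illustrative, not a recommendation, and the Prometheus reporter line is just one possible way to export the metrics:

```yaml
# Total TM memory budget, as in the reporter's setup.
taskmanager.memory.process.size: 4096m
taskmanager.memory.jvm-metaspace.size: 128m

# Expose RocksDB block cache usage as a metric, as suggested above.
state.backend.rocksdb.metrics.block-cache-usage: true

# Illustrative: export metrics via the Prometheus reporter shipped in opt/.
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
```

With this in place, the block-cache-usage metric can be summed with the Status.JVM.Memory.* metrics to compare against the Kubernetes memory limit.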
[jira] [Comment Edited] (FLINK-14807) Add Table#collect api for fetching data to client
[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033373#comment-17033373 ] Caizhi Weng edited comment on FLINK-14807 at 2/26/20 3:55 AM: -- Hi dear Flink community, I'm digging this up because I have some thoughts on implementing this. I'll post my thoughts below and I'd love to hear your opinions. >From my understanding, this problem has two key points. One is that where to >store the (possibly) never-ending results, and the other is that task managers >can not directly communicate with clients under some environments like k8s or >yarn. For the never-ending results, back pressuring will work to limit the >size of data in the whole cluster. For the communication between task managers >and clients, job manager must be the man in the middle as clients and task >managers are guaranteed to directly communicate with job manager. So I come up with the following design. The new sink will only have 1 parallelism. The class names and API names below are just placeholders. !table-collect.png|width=600! # When the sink is initialized in task managers, it will create a socket server for providing the query results. The IP and port of the socket server will be given to the job master by the existing GlobalAggregateManager class. # When client want to fetch a portion of the query result, it will contact the JobManager. # The JobManager receives the RPC call and ask the socket server for results with a maximum size. # The socket server returns some results to the JobMaster. # JobMaster forwards the result to the client. Some Q for the design above: * TM memories are protected by backpressuring. Do we have to introduce a new config option to set the maximum memory usage of the sink? >> Yes * What if the client disconnects / does not connect? >> The job will not finish as the sink isn't finished. 
The sink blocks on the >> invoke method if its memory is full or blocks on the close method if not >> all results have been read by the client. We can also add a timeout check >> to the sink and let the sink fail if client goes away for a certain period >> of time. * How to deal with retract / upsert streams? >> The return type will be Tuple2 where the first boolean value >> indicates this row is an appending row or a retracting row. * Is the 1st step necessary? >> Yes, because the port of the socket server is unknown before created. Some problems to discuss * Is the whole design necessary? Why don't we use accumulators to store and send results? >> Accumulators cannot deal with large results. But apart from this >> accumulators seem to be OK. We can limit the maximum number of rows >> provided by Table#collect and use accumulators to send results to the >> JobMaster with each TM heartbeat. After collecting enough results the >> client can cancel the job. The biggest problem is that for streaming jobs >> we might have to wait until the next heartbeat (which is 10s by default) to >> get the results and decide whether to cancel the job. * What if the job restarts? >> This is a problem about what kind of API we want to provide. ** If we can tolerate an at least once API this does not seem to be a problem. We can attach the index and the version (increased with each job restart) of each results when sending them back to the client and let the client deal with all these version things. ** If we want an exactly once API I'm afraid this is very difficult. For batch jobs we have to use external storage and for streaming jobs we might have to force users to use checkpoints. Backpressures from sink may also affect checkpoints. was (Author: tsreaper): Hi dear Flink community, I'm digging this up because I have some thoughts on implementing this. I'll post my thoughts below and I'd love to hear your opinions. >From my understanding, this problem has two key points. 
One is that where to >store the (possibly) never-ending results, and the other is that task managers >can not directly communicate with clients under some environments like k8s or >yarn. For the never-ending results, back pressuring will work to limit the >size of data in the whole cluster. For the communication between task managers >and clients, job manager must be the man in the middle as clients and task >managers are guaranteed to directly communicate with job manager. So I come up with the following design. The new sink will only have 1 parallelism. The class names and API names below are just placeholders. !table-collect.png|width=600! # When the sink is initialized in task managers, it will create a socket server for providing the query results. The IP and port of the socket server will be given to the job master by the existing GlobalAggregateManager class. # When client want to fetch a portion of the query result, it will contact the
[jira] [Issue Comment Deleted] (FLINK-14807) Add Table#collect api for fetching data to client
[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caizhi Weng updated FLINK-14807: Comment: was deleted (was: Thanks for the comment [~godfreyhe]. I'll list more details about my design below. * How can sink tell the REST server its address and port? >> This is the hardest part of the design. I actually haven't come up with a >> very good solution. I now have three options listed below: ** Option A. [FLIP-27|https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface] will introduce Operator Coordinator which can perfectly solve this issue. But according to [~jqin] we won't have this feature until 1.11. ** Option B. We can use accumulators to tell the job manager the address and port of the socket server. But as accumulators are sent with heartbeats and the default interval between heartbeats are 10s, this will greatly impact small jobs. ** Option C. Extract TaskConfig from JobGraph in job manager and insert server information into it. This requires the socket server to start in the job manager instead of sink, and it seems to be quite hacky... * What if a job without ordering restarts? >> As streaming jobs are backed by checkpoints this is not a problem. For >> batch jobs I'm afraid we'll have to introduce a special element in the >> resulting iterator indicating that the previous results provided are now >> invalid.) > Add Table#collect api for fetching data to client > - > > Key: FLINK-14807 > URL: https://issues.apache.org/jira/browse/FLINK-14807 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.9.1 >Reporter: Jeff Zhang >Priority: Major > Labels: usability > Fix For: 1.11.0 > > Attachments: table-collect.png > > > Currently, it is very unconvinient for user to fetch data of flink job unless > specify sink expclitly and then fetch data from this sink via its api (e.g. > write to hdfs sink, then read data from hdfs). 
However, most of time user > just want to get the data and do whatever processing he want. So it is very > necessary for flink to provide api Table#collect for this purpose. > > Other apis such as Table#head, Table#print is also helpful. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-14807) Add Table#collect api for fetching data to client
[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033373#comment-17033373 ] Caizhi Weng edited comment on FLINK-14807 at 2/26/20 3:51 AM: -- Hi dear Flink community, I'm digging this up because I have some thoughts on implementing this. I'll post my thoughts below and I'd love to hear your opinions. >From my understanding, this problem has two key points. One is that where to >store the (possibly) never-ending results, and the other is that task managers >can not directly communicate with clients under some environments like k8s or >yarn. For the never-ending results, back pressuring will work to limit the >size of data in the whole cluster. For the communication between task managers >and clients, job manager must be the man in the middle as clients and task >managers are guaranteed to directly communicate with job manager. So I come up with the following design. The new sink will only have 1 parallelism. The class names and API names below are just placeholders. !table-collect.png|width=600! # When the sink is initialized in task managers, it will create a socket server for providing the query results. The IP and port of the socket server will be given to the job master by the existing GlobalAggregateManager class. # When client want to fetch a portion of the query result, it will contact the JobManager. # The JobManager receives the RPC call and ask the socket server for results with a maximum size. # The socket server returns some results to the JobMaster. # JobMaster forwards the result to the client. Some Q for the design above: * TM memories are protected by backpressuring. Do we have to introduce a new config option to set the maximum memory usage of the sink? >> Yes * What if the client disconnects / does not connect? >> The job will not finish as the sink isn't finished. 
The sink blocks on the >> invoke method if its memory is full or blocks on the close method if not >> all results have been read by the client. * How to deal with retract / upsert streams? >> The return type will be Tuple2 where the first boolean value >> indicates this row is an appending row or a retracting row. * Is the 1st step necessary? >> Yes, because the port of the socket server is unknown before created. Some problems to discuss * Is the whole design necessary? Why don't we use accumulators to store and send results? >> Accumulators cannot deal with large results. But apart from this >> accumulators seem to be OK. We can limit the maximum number of rows >> provided by Table#collect and use accumulators to send results to the >> JobMaster with each TM heartbeat. After collecting enough results the >> client can cancel the job. The biggest problem is that for streaming jobs >> we might have to wait until the next heartbeat (which is 10s by default) to >> get the results and decide whether to cancel the job. * What if the job restarts? >> This is a problem about what kind of API we want to provide. ** If we can tolerate an at least once API this does not seem to be a problem. We can attach the index and the version (increased with each job restart) of each results when sending them back to the client and let the client deal with all these version things. ** If we want an exactly once API I'm afraid this is very difficult. For batch jobs we have to use external storage and for streaming jobs we might have to force users to use checkpoints. Backpressures from sink may also affect checkpoints. was (Author: tsreaper): Hi dear Flink community, I'm digging this up because I have some thoughts on implementing this. I'll post my thoughts below and I'd love to hear your opinions. >From my understanding, this problem has two key points. 
One is that where to >store the (possibly) never-ending results, and the other is that task managers >can directly communicate with clients under some environments like k8s or >yarn. For the never-ending results, back pressuring will work to limit the >size of data in the whole cluster. For the communication between task managers >and clients, job manager must be the man in the middle as clients and task >managers are guaranteed to directly communicate with job manager. As REST API is the main communication method between clients and job managers, I'm going to introduce two new internal REST API and a new sink to do the job. The new sink will only have 1 parallelism. The class names and API names below are just placeholders. !table-collect.png|width=600! # When the sink is initialized in task managers, it will create a socket server for providing the query results. The IP and port of the socket server will be given to the REST server by a REST API call. # When client want to fetch a portion of the query result, it will provide the job id and a token to the REST server. Here
[GitHub] [flink] dianfu commented on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation
dianfu commented on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation URL: https://github.com/apache/flink/pull/11203#issuecomment-591225209 @HuangXingBo Thanks for the PR. LGTM.
[GitHub] [flink] zhuzhurk commented on issue #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
zhuzhurk commented on issue #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#issuecomment-591225130 Thanks for reporting this issue and trying to fix it @cpugputpu . I agree with @StephanEwen that we should fix the tests instead (`testSlotSharingOnAllVerticesInSameSlotSharingGroupByDefaultEnabled` and `testSlotSharingOnAllVerticesInSameSlotSharingGroupByDefaultDisabled`). We should not assume that the topological order of the sources is the same as the order in which they are added to the StreamGraph. To fix those problematic tests, one option is to find the wanted vertex via its name rather than via its index. @cpugputpu do you want to fix it?
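The name-based lookup suggested above can be sketched in isolation. This is plain Java; `Vertex` is a stand-in for Flink's JobVertex/StreamNode, not the real API:

```java
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

// Sketch of the suggested test fix: look a graph vertex up by its name
// instead of by a positional index, so the assertion no longer depends on
// the (non-deterministic) order in which sources are added to the graph.
public class FindByName {

    static final class Vertex {
        final String name;
        final int parallelism;

        Vertex(String name, int parallelism) {
            this.name = name;
            this.parallelism = parallelism;
        }
    }

    // Returns the first vertex with the given name, failing loudly if absent.
    static Vertex byName(List<Vertex> vertices, String name) {
        return vertices.stream()
                .filter(v -> v.name.equals(name))
                .findFirst()
                .orElseThrow(() -> new NoSuchElementException("no vertex named " + name));
    }

    public static void main(String[] args) {
        // The list order may vary between runs in the real graph; a lookup
        // by name yields the same vertex either way.
        List<Vertex> graph = Arrays.asList(
                new Vertex("Source: kafka", 2),
                new Vertex("map", 4));
        System.out.println(byName(graph, "map").parallelism); // prints 4
    }
}
```

In the actual tests the same idea applies: assert on the vertex found by name, not on `vertices.get(0)`.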
[GitHub] [flink] flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591220639 ## CI report: * 3b78c9b691fc40bd24f50103b704bf9e1c6396ea Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150586977)
[GitHub] [flink] flinkbot edited a comment on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation
flinkbot edited a comment on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation URL: https://github.com/apache/flink/pull/11203#issuecomment-590636739 ## CI report: * 66f60932087c6c77fff749c6670f64951db3a40f Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/150406289) * cad308b6fe40defa5b6b33a74aeb8bc6e99a6759 UNKNOWN
[jira] [Updated] (FLINK-14807) Add Table#collect api for fetching data to client
[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caizhi Weng updated FLINK-14807: Attachment: (was: table-collect.png)
[jira] [Updated] (FLINK-14807) Add Table#collect api for fetching data to client
[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caizhi Weng updated FLINK-14807: Attachment: table-collect.png
[jira] [Commented] (FLINK-14807) Add Table#collect api for fetching data to client
[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045113#comment-17045113 ] Jiangjie Qin commented on FLINK-14807: -- We should probably first agree on the right solution to this problem and then think about the path to achieve it. I believe the Operator Coordinator would be the right way to go, for the following reasons: # The Operator Coordinator was introduced to provide a bi-directional communication mechanism between JM and TM, so it is suitable in this case. And in the new Sink design, we will have something like {{SinkCoordinator}} and {{SinkWriter}}, just like what we have in the Source. # Personally I prefer to have the socket server brought up in the Sink Coordinator and then broadcast the port info to the TMs. This simplifies the architecture and can remove the parallelism restriction on the Sink. While the SinkCoordinator can solve the communication between JM and TM in both the control plane (via operator events) and the data plane (via a socket server in the coordinator), the communication problem between the clients and the JM still exists, i.e. how to send the records back from the JM to the client. It is true that we can introduce a REST API in the JM and let the clients fetch records through it, but it would be worth thinking about whether we can have a more generic solution that may potentially help in other cases as well. I think it might be useful to establish a way for the clients to get information from the Operator Coordinators. For example, the REST API in the JM could allow the client to query the information of a particular coordinator, and the coordinator could return its socket server port back to the client. It is a more flexible and more generic solution, as in the future people can have other communication with the coordinators without changing any code in the JM. I also have a question regarding failover. 
When the JM fails over, how can the clients connect to the JM after the failover?
[GitHub] [flink] flinkbot commented on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
flinkbot commented on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591220639 ## CI report: * 3b78c9b691fc40bd24f50103b704bf9e1c6396ea UNKNOWN
[GitHub] [flink] dianfu commented on issue #10995: [FLINK-15847][ml] Include flink-ml-api and flink-ml-lib in opt
dianfu commented on issue #10995: [FLINK-15847][ml] Include flink-ml-api and flink-ml-lib in opt URL: https://github.com/apache/flink/pull/10995#issuecomment-591220343 @becketqin Would you like to take a final look at this PR?
[GitHub] [flink] flinkbot commented on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
flinkbot commented on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591216599 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 3b78c9b691fc40bd24f50103b704bf9e1c6396ea (Wed Feb 26 03:14:15 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-16283) NullPointerException in GroupAggProcessFunction.close()
[ https://issues.apache.org/jira/browse/FLINK-16283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-16283: --- Labels: pull-request-available test-stability (was: test-stability) > NullPointerException in GroupAggProcessFunction.close() > --- > > Key: FLINK-16283 > URL: https://issues.apache.org/jira/browse/FLINK-16283 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.11.0 >Reporter: Robert Metzger >Assignee: Jark Wu >Priority: Major > Labels: pull-request-available, test-stability > > CI run: > https://dev.azure.com/rmetzger/Flink/_build/results?buildId=5586=logs=b1623ac9-0979-5b0d-2e5e-1377d695c991=48867695-c47f-5af3-2f21-7845611247b9 > {code} > java.lang.NullPointerException: null > at > org.apache.flink.table.runtime.aggregate.GroupAggProcessFunction.close(GroupAggProcessFunction.scala:182) > ~[flink-table_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:43) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:117) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:627) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:565) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:483) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:717) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:541) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at java.lang.Thread.run(Thread.java:748) 
[?:1.8.0_242] > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
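The stack trace above shows `close()` throwing a NullPointerException while the task is being cleaned up, which typically happens when disposal runs before `open()` ever initialized the generated function. Below is a minimal, self-contained sketch of the defensive null-guard pattern such a fix usually applies. All class and field names (`SafeCloseExample`, `GeneratedFunction`, `function`) are hypothetical placeholders for illustration, not Flink's actual code:

```java
// Hypothetical sketch: guard close() against a function field that was
// never initialized because open() did not run (e.g. the task failed
// during initialization and went straight to cleanup/dispose).
public class SafeCloseExample {

    // Stand-in for a code-generated function that is created in open().
    static class GeneratedFunction {
        boolean closed = false;

        void close() {
            closed = true;
        }
    }

    // Set in open(); may still be null when close() is invoked.
    GeneratedFunction function;

    public void open() {
        function = new GeneratedFunction();
    }

    public void close() {
        // Defensive guard: without this null check, calling close()
        // before open() would throw a NullPointerException, as in the
        // stack trace above.
        if (function != null) {
            function.close();
        }
    }

    public static void main(String[] args) {
        SafeCloseExample op = new SafeCloseExample();
        op.close(); // safe even though open() was never called
        op.open();
        op.close(); // now delegates to the generated function
        System.out.println("closed=" + op.function.closed);
    }
}
```

Running `main` exercises both paths: close-before-open (no exception thanks to the guard) and the normal open/close lifecycle.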