[jira] [Comment Edited] (FLINK-16286) Support select from nothing
[ https://issues.apache.org/jira/browse/FLINK-16286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045223#comment-17045223 ] Dawid Wysakowicz edited comment on FLINK-16286 at 2/26/20 7:57 AM: --- Yes, I have the same question. This should be supported. I checked with the SQL client on master (not the newest, though) and a query like {{select 1, 'ABC';}} just works. [~zjffdu] Did you try it and have problems with it? was (Author: dawidwys): Yes, I have the same question. This should be supported. I checked with the SQL client on master (not the newest, though) and a query like {{select 1, 'ABC';}} just works. Did you try it and have problems with it? > Support select from nothing > --- > > Key: FLINK-16286 > URL: https://issues.apache.org/jira/browse/FLINK-16286 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.10.0 >Reporter: Jeff Zhang >Priority: Major > > It would be nice to support selecting from nothing in Flink SQL, e.g. > {code:java} > select "name", 1+1, Date() {code} > This is especially useful when a user wants to try a UDF provided by others > without creating a new table of fake data. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-16272) Move Stateful Functions end-to-end test framework classes to common module
[ https://issues.apache.org/jira/browse/FLINK-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai updated FLINK-16272: Fix Version/s: statefun-1.1 > Move Stateful Functions end-to-end test framework classes to common module > -- > > Key: FLINK-16272 > URL: https://issues.apache.org/jira/browse/FLINK-16272 > Project: Flink > Issue Type: Task > Components: Stateful Functions, Test Infrastructure >Affects Versions: statefun-1.1 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai >Priority: Major > Labels: pull-request-available > Fix For: statefun-1.1 > > Time Spent: 20m > Remaining Estimate: 0h > > There are some end-to-end test utilities that other end-to-end Stateful > Function tests can benefit from, such as: > * {{StatefulFunctionsAppContainers}} > * {{KafkaIOVerifier}} > * {{KafkaProtobufSerDe}} > We should move this to a new common module under > {{statefun-end-to-end-tests}}.
[jira] [Closed] (FLINK-16272) Move Stateful Functions end-to-end test framework classes to common module
[ https://issues.apache.org/jira/browse/FLINK-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai closed FLINK-16272. --- Resolution: Fixed Merged to master via 78117b8d96c0ad9b27bf971d3a472c750d84a1d6
[jira] [Commented] (FLINK-16286) Support select from nothing
[ https://issues.apache.org/jira/browse/FLINK-16286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045223#comment-17045223 ] Dawid Wysakowicz commented on FLINK-16286: -- Yes, I have the same question. This should be supported. I checked with the SQL client on master (not the newest, though) and a query like {{select 1, 'ABC';}} just works. Did you try it and have problems with it?
[jira] [Closed] (FLINK-16244) Add Asynchronous operations to state lazily.
[ https://issues.apache.org/jira/browse/FLINK-16244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai closed FLINK-16244. --- Fix Version/s: statefun-1.1 Resolution: Fixed Merged to master via f367b40e9096e89a7afe04c93b067bb7b3f25a1e > Add Asynchronous operations to state lazily. > > > Key: FLINK-16244 > URL: https://issues.apache.org/jira/browse/FLINK-16244 > Project: Flink > Issue Type: Task > Components: Stateful Functions >Reporter: Igal Shilman >Assignee: Igal Shilman >Priority: Major > Labels: pull-request-available > Fix For: statefun-1.1 > > Time Spent: 20m > Remaining Estimate: 0h > > Currently, AsyncSink eagerly adds a registered async operation to state. > An alternative approach would be to keep the async operations in an in-memory > map and only > write them to the underlying map on snapshotState(). > The rationale behind this approach is the assumption that most async > operations complete between two consecutive checkpoints, and therefore adding > and removing them from the underlying state backend (RocksDB by default) is > wasteful. > An implementation outline suggestion: > 1. Add a LazyAsyncOperations class that keeps both an in-memory map > and a MapStateHandle; > this map would support add(), remove() and also flush(). > 2. Use that class in AsyncSink and in AsyncMessageDecorator. > 3. Call flush() from FunctionGroupOperator#snapshotState(). > Note that special care should be taken in flush(), as the current key needs > to be set > in the keyedStateBackend.
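The outline above can be sketched as follows. This is an illustrative Python re-implementation, not Stateful Functions code: the backing state is modeled as a plain dict standing in for the MapState handle, and the class/method names follow the ticket's suggestion.

```python
class LazyAsyncOperations:
    """Buffers async-operation registrations in memory and flushes
    them to the underlying (slow) state map only on checkpoints."""

    def __init__(self, backing_state):
        self._backing = backing_state   # stand-in for the RocksDB-backed map
        self._pending = {}              # adds since the last checkpoint
        self._removed = set()           # removes since the last checkpoint

    def add(self, op_id, op):
        self._removed.discard(op_id)
        self._pending[op_id] = op

    def remove(self, op_id):
        if self._pending.pop(op_id, None) is None:
            # not buffered, so it must already live in the backing state
            self._removed.add(op_id)

    def flush(self):
        """Called from snapshotState(): applies the net effect of all
        adds/removes since the last checkpoint to the backing state."""
        for op_id in self._removed:
            self._backing.pop(op_id, None)
        self._backing.update(self._pending)
        self._pending.clear()
        self._removed.clear()
```

Note how an operation that is added and completes between two checkpoints never touches the backing map at all, which is exactly the saving the ticket is after.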
[jira] [Commented] (FLINK-8093) flink job fail because of kafka producer create fail of "javax.management.InstanceAlreadyExistsException"
[ https://issues.apache.org/jira/browse/FLINK-8093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045220#comment-17045220 ] Nico Kruber commented on FLINK-8093: Sorry [~bdine] for not seeing your message. I have assigned this ticket to you in case you are still up for it; please give me a short note whether you still want to work on this or not. > flink job fail because of kafka producer create fail of > "javax.management.InstanceAlreadyExistsException" > - > > Key: FLINK-8093 > URL: https://issues.apache.org/jira/browse/FLINK-8093 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.3.2 > Environment: flink 1.3.2, kafka 0.9.1 >Reporter: dongtingting >Assignee: Bastien DINE >Priority: Critical > > One taskmanager has multiple task slots. One task failed because creating a > KafkaProducer failed; the reason is > "javax.management.InstanceAlreadyExistsException: > kafka.producer:type=producer-metrics,client-id=producer-3". The detailed trace > is: > 2017-11-04 19:41:23,281 INFO org.apache.flink.runtime.taskmanager.Task > - Source: Custom Source -> Filter -> Map -> Filter -> Sink: > dp_client_**_log (7/80) (99551f3f892232d7df5eb9060fa9940c) switched from > RUNNING to FAILED.
> org.apache.kafka.common.KafkaException: Failed to construct kafka producer > at > org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:321) > at > org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:181) > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerBase.getKafkaProducer(FlinkKafkaProducerBase.java:202) > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerBase.open(FlinkKafkaProducerBase.java:212) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:111) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:375) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:252) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.kafka.common.KafkaException: Error registering mbean > kafka.producer:type=producer-metrics,client-id=producer-3 > at > org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:159) > at > org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:77) > at > org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:288) > at org.apache.kafka.common.metrics.Metrics.addMetric(Metrics.java:255) > at org.apache.kafka.common.metrics.Metrics.addMetric(Metrics.java:239) > at > org.apache.kafka.clients.producer.internals.RecordAccumulator.registerMetrics(RecordAccumulator.java:137) > at > org.apache.kafka.clients.producer.internals.RecordAccumulator.(RecordAccumulator.java:111) > at > org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:261) > ... 
9 more > Caused by: javax.management.InstanceAlreadyExistsException: > kafka.producer:type=producer-metrics,client-id=producer-3 > at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324) > at > com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522) > at > org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:157) > ... 16 more > I suspect that tasks in different task slots of one taskmanager use different > classloaders, and the client id may be the same within one process, so > creating the KafkaProducer fails in one taskmanager. > Has anybody encountered the same problem?
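For illustration, the collision above can be mimicked with a toy registry. This is a sketch only: the registry mimics the JMX MBean server's duplicate-name behavior, and the unique-suffix helper shown is a common workaround for duplicate metric names, not something taken from this ticket.

```python
import itertools

class MBeanRegistry:
    """Toy stand-in for the JMX MBean server: registering the same
    object name twice fails, like InstanceAlreadyExistsException."""
    def __init__(self):
        self._beans = {}

    def register(self, name):
        if name in self._beans:
            raise RuntimeError("InstanceAlreadyExists: " + name)
        self._beans[name] = object()

_ids = itertools.count()

def make_client_id(prefix="producer"):
    # Hypothetical helper: append a process-unique suffix so two
    # producers in the same process never collide on the metrics name.
    return "%s-%d" % (prefix, next(_ids))

registry = MBeanRegistry()
registry.register("kafka.producer:client-id=" + make_client_id())
registry.register("kafka.producer:client-id=" + make_client_id())  # distinct id, no collision
```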
[GitHub] [flink-statefun] tzulitai closed pull request #34: [FLINK-16272] Move e2e test utilities to a statefun-e2e-tests-common module
tzulitai closed pull request #34: [FLINK-16272] Move e2e test utilities to a statefun-e2e-tests-common module URL: https://github.com/apache/flink-statefun/pull/34 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink-statefun] tzulitai closed pull request #33: [FLINK-16244] Flush async operations lazily
tzulitai closed pull request #33: [FLINK-16244] Flush async operations lazily URL: https://github.com/apache/flink-statefun/pull/33
[jira] [Assigned] (FLINK-8093) flink job fail because of kafka producer create fail of "javax.management.InstanceAlreadyExistsException"
[ https://issues.apache.org/jira/browse/FLINK-8093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nico Kruber reassigned FLINK-8093: -- Assignee: Bastien DINE
[GitHub] [flink] AHeise commented on a change in pull request #11177: [FLINK-16219][runtime] Made AsyncWaitOperator chainable to non-sources.
AHeise commented on a change in pull request #11177: [FLINK-16219][runtime] Made AsyncWaitOperator chainable to non-sources. URL: https://github.com/apache/flink/pull/11177#discussion_r384319001 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/ChainingStrategy.java ## @@ -38,16 +38,46 @@ * To optimize performance, it is generally a good practice to allow maximal * chaining and increase operator parallelism. */ - ALWAYS, + ALWAYS { + @Override + public boolean canBeChainedTo(StreamOperatorFactory headOperator) { + return true; + } + }, /** * The operator will not be chained to the preceding or succeeding operators. */ - NEVER, + NEVER { + @Override + public boolean canBeChainedTo(StreamOperatorFactory headOperator) { + return false; + } + }, /** * The operator will not be chained to the predecessor, but successors may chain to this * operator. */ - HEAD + HEAD { + @Override + public boolean canBeChainedTo(StreamOperatorFactory headOperator) { + return false; + } + }, + + /** +* Operators will be eagerly chained whenever possible, except after legacy sources. +* +* Operators that will not work properly when processInput is called from another thread must use this strategy +* instead of {@link #ALWAYS}. +*/ + HEAD_AFTER_LEGACY_SOURCE { + @Override + public boolean canBeChainedTo(StreamOperatorFactory headOperator) { + return !headOperator.isStreamSource(); + } + }; + + public abstract boolean canBeChainedTo(StreamOperatorFactory headOperator); Review comment: Is yielding the root cause of not being chainable to legacy sources? If so, then this is a valid way. I'm concerned, though, that we also disallow chaining of operators that just need access to the mailboxExecutor for submit.
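The per-constant overrides in the diff can be condensed into a small sketch. This is a Python re-implementation for illustration only; Flink's actual method takes a StreamOperatorFactory, modeled here as a boolean "is the head a legacy stream source" flag.

```python
from enum import Enum

class ChainingStrategy(Enum):
    ALWAYS = "always"
    NEVER = "never"
    HEAD = "head"
    HEAD_AFTER_LEGACY_SOURCE = "head_after_legacy_source"

    def can_be_chained_to(self, head_is_legacy_source: bool) -> bool:
        # Mirrors the per-constant overrides in the diff: ALWAYS chains
        # everywhere, NEVER and HEAD refuse to be chained to any
        # predecessor, and the new strategy refuses only legacy sources.
        if self is ChainingStrategy.ALWAYS:
            return True
        if self is ChainingStrategy.HEAD_AFTER_LEGACY_SOURCE:
            return not head_is_legacy_source
        return False
```

This makes the new semantics easy to see: HEAD_AFTER_LEGACY_SOURCE behaves like ALWAYS everywhere except directly after a legacy source, where it behaves like HEAD.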
[jira] [Commented] (FLINK-16282) Wrong exception using DESCRIBE SQL command
[ https://issues.apache.org/jira/browse/FLINK-16282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045216#comment-17045216 ] Nico Kruber commented on FLINK-16282: - Yes, this ticket is about the error message from trying to use {{DESCRIBE}} > Wrong exception using DESCRIBE SQL command > -- > > Key: FLINK-16282 > URL: https://issues.apache.org/jira/browse/FLINK-16282 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.10.0 >Reporter: Nico Kruber >Priority: Major > > When trying to describe a table like this > {code:java} > Table facttable = tEnv.sqlQuery("DESCRIBE fact_table"); > {code} > currently, you get a strange exception which should rather be a "not > supported" exception > {code} > Exception in thread "main" org.apache.flink.table.api.ValidationException: > SQL validation failed. From line 1, column 10 to line 1, column 19: Column > 'fact_table' not found in any table > at > org.apache.flink.table.calcite.FlinkPlannerImpl.validateInternal(FlinkPlannerImpl.scala:130) > at > org.apache.flink.table.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:105) > at > org.apache.flink.table.sqlexec.SqlToOperationConverter.convert(SqlToOperationConverter.java:124) > at org.apache.flink.table.planner.ParserImpl.parse(ParserImpl.java:66) > at > org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:464) > at com.ververica.LateralTableJoin.main(LateralTableJoin.java:92) > Caused by: org.apache.calcite.runtime.CalciteContextException: From line 1, > column 10 to line 1, column 19: Column 'fact_table' not found in any table > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > 
java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490) > at > org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463) > at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:834) > at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:819) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:4841) > at > org.apache.calcite.sql.validate.DelegatingScope.fullyQualify(DelegatingScope.java:259) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateIdentifier(SqlValidatorImpl.java:2943) > at > org.apache.calcite.sql.SqlIdentifier.validateExpr(SqlIdentifier.java:297) > at org.apache.calcite.sql.SqlOperator.validateCall(SqlOperator.java:407) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateCall(SqlValidatorImpl.java:5304) > at org.apache.calcite.sql.SqlCall.validate(SqlCall.java:116) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:943) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:650) > at > org.apache.flink.table.calcite.FlinkPlannerImpl.validateInternal(FlinkPlannerImpl.scala:126) > ... 5 more > Caused by: org.apache.calcite.sql.validate.SqlValidatorException: Column > 'fact_table' not found in any table > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490) > at > org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463) > at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:572) > ... 
17 more > {code}
[GitHub] [flink] flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591220639 ## CI report: * 28cbb454934766dd36723d24c2e490282778d83d Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/15051) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5599) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem
flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem URL: https://github.com/apache/flink/pull/10890#issuecomment-575711704 ## CI report: * 1818f86b4da88d37a6d1f0c4a7b4b1f4fd41f961 Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/150593392) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] AHeise commented on a change in pull request #11177: [FLINK-16219][runtime] Made AsyncWaitOperator chainable to non-sources.
AHeise commented on a change in pull request #11177: [FLINK-16219][runtime] Made AsyncWaitOperator chainable to non-sources. URL: https://github.com/apache/flink/pull/11177#discussion_r384320045 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGenerator.java ## @@ -602,9 +602,8 @@ public static boolean isChainable(StreamEdge edge, StreamGraph streamGraph) { && outOperator != null && headOperator != null && upStreamVertex.isSameSlotSharingGroup(downStreamVertex) - && outOperator.getChainingStrategy() == ChainingStrategy.ALWAYS - && (headOperator.getChainingStrategy() == ChainingStrategy.HEAD || - headOperator.getChainingStrategy() == ChainingStrategy.ALWAYS) + && outOperator.getChainingStrategy().canBeChainedTo(headOperator) + && headOperator.getChainingStrategy() != ChainingStrategy.NEVER Review comment: A different question, imho. First: does the upstream support chaining at all? Second: is this specific downstream chainable to the upstream?
[GitHub] [flink] AHeise commented on a change in pull request #11177: [FLINK-16219][runtime] Made AsyncWaitOperator chainable to non-sources.
AHeise commented on a change in pull request #11177: [FLINK-16219][runtime] Made AsyncWaitOperator chainable to non-sources. URL: https://github.com/apache/flink/pull/11177#discussion_r384319679 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/ChainingStrategy.java ## @@ -38,16 +38,46 @@ * To optimize performance, it is generally a good practice to allow maximal * chaining and increase operator parallelism. */ - ALWAYS, + ALWAYS { + @Override + public boolean canBeChainedTo(StreamOperatorFactory headOperator) { Review comment: Renamed vars in a separate hotfix. For the remaining naming, I think it's the other way around: if I construct head to tail, I'd ask, can this node be chained to the head?
[GitHub] [flink] flinkbot edited a comment on issue #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode
flinkbot edited a comment on issue #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode URL: https://github.com/apache/flink/pull/7368#issuecomment-567435407 ## CI report: * a69f034cd1be52b9d8e2553fef1263f22dc4fb66 Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/150404032) * 1d751512b76dfaa57ec1f8f0519c36dd60e47cff UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink-statefun] tzulitai commented on issue #34: [FLINK-16272] Move e2e test utilities to a statefun-e2e-tests-common module
tzulitai commented on issue #34: [FLINK-16272] Move e2e test utilities to a statefun-e2e-tests-common module URL: https://github.com/apache/flink-statefun/pull/34#issuecomment-591285761 Thanks! Merging ...
[jira] [Commented] (FLINK-16144) Add client.timeout setting and use that for CLI operations
[ https://issues.apache.org/jira/browse/FLINK-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045214#comment-17045214 ] Khaireddine Rezgui commented on FLINK-16144: Hi [~sewen], can you assign this task to me :) ? Thanks. > Add client.timeout setting and use that for CLI operations > -- > > Key: FLINK-16144 > URL: https://issues.apache.org/jira/browse/FLINK-16144 > Project: Flink > Issue Type: Improvement > Components: Command Line Client >Reporter: Aljoscha Krettek >Priority: Major > Fix For: 1.11.0 > > > Currently, the CLI uses the {{akka.client.timeout}} setting. This has > historical reasons but can be very confusing for users. We should introduce a > new setting {{client.timeout}} that is used for the client, with a fallback > to the previous {{akka.client.timeout}}.
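The fallback described in the ticket can be sketched as follows. The two key names come from the ticket; the lookup helper and the default value "60 s" are illustrative assumptions, not Flink's actual ConfigOption API.

```python
def client_timeout(conf: dict, default: str = "60 s") -> str:
    """Resolve the client timeout: prefer the new 'client.timeout'
    key, fall back to the deprecated 'akka.client.timeout'."""
    if "client.timeout" in conf:
        return conf["client.timeout"]
    return conf.get("akka.client.timeout", default)
```

With this resolution order, existing setups that only configure {{akka.client.timeout}} keep working, while the new key wins whenever both are present.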
[GitHub] [flink] AHeise commented on a change in pull request #11177: [FLINK-16219][runtime] Made AsyncWaitOperator chainable to non-sources.
AHeise commented on a change in pull request #11177: [FLINK-16219][runtime] Made AsyncWaitOperator chainable to non-sources. URL: https://github.com/apache/flink/pull/11177#discussion_r384318641 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/ChainingStrategy.java ## + public abstract boolean canBeChainedTo(StreamOperatorFactory headOperator); Review comment: It was not. Good catch.
[jira] [Commented] (FLINK-16267) Flink uses more memory than taskmanager.memory.process.size in Kubernetes
[ https://issues.apache.org/jira/browse/FLINK-16267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045213#comment-17045213 ] ChangZhuo Chen (陳昌倬) commented on FLINK-16267: -- Hi [~liyu], * We have set `state.backend.rocksdb.memory.managed: false` to see if the problem remains. * For the memory setting, we set up a dashboard to monitor JVM heap usage (Prometheus metric name: flink_taskmanager_Status_JVM_Memory_Heap_Used), and we found that the memory usage is low compared to the k8s resources configuration, so: ** In 1.9.1, we increased `taskmanager.heap.size` from `1024m` (default from template) to `2000m` to see if throughput could be improved. ** In 1.10.0, we increased `taskmanager.memory.process.size` to match the k8s resource configuration (4G), and started to lower the value due to `OOMKilled`. Hi [~yunta], * We have set `state.backend.rocksdb.metrics.block-cache-usage: true` and will scrape the statistics afterwards. * We have 1 BroadcastProcessFunction, and 1 KeyedProcessFunction with the following states: ** BroadcastProcessFunction *** 1 ListState ** KeyedProcessFunction *** 1 ValueState *** 3 MapState * For 1 MapState in the KeyedProcessFunction, we iterate over it often as part of our business logic. * As for `How many RocksDB instances per slot?`, we will update when we get the answer from the logs. > Flink uses more memory than taskmanager.memory.process.size in Kubernetes > - > > Key: FLINK-16267 > URL: https://issues.apache.org/jira/browse/FLINK-16267 > Project: Flink > Issue Type: Bug > Components: Runtime / Task >Affects Versions: 1.10.0 >Reporter: ChangZhuo Chen (陳昌倬) >Priority: Major > Attachments: flink-conf_1.10.0.yaml, flink-conf_1.9.1.yaml, > oomkilled_taskmanager.log > > > This issue is from > [https://stackoverflow.com/questions/60336764/flink-uses-more-memory-than-taskmanager-memory-process-size-in-kubernetes] > h1. 
Description > * In Flink 1.10.0, we try to use `taskmanager.memory.process.size` to limit > the resources used by the taskmanagers to ensure they are not killed by Kubernetes. > However, we still get lots of taskmanagers `OOMKilled`. The setup is in the > following section. > * The taskmanager log is in attachment [^oomkilled_taskmanager.log]. > h2. Kubernetes > * The Kubernetes setup is the same as described in > [https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/kubernetes.html]. > * The following is the resource configuration for the taskmanager deployment in > Kubernetes: > {{resources:}} > {{ requests:}} > {{ cpu: 1000m}} > {{ memory: 4096Mi}} > {{ limits:}} > {{ cpu: 1000m}} > {{ memory: 4096Mi}} > h2. Flink Docker > * The Flink Docker image is built from the following Dockerfile. > {{FROM flink:1.10-scala_2.11}} > RUN mkdir -p /opt/flink/plugins/s3 && > ln -s /opt/flink/opt/flink-s3-fs-presto-1.10.0.jar /opt/flink/plugins/s3/ > {{RUN ln -s /opt/flink/opt/flink-metrics-prometheus-1.10.0.jar > /opt/flink/lib/}} > h2. Flink Configuration > * The following are all memory-related configurations in `flink-conf.yaml` > in 1.10.0: > {{jobmanager.heap.size: 820m}} > {{taskmanager.memory.jvm-metaspace.size: 128m}} > {{taskmanager.memory.process.size: 4096m}} > * We use RocksDB and we don't set `state.backend.rocksdb.memory.managed` in > `flink-conf.yaml`. > ** Use S3 as checkpoint storage. > * The code uses the DataStream API > ** input/output are both Kafka. > h2. Project Dependencies > * The following are our dependencies. 
> {{val flinkVersion = "1.10.0"}}{{libraryDependencies += > "com.squareup.okhttp3" % "okhttp" % "4.2.2"}} > {{libraryDependencies += "com.typesafe" % "config" % "1.4.0"}} > {{libraryDependencies += "joda-time" % "joda-time" % "2.10.5"}} > {{libraryDependencies += "org.apache.flink" %% "flink-connector-kafka" % > flinkVersion}} > {{libraryDependencies += "org.apache.flink" % "flink-metrics-dropwizard" % > flinkVersion}} > {{libraryDependencies += "org.apache.flink" %% "flink-scala" % flinkVersion > % "provided"}} > {{libraryDependencies += "org.apache.flink" %% "flink-statebackend-rocksdb" > % flinkVersion % "provided"}} > {{libraryDependencies += "org.apache.flink" %% "flink-streaming-scala" % > flinkVersion % "provided"}} > {{libraryDependencies += "org.json4s" %% "json4s-jackson" % "3.6.7"}} > {{libraryDependencies += "org.log4s" %% "log4s" % "1.8.2"}} > {{libraryDependencies += "org.rogach" %% "scallop" % "3.3.1"}} > h2. Previous Flink 1.9.1 Configuration > * The configuration we used in Flink 1.9.1 is the following. It did not > lead to `OOMKilled`. > h3. Kubernetes > {{resources:}} > {{ requests:}} > {{ cpu: 1200m}} > {{ memory: 2G}} > {{ limits:}} > {{ cpu: 1500m}} > {{ memory: 2G}} > h3. Flink 1.9.1 > {{jobmanager.heap.size:
[GitHub] [flink] flinkbot edited a comment on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message
flinkbot edited a comment on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message URL: https://github.com/apache/flink/pull/11219#issuecomment-591257565 ## CI report: * 75bddc5732f99c88a183b683660a85202ac8c5cb Travis: [SUCCESS](https://travis-ci.com/flink-ci/flink/builds/150597282) Azure: [SUCCESS](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5598) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[jira] [Commented] (FLINK-16286) Support select from nothing
[ https://issues.apache.org/jira/browse/FLINK-16286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045203#comment-17045203 ] Kurt Young commented on FLINK-16286: Isn't this already supported? > Support select from nothing > --- > > Key: FLINK-16286 > URL: https://issues.apache.org/jira/browse/FLINK-16286 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.10.0 >Reporter: Jeff Zhang >Priority: Major > > It would be nice to support select from nothing in Flink SQL, e.g. > {code:java} > select "name", 1+1, Date() {code} > This is especially useful when a user wants to try a UDF provided by others > without creating a new table for fake data.
[GitHub] [flink] zhijiangW commented on issue #11155: [FLINK-14818][benchmark] Fix receiving InputGate setup of StreamNetworkBenchmarkEnvironment.
zhijiangW commented on issue #11155: [FLINK-14818][benchmark] Fix receiving InputGate setup of StreamNetworkBenchmarkEnvironment. URL: https://github.com/apache/flink/pull/11155#issuecomment-591277621 @wsry I also updated the benchmark results with the same connection amount in [link](https://github.com/apache/flink/pull/11155#issuecomment-590753152). The conclusion is that all the network throughput cases show a regression except for 100,100ms. I am not sure whether we can explain this corner case properly, but I guess it should not block merging since this is the right way to go.
[GitHub] [flink] flinkbot edited a comment on issue #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
flinkbot edited a comment on issue #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#issuecomment-589965444 ## CI report: * 0cc972cf542776a79767c119b111b3fa2531b9b9 Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/150601347)
[GitHub] [flink] flinkbot edited a comment on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message
flinkbot edited a comment on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message URL: https://github.com/apache/flink/pull/11219#issuecomment-591257565 ## CI report: * 75bddc5732f99c88a183b683660a85202ac8c5cb Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150597282) Azure: [SUCCESS](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5598)
[GitHub] [flink] flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem
flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem URL: https://github.com/apache/flink/pull/10890#issuecomment-575711704 ## CI report: * 1818f86b4da88d37a6d1f0c4a7b4b1f4fd41f961 Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150593392)
[GitHub] [flink-statefun] igalshilman commented on issue #34: [FLINK-16272] Move e2e test utilities to a statefun-e2e-tests-common module
igalshilman commented on issue #34: [FLINK-16272] Move e2e test utilities to a statefun-e2e-tests-common module URL: https://github.com/apache/flink-statefun/pull/34#issuecomment-591276273 Thanks, LGTM!
[GitHub] [flink] PengTaoWW commented on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem
PengTaoWW commented on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem URL: https://github.com/apache/flink/pull/10890#issuecomment-591276048 @flinkbot run travis
[GitHub] [flink] zhijiangW edited a comment on issue #11155: [FLINK-14818][benchmark] Fix receiving InputGate setup of StreamNetworkBenchmarkEnvironment.
zhijiangW edited a comment on issue #11155: [FLINK-14818][benchmark] Fix receiving InputGate setup of StreamNetworkBenchmarkEnvironment. URL: https://github.com/apache/flink/pull/11155#issuecomment-590753152 Also attaching the comparison results executed on the benchmark machine:

Baseline

| Benchmark | Score | Parameters |
| -- | -- | -- |
| DataSkewStreamNetworkThroughputBenchmarkExecutor | 51977.13475 | |
| StreamNetworkBroadcastThroughputBenchmarkExecutor | 1164.463375 | |
| StreamNetworkThroughputBenchmarkExecutor | 69553.22069 | 100,100ms |
| StreamNetworkThroughputBenchmarkExecutor | 17430.11747 | 100,100ms,SSL |
| StreamNetworkThroughputBenchmarkExecutor | 51019.51607 | 1000,1ms |
| StreamNetworkThroughputBenchmarkExecutor | 55423.23278 | 1000,100ms |
| StreamNetworkThroughputBenchmarkExecutor | 14924.65899 | 1000,100ms,SSL |

Less gates with less connections

| Benchmark | Score | Parameters |
| -- | -- | -- |
| DataSkewStreamNetworkThroughputBenchmarkExecutor | 33215.6767 | |
| StreamNetworkBroadcastThroughputBenchmarkExecutor | 1189.239694 | |
| StreamNetworkThroughputBenchmarkExecutor | 72450.69576 | 100,100ms |
| StreamNetworkThroughputBenchmarkExecutor | 17018.46785 | 100,100ms,SSL |
| StreamNetworkThroughputBenchmarkExecutor | 45829.91493 | 1000,1ms |
| StreamNetworkThroughputBenchmarkExecutor | 52943.20546 | 1000,100ms |
| StreamNetworkThroughputBenchmarkExecutor | 13762.96631 | 1000,100ms,SSL |

Less gates with same connections

| Benchmark | Score | Parameters |
| -- | -- | -- |
| DataSkewStreamNetworkThroughputBenchmarkExecutor | 32342.12887 | |
| StreamNetworkBroadcastThroughputBenchmarkExecutor | 1136.44912 | |
| StreamNetworkThroughputBenchmarkExecutor | 75151.951622 | 100,100ms |
| StreamNetworkThroughputBenchmarkExecutor | 16878.544466 | 100,100ms,SSL |
| StreamNetworkThroughputBenchmarkExecutor | 47041.479598 | 1000,1ms |
| StreamNetworkThroughputBenchmarkExecutor | 52384.517795 | 1000,100ms |
| StreamNetworkThroughputBenchmarkExecutor | 13824.809088 | 1000,100ms,SSL |
[GitHub] [flink] zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#discussion_r384304115 ## File path: flink-streaming-java/src/test/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGeneratorTest.java ## @@ -785,10 +785,19 @@ public void testSlotSharingOnAllVerticesInSameSlotSharingGroupByDefaultEnabled() final List verticesSorted = jobGraph.getVerticesSortedTopologicallyFromSources(); assertEquals(4, verticesSorted.size()); - final JobVertex source1Vertex = verticesSorted.get(0); - final JobVertex source2Vertex = verticesSorted.get(1); - final JobVertex map1Vertex = verticesSorted.get(2); - final JobVertex map2Vertex = verticesSorted.get(3); + JobVertex source1Vertex = null, source2Vertex = null, map1Vertex = null, map2Vertex= null; + for (int i = 0; i < 4; i++) + { Review comment: The left brace should be in the same line with the for statement. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#discussion_r384305119 ## File path: flink-streaming-java/src/test/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGeneratorTest.java ## @@ -785,10 +785,19 @@ public void testSlotSharingOnAllVerticesInSameSlotSharingGroupByDefaultEnabled() final List verticesSorted = jobGraph.getVerticesSortedTopologicallyFromSources(); assertEquals(4, verticesSorted.size()); - final JobVertex source1Vertex = verticesSorted.get(0); - final JobVertex source2Vertex = verticesSorted.get(1); - final JobVertex map1Vertex = verticesSorted.get(2); - final JobVertex map2Vertex = verticesSorted.get(3); + JobVertex source1Vertex = null, source2Vertex = null, map1Vertex = null, map2Vertex= null; + for (int i = 0; i < 4; i++) + { + JobVertex vertex = verticesSorted.get(i); + if (vertex.getName().equals("Source: source1")) Review comment: I'd prefer to change the matching to `if (vertex.getName().contains("source1"))`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#discussion_r384304446 ## File path: flink-streaming-java/src/test/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGeneratorTest.java ## @@ -785,10 +785,19 @@ public void testSlotSharingOnAllVerticesInSameSlotSharingGroupByDefaultEnabled() final List verticesSorted = jobGraph.getVerticesSortedTopologicallyFromSources(); assertEquals(4, verticesSorted.size()); - final JobVertex source1Vertex = verticesSorted.get(0); - final JobVertex source2Vertex = verticesSorted.get(1); - final JobVertex map1Vertex = verticesSorted.get(2); - final JobVertex map2Vertex = verticesSorted.get(3); + JobVertex source1Vertex = null, source2Vertex = null, map1Vertex = null, map2Vertex= null; + for (int i = 0; i < 4; i++) + { + JobVertex vertex = verticesSorted.get(i); + if (vertex.getName().equals("Source: source1")) Review comment: if body should always be surrounded by braces. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#discussion_r384303864 ## File path: flink-streaming-java/src/test/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGeneratorTest.java ## @@ -785,10 +785,19 @@ public void testSlotSharingOnAllVerticesInSameSlotSharingGroupByDefaultEnabled() final List verticesSorted = jobGraph.getVerticesSortedTopologicallyFromSources(); assertEquals(4, verticesSorted.size()); - final JobVertex source1Vertex = verticesSorted.get(0); - final JobVertex source2Vertex = verticesSorted.get(1); - final JobVertex map1Vertex = verticesSorted.get(2); - final JobVertex map2Vertex = verticesSorted.get(3); + JobVertex source1Vertex = null, source2Vertex = null, map1Vertex = null, map2Vertex= null; Review comment: we should not declare multiple variables in one line. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
zhuzhurk commented on a change in pull request #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#discussion_r384305514 ## File path: flink-streaming-java/src/test/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGeneratorTest.java ## @@ -805,10 +814,19 @@ public void testSlotSharingOnAllVerticesInSameSlotSharingGroupByDefaultDisabled( final List verticesSorted = jobGraph.getVerticesSortedTopologicallyFromSources(); assertEquals(4, verticesSorted.size()); - final JobVertex source1Vertex = verticesSorted.get(0); - final JobVertex source2Vertex = verticesSorted.get(1); - final JobVertex map1Vertex = verticesSorted.get(2); - final JobVertex map2Vertex = verticesSorted.get(3); + JobVertex source1Vertex = null, source2Vertex = null, map1Vertex = null, map2Vertex= null; + for (int i = 0; i < 4; i++) Review comment: we could extract this matching block to a common method to avoid duplication. It can return a list of JobVertices in the expected order. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
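A rough sketch of the extraction the reviewer suggests: a helper that returns the vertices in the order of the given name fragments, so the assertions no longer depend on topological sort order. `JobVertex` below is a minimal stand-in for Flink's class, and the helper name is illustrative, not from the actual PR:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for org.apache.flink.runtime.jobgraph.JobVertex (illustration only).
class JobVertex {
    private final String name;

    JobVertex(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

final class VertexLookup {

    // For each name fragment, find the first vertex whose name contains it,
    // returning the matches in the order the fragments were given.
    static List<JobVertex> verticesInExpectedOrder(List<JobVertex> vertices, String... nameFragments) {
        List<JobVertex> ordered = new ArrayList<>();
        for (String fragment : nameFragments) {
            for (JobVertex vertex : vertices) {
                if (vertex.getName().contains(fragment)) {
                    ordered.add(vertex);
                    break;
                }
            }
        }
        return ordered;
    }
}
```

Both tests could then call `verticesInExpectedOrder(verticesSorted, "source1", "source2", "map1", "map2")` and index into the result, regardless of how the topological sort orders the sources.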
[GitHub] [flink] HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of
HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of Python TableFunction could work URL: https://github.com/apache/flink/pull/11130#discussion_r384298074 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/nodes/common/CommonPythonCorrelate.scala ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.planner.plan.nodes.common + +import org.apache.calcite.rel.`type`.RelDataType +import org.apache.calcite.rel.core.JoinRelType +import org.apache.calcite.rex.{RexCall, RexInputRef, RexNode} +import org.apache.flink.api.dag.Transformation +import org.apache.flink.configuration.Configuration +import org.apache.flink.streaming.api.operators.OneInputStreamOperator +import org.apache.flink.streaming.api.transformations.OneInputTransformation +import org.apache.flink.table.dataformat.BaseRow +import org.apache.flink.table.functions.python.PythonFunctionInfo +import org.apache.flink.table.planner.calcite.FlinkTypeFactory +import org.apache.flink.table.planner.plan.nodes.common.CommonPythonCorrelate.PYTHON_TABLE_FUNCTION_OPERATOR_NAME +import org.apache.flink.table.planner.plan.nodes.logical.FlinkLogicalTableFunctionScan +import org.apache.flink.table.runtime.typeutils.BaseRowTypeInfo +import org.apache.flink.table.types.logical.RowType + +import scala.collection.mutable + +trait CommonPythonCorrelate extends CommonPythonBase { + def getPythonTableFunctionOperator( Review comment: protected? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
flinkbot edited a comment on issue #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#issuecomment-589965444 ## CI report: * 7f887f58c7bd62dbc531b5e554555d1f409b3981 Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150152750) * 0cc972cf542776a79767c119b111b3fa2531b9b9 UNKNOWN
[GitHub] [flink] HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of
HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of Python TableFunction could work URL: https://github.com/apache/flink/pull/11130#discussion_r384298384 ## File path: flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/plan/nodes/CommonPythonCorrelate.scala ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.table.plan.nodes + +import org.apache.calcite.rel.core.JoinRelType +import org.apache.calcite.rex.{RexCall, RexInputRef, RexNode} +import org.apache.flink.configuration.Configuration +import org.apache.flink.streaming.api.operators.OneInputStreamOperator +import org.apache.flink.table.functions.python.PythonFunctionInfo +import org.apache.flink.table.plan.nodes.CommonPythonCorrelate.PYTHON_TABLE_FUNCTION_OPERATOR_NAME +import org.apache.flink.table.runtime.types.CRow +import org.apache.flink.table.types.logical.RowType + +import scala.collection.mutable + +trait CommonPythonCorrelate extends CommonPythonBase { + def getPythonTableFunctionOperator( Review comment: private? 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of
HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of Python TableFunction could work URL: https://github.com/apache/flink/pull/11130#discussion_r384298406 ## File path: flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/plan/nodes/CommonPythonCorrelate.scala ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.flink.table.plan.nodes + +import org.apache.calcite.rel.core.JoinRelType +import org.apache.calcite.rex.{RexCall, RexInputRef, RexNode} +import org.apache.flink.configuration.Configuration +import org.apache.flink.streaming.api.operators.OneInputStreamOperator +import org.apache.flink.table.functions.python.PythonFunctionInfo +import org.apache.flink.table.plan.nodes.CommonPythonCorrelate.PYTHON_TABLE_FUNCTION_OPERATOR_NAME +import org.apache.flink.table.runtime.types.CRow +import org.apache.flink.table.types.logical.RowType + +import scala.collection.mutable + +trait CommonPythonCorrelate extends CommonPythonBase { + def getPythonTableFunctionOperator( + config: Configuration, + inputRowType: RowType, + outputRowType: RowType, + pythonFunctionInfo: PythonFunctionInfo, + udtfInputOffsets: Array[Int], + joinType: JoinRelType): OneInputStreamOperator[CRow, CRow] = { +val clazz = loadClass(PYTHON_TABLE_FUNCTION_OPERATOR_NAME) +val ctor = clazz.getConstructor( + classOf[Configuration], + classOf[PythonFunctionInfo], + classOf[RowType], + classOf[RowType], + classOf[Array[Int]], + classOf[JoinRelType]) +ctor.newInstance( + config, + pythonFunctionInfo, + inputRowType, + outputRowType, + udtfInputOffsets, + joinType) + .asInstanceOf[OneInputStreamOperator[CRow, CRow]] + } + + private[flink] def extractPythonTableFunctionInfo( Review comment: private? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591220639 ## CI report: * 3b78c9b691fc40bd24f50103b704bf9e1c6396ea Travis: [SUCCESS](https://travis-ci.com/flink-ci/flink/builds/150586977) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5594) * 28cbb454934766dd36723d24c2e490282778d83d Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/15051) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of
HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of Python TableFunction could work URL: https://github.com/apache/flink/pull/11130#discussion_r384298213 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/nodes/common/CommonPythonCorrelate.scala ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.planner.plan.nodes.common + +import org.apache.calcite.rel.`type`.RelDataType +import org.apache.calcite.rel.core.JoinRelType +import org.apache.calcite.rex.{RexCall, RexInputRef, RexNode} +import org.apache.flink.api.dag.Transformation +import org.apache.flink.configuration.Configuration +import org.apache.flink.streaming.api.operators.OneInputStreamOperator +import org.apache.flink.streaming.api.transformations.OneInputTransformation +import org.apache.flink.table.dataformat.BaseRow +import org.apache.flink.table.functions.python.PythonFunctionInfo +import org.apache.flink.table.planner.calcite.FlinkTypeFactory +import org.apache.flink.table.planner.plan.nodes.common.CommonPythonCorrelate.PYTHON_TABLE_FUNCTION_OPERATOR_NAME +import org.apache.flink.table.planner.plan.nodes.logical.FlinkLogicalTableFunctionScan +import org.apache.flink.table.runtime.typeutils.BaseRowTypeInfo +import org.apache.flink.table.types.logical.RowType + +import scala.collection.mutable + +trait CommonPythonCorrelate extends CommonPythonBase { + def getPythonTableFunctionOperator( + config: Configuration, + inputRowType: BaseRowTypeInfo, + outputRowType: BaseRowTypeInfo, + pythonFunctionInfo: PythonFunctionInfo, + udtfInputOffsets: Array[Int], + joinType: JoinRelType): OneInputStreamOperator[BaseRow, BaseRow] = { +val clazz = loadClass(PYTHON_TABLE_FUNCTION_OPERATOR_NAME) +val ctor = clazz.getConstructor( + classOf[Configuration], + classOf[PythonFunctionInfo], + classOf[RowType], + classOf[RowType], + classOf[Array[Int]], + classOf[JoinRelType]) +ctor.newInstance( + config, + pythonFunctionInfo, + inputRowType.toRowType, + outputRowType.toRowType, + udtfInputOffsets, + joinType) + .asInstanceOf[OneInputStreamOperator[BaseRow, BaseRow]] + } + + private[flink] def extractPythonTableFunctionInfo( Review comment: protected? This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-16286) Support select from nothing
[ https://issues.apache.org/jira/browse/FLINK-16286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated FLINK-16286: --- Component/s: Table SQL / API > Support select from nothing > --- > > Key: FLINK-16286 > URL: https://issues.apache.org/jira/browse/FLINK-16286 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.10.0 >Reporter: Jeff Zhang >Priority: Major > > It would be nice to support select from nothing in Flink SQL, e.g. > {code:java} > select "name", 1+1, Date() {code} > This is especially useful when users want to try a UDF provided by others > without creating a new table for fake data. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-16286) Support select from nothing
Jeff Zhang created FLINK-16286: -- Summary: Support select from nothing Key: FLINK-16286 URL: https://issues.apache.org/jira/browse/FLINK-16286 Project: Flink Issue Type: New Feature Affects Versions: 1.10.0 Reporter: Jeff Zhang It would be nice to support select from nothing in Flink SQL, e.g. {code:java} select "name", 1+1, Date() {code} This is especially useful when users want to try a UDF provided by others without creating a new table for fake data. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-16251) Optimize the cost of function call in ScalarFunctionOpertation
[ https://issues.apache.org/jira/browse/FLINK-16251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu closed FLINK-16251. --- Resolution: Fixed Merged to master via 1ada2997254d08b0baa15484d68b93250098289c > Optimize the cost of function call in ScalarFunctionOpertation > --- > > Key: FLINK-16251 > URL: https://issues.apache.org/jira/browse/FLINK-16251 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: Huang Xingbo >Assignee: Huang Xingbo >Priority: Major > Labels: pull-request-available > Fix For: 1.11.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Currently, there is too much extra function call cost in > ScalarFunctionOpertation. We need to optimize it to improve the performance of > Python UDFs. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] dianfu merged pull request #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation
dianfu merged pull request #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation URL: https://github.com/apache/flink/pull/11203 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591220639 ## CI report: * 3b78c9b691fc40bd24f50103b704bf9e1c6396ea Travis: [SUCCESS](https://travis-ci.com/flink-ci/flink/builds/150586977) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5594) * 28cbb454934766dd36723d24c2e490282778d83d UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] cpugputpu commented on issue #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
cpugputpu commented on issue #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#issuecomment-591267279 Thanks for your valuable comments! I am sorry that my previous fix was not a good one. I have now fixed it without changing StreamGraph.java. @flinkbot run travis This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
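For context on the fix above: `LinkedHashSet` iterates in insertion order, whereas `HashSet` iteration order depends on element hash codes and table capacity, so it can differ across JVM versions and runs. A minimal sketch (the element names are illustrative, not taken from the PR):

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class DeterministicOrder {
    public static void main(String[] args) {
        // LinkedHashSet remembers insertion order, so iteration (and any
        // output derived from it) is reproducible.
        Set<String> edges = new LinkedHashSet<>();
        edges.add("sink-2");
        edges.add("sink-0");
        edges.add("sink-1");
        // With a plain HashSet, this join could come out in any order.
        System.out.println(String.join(",", edges)); // prints sink-2,sink-0,sink-1
    }
}
```

This is why swapping `HashSet` for `LinkedHashSet` is the usual minimal fix for tests or plans that implicitly depend on iteration order.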
[GitHub] [flink] HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of
HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of Python TableFunction could work URL: https://github.com/apache/flink/pull/11130#discussion_r384298384 ## File path: flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/plan/nodes/CommonPythonCorrelate.scala ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.table.plan.nodes + +import org.apache.calcite.rel.core.JoinRelType +import org.apache.calcite.rex.{RexCall, RexInputRef, RexNode} +import org.apache.flink.configuration.Configuration +import org.apache.flink.streaming.api.operators.OneInputStreamOperator +import org.apache.flink.table.functions.python.PythonFunctionInfo +import org.apache.flink.table.plan.nodes.CommonPythonCorrelate.PYTHON_TABLE_FUNCTION_OPERATOR_NAME +import org.apache.flink.table.runtime.types.CRow +import org.apache.flink.table.types.logical.RowType + +import scala.collection.mutable + +trait CommonPythonCorrelate extends CommonPythonBase { + def getPythonTableFunctionOperator( Review comment: private? 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of
HuangXingBo commented on a change in pull request #11130: [FLINK-15972][python][table-planner][table-planner-blink] Add Python building blocks to make sure the basic functionality of Python TableFunction could work URL: https://github.com/apache/flink/pull/11130#discussion_r384298074 ## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/plan/nodes/common/CommonPythonCorrelate.scala ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.planner.plan.nodes.common + +import org.apache.calcite.rel.`type`.RelDataType +import org.apache.calcite.rel.core.JoinRelType +import org.apache.calcite.rex.{RexCall, RexInputRef, RexNode} +import org.apache.flink.api.dag.Transformation +import org.apache.flink.configuration.Configuration +import org.apache.flink.streaming.api.operators.OneInputStreamOperator +import org.apache.flink.streaming.api.transformations.OneInputTransformation +import org.apache.flink.table.dataformat.BaseRow +import org.apache.flink.table.functions.python.PythonFunctionInfo +import org.apache.flink.table.planner.calcite.FlinkTypeFactory +import org.apache.flink.table.planner.plan.nodes.common.CommonPythonCorrelate.PYTHON_TABLE_FUNCTION_OPERATOR_NAME +import org.apache.flink.table.planner.plan.nodes.logical.FlinkLogicalTableFunctionScan +import org.apache.flink.table.runtime.typeutils.BaseRowTypeInfo +import org.apache.flink.table.types.logical.RowType + +import scala.collection.mutable + +trait CommonPythonCorrelate extends CommonPythonBase { + def getPythonTableFunctionOperator( Review comment: protected? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message
flinkbot edited a comment on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message URL: https://github.com/apache/flink/pull/11219#issuecomment-591257565 ## CI report: * 75bddc5732f99c88a183b683660a85202ac8c5cb Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150597282) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] wuchong commented on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
wuchong commented on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591261168 @JingsongLi , good catch! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot commented on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message
flinkbot commented on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message URL: https://github.com/apache/flink/pull/11219#issuecomment-591257565 ## CI report: * 75bddc5732f99c88a183b683660a85202ac8c5cb UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11218: [FLINK-16285][network] Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument
flinkbot edited a comment on issue #11218: [FLINK-16285][network] Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument URL: https://github.com/apache/flink/pull/11218#issuecomment-591253355 ## CI report: * 7b8f569390e60c187f3282763813b44a43ff9cc9 Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/150596303) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-16274) Add typed builder methods for setting dynamic configuration on StatefulFunctionsAppContainers
[ https://issues.apache.org/jira/browse/FLINK-16274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-16274: --- Labels: pull-request-available (was: ) > Add typed builder methods for setting dynamic configuration on > StatefulFunctionsAppContainers > - > > Key: FLINK-16274 > URL: https://issues.apache.org/jira/browse/FLINK-16274 > Project: Flink > Issue Type: New Feature > Components: Stateful Functions, Test Infrastructure >Affects Versions: statefun-1.1 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai >Priority: Major > Labels: pull-request-available > > Excerpt from: > https://github.com/apache/flink-statefun/pull/32#discussion_r383644382 > Currently, you'd need to provide a complete {{Configuration}} as dynamic > properties when constructing a {{StatefulFunctionsAppContainers}}. > It'll be nicer if this is built like this: > {code} > public StatefulFunctionsAppContainers verificationApp = > new StatefulFunctionsAppContainers("sanity-verification", 2) > .withModuleGlobalConfiguration("kafka-broker", > kafka.getBootstrapServers()) > .withConfiguration(ConfigOption option, configValue) > {code} > And by default the {{StatefulFunctionsAppContainers}} just only has the > configs in the base template {{flink-conf.yaml}}. > This would require lazy construction of the containers on {{beforeTest}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink-statefun] tzulitai opened a new pull request #35: [FLINK-16274] Support inlined configuration in StatefulFunctionsAppContainers
tzulitai opened a new pull request #35: [FLINK-16274] Support inlined configuration in StatefulFunctionsAppContainers URL: https://github.com/apache/flink-statefun/pull/35 This PR is based on #34. The first 4 commits are not relevant. --- ### Goal This PR achieves the following things: - Change the `StatefulFunctionsAppContainers` to a more "builder-ish" approach, in which the user provides settings for how the containers are built, and only in the test's `before()` do we build the containers and application images. This allows a more fluent API to inline configuration settings. - Add inline configuration setting methods, like so: ``` @Rule public StatefulFunctionsAppContainers verificationApp = new StatefulFunctionsAppContainers("app-name", numWorkers) .withModuleGlobalConfiguration("key", "value"); .withConfiguration(ConfigOption option, T value); ``` This also eliminates the need for users of `StatefulFunctionsAppContainers` to maintain a `flink-conf.yaml` in their classpath, as they can now build the configuration fully programmatically with the new inline configuration methods. --- ### Verifying This only changes tests. Run `mvn clean verify -Prun-e2e-tests` to verify the changes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
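The lazy, fluent builder style described in the PR above can be sketched as follows. Every name here is a hypothetical stand-in, not the actual `StatefulFunctionsAppContainers` API: setters only record configuration, and the expensive resource is materialized once, on demand (as the PR does in `before()`):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AppContainersSketch {
    private final String appName;
    private final int numWorkers;
    // Insertion-ordered so the built description is deterministic.
    private final Map<String, String> config = new LinkedHashMap<>();

    public AppContainersSketch(String appName, int numWorkers) {
        this.appName = appName;
        this.numWorkers = numWorkers;
    }

    // Fluent setter: records the value and returns this for chaining.
    // Nothing is built yet, so settings can be added in any order.
    public AppContainersSketch withConfiguration(String key, String value) {
        config.put(key, value);
        return this;
    }

    // Called lazily, e.g. from a JUnit rule's before(): only now is the
    // "container" description actually materialized from the recorded state.
    public String build() {
        return appName + ":" + numWorkers + ":" + config;
    }

    public static void main(String[] args) {
        String built = new AppContainersSketch("sanity-verification", 2)
            .withConfiguration("kafka-broker", "broker:9092")
            .build();
        System.out.println(built); // prints sanity-verification:2:{kafka-broker=broker:9092}
    }
}
```

Deferring construction until `build()` is what makes a fluent configuration API possible on a JUnit rule: the rule object must exist at field-initialization time, before all settings are known.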
[GitHub] [flink] hwanju commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build
hwanju commented on issue #11196: [FLINK-16246][connector kinesis] Exclude AWS SDK MBean registry from Kinesis build URL: https://github.com/apache/flink/pull/11196#issuecomment-591253938 Although I am also not sure what the ramifications of excluding `SdkMBeanRegistrySupport` would be for other dependent classes, the no-op replacement seems safe and clever. Related to the `FlinkKinesisProducer#close()` solution, would it be viable to call `AwsSdkMetrics#unregisterMetricAdminMbean()` inside `BlobLibraryCacheManager#releaseClassLoader` to get that unregistered once all the tasks of the job are closed? But that seems like too specific a call. At any rate, we cannot be sure that removing this alone resolves the leak, right? (if I am reading FLINK-16142 correctly). AFAIR, KPL also has a statically launched daemon thread (the file age manager), which is not shut down on task close and holds on to classes loaded by the user class loader, leading to a leak. In that case, I am not sure how just removing this MBean registry would make things better. We have seen OOM kills with unlimited metaspace, but it seems to hit metaspace OOM before eating up all the memory as of 1.10 (which we have not run yet). Curious whether you have measured the number of class loaders before/after the fix. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot commented on issue #11218: [FLINK-16285][network] Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument
flinkbot commented on issue #11218: [FLINK-16285][network] Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument URL: https://github.com/apache/flink/pull/11218#issuecomment-591253355 ## CI report: * 7b8f569390e60c187f3282763813b44a43ff9cc9 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot commented on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message
flinkbot commented on issue #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message URL: https://github.com/apache/flink/pull/11219#issuecomment-591252807 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 75bddc5732f99c88a183b683660a85202ac8c5cb (Wed Feb 26 05:51:09 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-16257) Remove useless ResultPartitionID from AddCredit message
[ https://issues.apache.org/jira/browse/FLINK-16257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-16257: --- Labels: pull-request-available (was: ) > Remove useless ResultPartitionID from AddCredit message > --- > > Key: FLINK-16257 > URL: https://issues.apache.org/jira/browse/FLINK-16257 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Reporter: Zhijiang >Assignee: Zhijiang >Priority: Minor > Labels: pull-request-available > Fix For: 1.11.0 > > > The ResultPartitionID in the AddCredit message is never used on the upstream > side, so we can remove it to clean up the code. Doing so has two further > benefits: > # Reduce the total message size from 52 bytes to 20 bytes. > # Decouple the dependency on `InputChannel#getPartitionId`. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] zhijiangW opened a new pull request #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message
zhijiangW opened a new pull request #11219: [FLINK-16257][network] Remove useless ResultPartitionID from AddCredit message URL: https://github.com/apache/flink/pull/11219 ## What is the purpose of the change The `ResultPartitionID` in the AddCredit message is never used on the upstream side, so we can remove it to clean up the code. Doing so has two further benefits: 1. Reduce the total message size from 52 bytes to 20 bytes. 2. Decouple the dependency on `InputChannel#getPartitionId`. ## Brief change log - *Remove `ResultPartitionID` from the `AddCredit` message* ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
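The 52-byte-to-20-byte figure in the PR above is easy to sanity-check: an InputChannelID is two longs (16 bytes), the credit is an int (4 bytes), and a ResultPartitionID wraps two further 16-byte IDs (32 bytes). A minimal sketch of that arithmetic (the field layout here is an assumption for illustration, not Flink's actual wire format):

```java
public class AddCreditSize {
    // Assumed sizes, for illustration only: each ID is a pair of longs.
    static final int ID_BYTES = 2 * Long.BYTES;        // 16
    static final int CREDIT_BYTES = Integer.BYTES;     // 4

    // Old AddCredit: ResultPartitionID (two 16-byte IDs) + InputChannelID + credit.
    static int oldMessageSize() {
        return 2 * ID_BYTES + ID_BYTES + CREDIT_BYTES; // 32 + 16 + 4 = 52
    }

    // New AddCredit: InputChannelID + credit only.
    static int newMessageSize() {
        return ID_BYTES + CREDIT_BYTES;                // 16 + 4 = 20
    }

    public static void main(String[] args) {
        System.out.println(oldMessageSize() + " -> " + newMessageSize()); // 52 -> 20
    }
}
```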
[jira] [Updated] (FLINK-16272) Move Stateful Functions end-to-end test framework classes to common module
[ https://issues.apache.org/jira/browse/FLINK-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-16272: --- Labels: pull-request-available (was: ) > Move Stateful Functions end-to-end test framework classes to common module > -- > > Key: FLINK-16272 > URL: https://issues.apache.org/jira/browse/FLINK-16272 > Project: Flink > Issue Type: Task > Components: Stateful Functions, Test Infrastructure >Affects Versions: statefun-1.1 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai >Priority: Major > Labels: pull-request-available > > There are some end-to-end test utilities that other end-to-end Stateful > Function tests can benefit from, such as: > * {{StatefulFunctionsAppContainers}} > * {{KafkaIOVerifier}} > * {{KafkaProtobufSerDe}} > We should move this to a new common module under > {{statefun-end-to-end-tests}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink-statefun] tzulitai opened a new pull request #34: [FLINK-16272] Move e2e test utilities to a statefun-e2e-tests-common module
tzulitai opened a new pull request #34: [FLINK-16272] Move e2e test utilities to a statefun-e2e-tests-common module URL: https://github.com/apache/flink-statefun/pull/34 This PR achieves 2 simple things: - Move `StatefulFunctionsAppContainers` / `KafkaIOVerifier` / `ProtobufSerDe` to a common `statefun-e2e-tests-common` module so that new e2e tests can use them. - Rename all e2e related classes / packages / modules to `*e2e*`. End-to-end test modules living under `statefun-e2e-tests` should now be named `statefun-*-e2e`, and test classes should be named `*E2E.java`. --- ### Verifying Run `mvn clean verify -Prun-e2e-tests` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-16285) Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument
[ https://issues.apache.org/jira/browse/FLINK-16285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijiang updated FLINK-16285: - Fix Version/s: 1.11.0 > Refactor SingleInputGate#setInputChannel to remove > IntermediateResultPartitionID argument > - > > Key: FLINK-16285 > URL: https://issues.apache.org/jira/browse/FLINK-16285 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Reporter: Zhijiang >Assignee: Zhijiang >Priority: Minor > Labels: pull-request-available > Fix For: 1.11.0 > > Time Spent: 10m > Remaining Estimate: 0h > > The IntermediateResultPartitionID can be obtained directly from the > respective InputChannel, so we can remove it from the arguments of > SingleInputGate#setInputChannel to clean up the code. > This also helps simplify the unit tests and avoids passing an > IntermediateResultPartitionID that is inconsistent with the internal > ResultPartitionID that the respective InputChannel maintains. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-14807) Add Table#collect api for fetching data to client
[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045155#comment-17045155 ] Kurt Young commented on FLINK-14807: A side comment: if we want this API to fully deal with fault tolerance and consistency, it's equivalent to implementing an end-to-end exactly-once sink. > Add Table#collect api for fetching data to client > - > > Key: FLINK-14807 > URL: https://issues.apache.org/jira/browse/FLINK-14807 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.9.1 >Reporter: Jeff Zhang >Priority: Major > Labels: usability > Fix For: 1.11.0 > > Attachments: table-collect.png > > > Currently, it is very inconvenient for users to fetch the data of a Flink job unless > they specify a sink explicitly and then fetch data from this sink via its API (e.g. > write to an HDFS sink, then read the data from HDFS). However, most of the time users > just want to get the data and do whatever processing they want. So it is very > necessary for Flink to provide a Table#collect API for this purpose. > > Other APIs such as Table#head and Table#print would also be helpful. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on issue #11218: [FLINK-16285][network] Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument
flinkbot commented on issue #11218: [FLINK-16285][network] Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument URL: https://github.com/apache/flink/pull/11218#issuecomment-591248775 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit bb563801b8d0724c79847ed62c8a7add2043c48e (Wed Feb 26 05:39:44 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-16285) Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument
[ https://issues.apache.org/jira/browse/FLINK-16285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-16285: --- Labels: pull-request-available (was: ) > Refactor SingleInputGate#setInputChannel to remove > IntermediateResultPartitionID argument > - > > Key: FLINK-16285 > URL: https://issues.apache.org/jira/browse/FLINK-16285 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Reporter: Zhijiang >Assignee: Zhijiang >Priority: Minor > Labels: pull-request-available > > The IntermediateResultPartitionID can be obtained directly from the > respective InputChannel, so we can remove it from the arguments of > SingleInputGate#setInputChannel to clean up the code. > This also helps simplify the unit tests and avoids passing an > IntermediateResultPartitionID that is inconsistent with the internal > ResultPartitionID that the respective InputChannel maintains. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] zhijiangW opened a new pull request #11218: [FLINK-16285][network] Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument
zhijiangW opened a new pull request #11218: [FLINK-16285][network] Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument URL: https://github.com/apache/flink/pull/11218 ## What is the purpose of the change *The `IntermediateResultPartitionID` can be obtained directly from the respective `InputChannel`, so we can remove it from the arguments of `SingleInputGate#setInputChannel` to clean up the code.* *This also helps simplify the unit tests and avoids passing an `IntermediateResultPartitionID` that is inconsistent with the internal `ResultPartitionID` that the respective `InputChannel` maintains.* ## Brief change log - *Remove the `IntermediateResultPartitionID` argument from `SingleInputGate#setInputChannel`* - *Adjust the related unit tests* ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
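The idea behind this refactoring can be illustrated with a hedged sketch: instead of accepting a separate partition-ID argument that may disagree with the channel, the gate derives the map key from the channel itself, so the two can never be inconsistent. The types below are illustrative placeholders, not Flink's real classes:

```java
import java.util.HashMap;
import java.util.Map;

public class GateSketch {
    // Placeholder stand-ins for Flink's ID and channel types.
    static final class PartitionId {
        final String id;
        PartitionId(String id) { this.id = id; }
    }

    static final class Channel {
        private final PartitionId partitionId;
        Channel(PartitionId partitionId) { this.partitionId = partitionId; }
        PartitionId getPartitionId() { return partitionId; }
    }

    final Map<String, Channel> channels = new HashMap<>();

    // After the refactor there is no separate ID argument: the map key is
    // derived from the channel itself, removing a source of inconsistency.
    void setInputChannel(Channel channel) {
        channels.put(channel.getPartitionId().id, channel);
    }

    public static void main(String[] args) {
        GateSketch gate = new GateSketch();
        gate.setInputChannel(new Channel(new PartitionId("partition-1")));
        System.out.println(gate.channels.containsKey("partition-1")); // true
    }
}
```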
[GitHub] [flink] flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem
flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem URL: https://github.com/apache/flink/pull/10890#issuecomment-575711704 ## CI report: * 1818f86b4da88d37a6d1f0c4a7b4b1f4fd41f961 Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/150593392) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (FLINK-16285) Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument
Zhijiang created FLINK-16285: Summary: Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument Key: FLINK-16285 URL: https://issues.apache.org/jira/browse/FLINK-16285 Project: Flink Issue Type: Improvement Components: Runtime / Network Reporter: Zhijiang Assignee: Zhijiang The IntermediateResultPartitionID can be obtained directly from the respective InputChannel, so we can remove it from the arguments of SingleInputGate#setInputChannel to clean up the code. This also helps simplify the unit tests and avoids passing an IntermediateResultPartitionID that is inconsistent with the internal ResultPartitionID that the respective InputChannel maintains. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-16284) Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument
Zhijiang created FLINK-16284: Summary: Refactor SingleInputGate#setInputChannel to remove IntermediateResultPartitionID argument Key: FLINK-16284 URL: https://issues.apache.org/jira/browse/FLINK-16284 Project: Flink Issue Type: Improvement Components: Runtime / Network Reporter: Zhijiang Assignee: Zhijiang The IntermediateResultPartitionID can be obtained directly from the respective InputChannel, so we can remove it from the arguments of SingleInputGate#setInputChannel to clean up the code. This also helps simplify the unit tests and avoids passing an IntermediateResultPartitionID that is inconsistent with the internal ResultPartitionID that the respective InputChannel maintains. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation
flinkbot edited a comment on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation URL: https://github.com/apache/flink/pull/11203#issuecomment-590636739 ## CI report: * cad308b6fe40defa5b6b33a74aeb8bc6e99a6759 UNKNOWN * 10a803dd074ee8be7ee6e15f93eeb417e2398392 Travis: [SUCCESS](https://travis-ci.com/flink-ci/flink/builds/150588163) Azure: [SUCCESS](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5595) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem
flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem URL: https://github.com/apache/flink/pull/10890#issuecomment-575711704 ## CI report: * 7c3daddc8a7a168912b63b528f719c967a2b76b8 Travis: [SUCCESS](https://travis-ci.com/flink-ci/flink/builds/150448826) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5559) * 1818f86b4da88d37a6d1f0c4a7b4b1f4fd41f961 Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150593392) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem
flinkbot edited a comment on issue #10890: [FLINK-15198][Deployment / Mesos] remove mesos.resourcemanager.tasks.mem URL: https://github.com/apache/flink/pull/10890#issuecomment-575711704 ## CI report: * 7c3daddc8a7a168912b63b528f719c967a2b76b8 Travis: [SUCCESS](https://travis-ci.com/flink-ci/flink/builds/150448826) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5559) * 1818f86b4da88d37a6d1f0c4a7b4b1f4fd41f961 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591220639 ## CI report: * 3b78c9b691fc40bd24f50103b704bf9e1c6396ea Travis: [SUCCESS](https://travis-ci.com/flink-ci/flink/builds/150586977) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5594) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Comment Edited] (FLINK-16069) Creation of TaskDeploymentDescriptor can block main thread for long time
[ https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045135#comment-17045135 ] Zhu Zhu edited comment on FLINK-16069 at 2/26/20 4:43 AM: -- Making the building of deployment descriptors more static does not seem to be easy. The critical path is {{TaskDeploymentDescriptorFactory#createInputGateDeploymentDescriptors() -> getConsumedPartitionShuffleDescriptors()}}, which generates the {{InputGateDeploymentDescriptor}} for a task. This relies on {{ShuffleDescriptor}}s, which are created from the producer tasks' {{ResultPartitionDeploymentDescriptor}}s. A {{ResultPartitionDeploymentDescriptor}}, however, is created lazily, i.e. only after its producer has acquired a slot to determine the location. Besides that, even if we could create the {{InputGateDeploymentDescriptor}} in a static way and update it with the {{ResultPartitionDeploymentDescriptor}} when scheduling, it seems we would still not avoid the 8000x8000 complexity of computing the {{ShuffleDescriptor}}s. I'm thinking about whether we can add a shortcut for the ALL-to-ALL pattern by caching generated {{ShuffleDescriptor}}s, e.g. in the case A(parallelism=8000) -> B(parallelism=8000), each {{InputGateDeploymentDescriptor}} of the B instances should contain the same 8000 {{ShuffleDescriptor}}s, so there is no need to generate those 8000 ShuffleDescriptors 8000 times. was (Author: zhuzh): Making building deployment descriptor more static seems not to be an easy work. The critical path is the {{TaskDeploymentDescriptorFactory#createInputGateDeploymentDescriptors() -> getConsumedPartitionShuffleDescriptors()}} which generates {{InputGateDeploymentDescriptor}}s for a task. This relies on {{ShuffleDescriptor}}s which are created from the producer tasks's {{ResultPartitionDeploymentDescriptor}}s. {{ResultPartitionDeploymentDescriptor}}, however, is created lazily, i.e. only after its producer has acquired the slot to determine the location. 
Besides that, even if we can create the {{InputGateDeploymentDescriptor}} in a static way and update it with the {{ResultPartitionDeploymentDescriptor}} when scheduling, seems we do not avoid the 8000x8000 complexity to compute the {{ShuffleDescriptor}}s. I'm thinking whether we can add a shortcut for the ALL-to-ALL pattern by caching generated {{ShuffleDescriptor}}s. e.g. In the case that A(parallelism=8000) -> B(parallelism=8000), each {{InputGateDeploymentDescriptor}} of B instances should contain the same 8000 {{ShuffleDescriptor}}s. So it is not needed to generate these 8000 shuffleDescriptors for 8000 times. > Creation of TaskDeploymentDescriptor can block main thread for long time > > > Key: FLINK-16069 > URL: https://issues.apache.org/jira/browse/FLINK-16069 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Reporter: huweihua >Priority: Major > > Deploying tasks can take a long time when we submit a high-parallelism job, > and Execution#deploy runs in the main thread, so it blocks the JobMaster from > processing other Akka messages, such as heartbeats. The creation of the > TaskDeploymentDescriptor takes most of the time. We can move the creation into a future. > For example, for a job [source(8000)->sink(8000)], moving all 16000 tasks from > SCHEDULED to DEPLOYING took more than 1 minute. This caused the TaskManager > heartbeat to time out and the job never succeeded. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-16069) Creation of TaskDeploymentDescriptor can block main thread for long time
[ https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045135#comment-17045135 ] Zhu Zhu commented on FLINK-16069: - Making the building of deployment descriptors more static does not seem to be easy. The critical path is {{TaskDeploymentDescriptorFactory#createInputGateDeploymentDescriptors() -> getConsumedPartitionShuffleDescriptors()}}, which generates the {{InputGateDeploymentDescriptor}}s for a task. This relies on {{ShuffleDescriptor}}s, which are created from the producer tasks' {{ResultPartitionDeploymentDescriptor}}s. A {{ResultPartitionDeploymentDescriptor}}, however, is created lazily, i.e. only after its producer has acquired a slot to determine the location. Besides that, even if we could create the {{InputGateDeploymentDescriptor}} in a static way and update it with the {{ResultPartitionDeploymentDescriptor}} when scheduling, it seems we would still not avoid the 8000x8000 complexity of computing the {{ShuffleDescriptor}}s. I'm thinking about whether we can add a shortcut for the ALL-to-ALL pattern by caching generated {{ShuffleDescriptor}}s, e.g. in the case A(parallelism=8000) -> B(parallelism=8000), each {{InputGateDeploymentDescriptor}} of the B instances should contain the same 8000 {{ShuffleDescriptor}}s, so there is no need to generate those 8000 ShuffleDescriptors 8000 times. > Creation of TaskDeploymentDescriptor can block main thread for long time > > > Key: FLINK-16069 > URL: https://issues.apache.org/jira/browse/FLINK-16069 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Reporter: huweihua >Priority: Major > > Deploying tasks can take a long time when we submit a high-parallelism job, > and Execution#deploy runs in the main thread, so it blocks the JobMaster from > processing other Akka messages, such as heartbeats. The creation of the > TaskDeploymentDescriptor takes most of the time. We can move the creation into a future. 
> For example, for a job [source(8000)->sink(8000)], moving all 16000 tasks from > SCHEDULED to DEPLOYING took more than 1 minute. This caused the TaskManager > heartbeat to time out and the job never succeeded. -- This message was sent by Atlassian Jira (v8.3.4#803005)
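The caching shortcut discussed in this thread can be sketched as a per-result memo: in an ALL-to-ALL edge every consumer references the same set of producer partitions, so the descriptor array can be computed once and shared, turning an O(consumers x producers) cost into O(producers). A hedged sketch with placeholder types (not Flink's actual classes):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class ShuffleDescriptorCache {
    private final Map<String, Object[]> cache = new HashMap<>();

    // For an ALL-to-ALL IntermediateResult, compute the descriptor array once
    // and hand the same instance to every consumer's gate descriptor.
    Object[] getOrCompute(String resultId, Supplier<Object[]> compute) {
        return cache.computeIfAbsent(resultId, key -> compute.get());
    }

    public static void main(String[] args) {
        ShuffleDescriptorCache cache = new ShuffleDescriptorCache();
        AtomicInteger computations = new AtomicInteger();
        // 8000 consumers ask for the descriptors of the same result...
        for (int consumer = 0; consumer < 8000; consumer++) {
            cache.getOrCompute("result-A", () -> {
                computations.incrementAndGet();
                return new Object[8000]; // one slot per producer partition
            });
        }
        // ...but the expensive computation runs only once.
        System.out.println(computations.get()); // 1
    }
}
```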
[GitHub] [flink] bowenli86 edited a comment on issue #11175: [FLINK-16197][hive] Failed to query partitioned table when partition …
bowenli86 edited a comment on issue #11175: [FLINK-16197][hive] Failed to query partitioned table when partition … URL: https://github.com/apache/flink/pull/11175#issuecomment-591232361 I have a slightly different opinion on this. Though it might mitigate the problem for users, we are indeed trying to hack our way around invalid inputs passed to our APIs. E.g. Flink's file source will fail if the target is missing, rather than mitigating for users. Maybe @KurtYoung or @JingsongLi can give some opinions and help to merge if they think this is a proper fix? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] bowenli86 commented on issue #11175: [FLINK-16197][hive] Failed to query partitioned table when partition …
bowenli86 commented on issue #11175: [FLINK-16197][hive] Failed to query partitioned table when partition … URL: https://github.com/apache/flink/pull/11175#issuecomment-591232361 I have a slightly different opinion on this. Though it might mitigate the problem for users, we are indeed trying to hack our way around invalid inputs passed to our APIs. E.g. Flink's file source will fail if the target is missing, rather than mitigating for users. I'm generally neutral on this PR. Maybe @KurtYoung or @JingsongLi can give some opinions and help to merge if they think this is a proper fix? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation
flinkbot edited a comment on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation URL: https://github.com/apache/flink/pull/11203#issuecomment-590636739 ## CI report: * 66f60932087c6c77fff749c6670f64951db3a40f Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/150406289) * cad308b6fe40defa5b6b33a74aeb8bc6e99a6759 UNKNOWN * 10a803dd074ee8be7ee6e15f93eeb417e2398392 Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150588163) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591220639 ## CI report: * 3b78c9b691fc40bd24f50103b704bf9e1c6396ea Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150586977) Azure: [FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5594) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation
flinkbot edited a comment on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation URL: https://github.com/apache/flink/pull/11203#issuecomment-590636739 ## CI report: * 66f60932087c6c77fff749c6670f64951db3a40f Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/150406289) * cad308b6fe40defa5b6b33a74aeb8bc6e99a6759 UNKNOWN * 10a803dd074ee8be7ee6e15f93eeb417e2398392 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Comment Edited] (FLINK-14807) Add Table#collect api for fetching data to client
[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033373#comment-17033373 ] Caizhi Weng edited comment on FLINK-14807 at 2/26/20 4:01 AM: -- Hi dear Flink community, I'm digging this up because I have some thoughts on implementing this. I'll post my thoughts below and I'd love to hear your opinions. From my understanding, this problem has two key points. One is where to store the (possibly) never-ending results, and the other is that task managers cannot directly communicate with clients under some environments like k8s or yarn. For the never-ending results, back pressure will work to limit the size of data in the whole cluster. For the communication between task managers and clients, the job manager must be the man in the middle, as clients and task managers are guaranteed to directly communicate with the job manager. So I come up with the following design. The new sink will only have a parallelism of 1. The class names and API names below are just placeholders. !table-collect.png|width=600! # When the sink is initialized in a task manager, it creates a socket server for providing the query results. The IP and port of the socket server are given to the job master by the existing GlobalAggregateManager class. # When the client wants to fetch a portion of the query result, it contacts the JobManager. # The JobManager receives the RPC call and asks the socket server for results with a maximum size. # The socket server returns some results to the JobMaster. # The JobMaster forwards the results to the client. Some questions for the design above: * TM memory is protected by back pressure. Do we have to introduce a new config option to set the maximum memory usage of the sink? >> Yes * What if the client disconnects / does not connect? >> The job will not finish as the sink isn't finished. 
The sink blocks on the invoke method if its memory is full, or blocks on the close method if not all results have been read by the client. We can also add a timeout check to the sink and let the sink fail if the client goes away for a certain period of time. * How to deal with retract / upsert streams? >> The return type will be Tuple2 where the first boolean value indicates whether this row is an appending row or a retracting row. * Is the 1st step necessary? >> Yes, because the port of the socket server is unknown before it is created. Some problems to discuss: * Is the whole design necessary? Why don't we use accumulators to store and send results? >> Accumulators cannot deal with large results. But apart from this, accumulators seem to be OK. We can limit the maximum number of rows provided by Table#collect and use accumulators to send results to the JobMaster with each TM heartbeat. After collecting enough results the client can cancel the job. The biggest problem is that for streaming jobs we might have to wait until the next heartbeat (which is 10s by default) to get the results and decide whether to cancel the job. * What if the job restarts? >> This is a question about what kind of API we want to provide, so it is actually a question about motivation. Do we really need a (nearly) production-ready API, or an API just for demo and testing? ** If we can tolerate an at-least-once API this does not seem to be a problem. We can attach the index and the version (increased with each job restart) to each result when sending it back to the client and let the client deal with all the versioning. ** If we want an exactly-once API I'm afraid this is very difficult. For batch jobs we have to use external storage, and for streaming jobs we might have to force users to use checkpoints. Back pressure from the sink may also affect checkpoints. 
was (Author: tsreaper): Hi dear Flink community, I'm digging this up because I have some thoughts on implementing this. I'll post my thoughts below and I'd love to hear your opinions. >From my understanding, this problem has two key points. One is that where to >store the (possibly) never-ending results, and the other is that task managers >can not directly communicate with clients under some environments like k8s or >yarn. For the never-ending results, back pressuring will work to limit the >size of data in the whole cluster. For the communication between task managers >and clients, job manager must be the man in the middle as clients and task >managers are guaranteed to directly communicate with job manager. So I come up with the following design. The new sink will only have 1 parallelism. The class names and API names below are just placeholders. !table-collect.png|width=600! # When the sink is initialized in task managers, it will create a socket server for providing the query results. The IP and port of the socket server will be given to the
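The retract/upsert handling described in the comment above (each emitted element pairs a boolean with a row, where the boolean marks the row as an append or a retraction) can be illustrated with a minimal, self-contained sketch. This is plain Java with no Flink dependencies; the class and method names are illustrative, not Flink API:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal illustration of the changelog encoding discussed above:
// each element is a (boolean, row) pair, where true means "append this
// row" and false means "retract a previously emitted row".
public class ChangelogFold {

    static final class Change {
        final boolean isAppend;
        final String row;

        Change(boolean isAppend, String row) {
            this.isAppend = isAppend;
            this.row = row;
        }
    }

    // Materialize the current result from a changelog: appends add a row,
    // retractions remove one matching occurrence.
    static List<String> materialize(List<Change> changes) {
        List<String> result = new ArrayList<>();
        for (Change c : changes) {
            if (c.isAppend) {
                result.add(c.row);
            } else {
                result.remove(c.row); // removes the first matching occurrence
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Change> log = new ArrayList<>();
        log.add(new Change(true, "a=1"));
        log.add(new Change(true, "a=2"));
        log.add(new Change(false, "a=1")); // retraction of the earlier row
        System.out.println(materialize(log)); // prints [a=2]
    }
}
```

A client consuming such a retract stream would apply essentially this fold to materialize the current table from the changelog it receives.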
[jira] [Commented] (FLINK-16267) Flink uses more memory than taskmanager.memory.process.size in Kubernetes
[ https://issues.apache.org/jira/browse/FLINK-16267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045126#comment-17045126 ] Yun Tang commented on FLINK-16267: -- [~czchen] From the taskmanager log I cannot see how much memory is reserved on the RocksDB side; it should be about 1423MB, and you can search your taskmanager log for lines like "Obtained shared RocksDB cache of size xxx bytes" to confirm. First of all, let us enable the block cache usage metrics on the RocksDB side by setting [state.backend.rocksdb.metrics.block-cache-usage|https://ci.apache.org/projects/flink/flink-docs-master/ops/config.html#state-backend-rocksdb-metrics-block-cache-usage] to true to see the actual RocksDB memory usage. You can add that to the memory metrics {{Status.JVM.Memory.Heap.Used}}, {{Status.JVM.Memory.NonHeap.Used}}, {{Status.JVM.Memory.Direct.MemoryUsed}} and {{Status.JVM.Memory.Mapped.MemoryUsed}} to see how much memory is used in total per task manager. This aims to detect whether RocksDB causes the OOMKilled events. Moreover, please also provide some information related to RocksDB: # How many states do you use per operator? # Do you use RocksDB map state and often iterate over it? # How many RocksDB instances are there per slot? This can be found by counting how many lines of "{{Obtained shared RocksDB cache of size}}" are printed within one slot. > Flink uses more memory than taskmanager.memory.process.size in Kubernetes > - > > Key: FLINK-16267 > URL: https://issues.apache.org/jira/browse/FLINK-16267 > Project: Flink > Issue Type: Bug > Components: Runtime / Task >Affects Versions: 1.10.0 >Reporter: ChangZhuo Chen (陳昌倬) >Priority: Major > Attachments: flink-conf_1.10.0.yaml, flink-conf_1.9.1.yaml, > oomkilled_taskmanager.log > > > This issue is from > [https://stackoverflow.com/questions/60336764/flink-uses-more-memory-than-taskmanager-memory-process-size-in-kubernetes] > h1. 
Description > * In Flink 1.10.0, we try to use `taskmanager.memory.process.size` to limit > the resource used by taskmanager to ensure they are not killed by Kubernetes. > However, we still get lots of taskmanager `OOMKilled`. The setup is in the > following section. > * The taskmanager log is in attachment [^oomkilled_taskmanager.log]. > h2. Kubernete > * The Kubernetes setup is the same as described in > [https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/kubernetes.html]. > * The following is resource configuration for taskmanager deployment in > Kubernetes: > {{resources:}} > {{ requests:}} > {{ cpu: 1000m}} > {{ memory: 4096Mi}} > {{ limits:}} > {{ cpu: 1000m}} > {{ memory: 4096Mi}} > h2. Flink Docker > * The Flink docker is built by the following Docker file. > {{FROM flink:1.10-scala_2.11}} > RUN mkdir -p /opt/flink/plugins/s3 && > ln -s /opt/flink/opt/flink-s3-fs-presto-1.10.0.jar /opt/flink/plugins/s3/ > {{RUN ln -s /opt/flink/opt/flink-metrics-prometheus-1.10.0.jar > /opt/flink/lib/}} > h2. Flink Configuration > * The following are all memory related configurations in `flink-conf.yaml` > in 1.10.0: > {{jobmanager.heap.size: 820m}} > {{taskmanager.memory.jvm-metaspace.size: 128m}} > {{taskmanager.memory.process.size: 4096m}} > * We use RocksDB and we don't set `state.backend.rocksdb.memory.managed` in > `flink-conf.yaml`. > ** Use S3 as checkpoint storage. > * The code uses DateStream API > ** input/output are both Kafka. > h2. Project Dependencies > * The following is our dependencies. 
> {{val flinkVersion = "1.10.0"}}{{libraryDependencies += > "com.squareup.okhttp3" % "okhttp" % "4.2.2"}} > {{libraryDependencies += "com.typesafe" % "config" % "1.4.0"}} > {{libraryDependencies += "joda-time" % "joda-time" % "2.10.5"}} > {{libraryDependencies += "org.apache.flink" %% "flink-connector-kafka" % > flinkVersion}} > {{libraryDependencies += "org.apache.flink" % "flink-metrics-dropwizard" % > flinkVersion}} > {{libraryDependencies += "org.apache.flink" %% "flink-scala" % flinkVersion > % "provided"}} > {{libraryDependencies += "org.apache.flink" %% "flink-statebackend-rocksdb" > % flinkVersion % "provided"}} > {{libraryDependencies += "org.apache.flink" %% "flink-streaming-scala" % > flinkVersion % "provided"}} > {{libraryDependencies += "org.json4s" %% "json4s-jackson" % "3.6.7"}} > {{libraryDependencies += "org.log4s" %% "log4s" % "1.8.2"}} > {{libraryDependencies += "org.rogach" %% "scallop" % "3.3.1"}} > h2. Previous Flink 1.9.1 Configuration > * The configuration we used in Flink 1.9.1 are the following. It does not > have `OOMKilled`. > h3. Kubernetes > {{resources:}} > {{ requests:}} > {{ cpu: 1200m}} > {{ memory: 2G}} > {{ limits:}} > {{ cpu: 1500m}} > {{ memory: 2G}} > h3. Flink 1.9.1 >
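The debugging setup suggested in the comment above can be sketched as a small flink-conf.yaml fragment. Option names follow the Flink 1.10 configuration reference; the values are illustrative, not a recommendation, and the Prometheus reporter line is just one possible way to export the metrics:

```yaml
# Total TM memory budget, as in the reporter's setup.
taskmanager.memory.process.size: 4096m
taskmanager.memory.jvm-metaspace.size: 128m

# Expose RocksDB block cache usage as a metric, as suggested above.
state.backend.rocksdb.metrics.block-cache-usage: true

# Illustrative: export metrics via the Prometheus reporter shipped in opt/.
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
```

With this in place, the block-cache-usage metric can be summed with the Status.JVM.Memory.* metrics to compare against the Kubernetes memory limit.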
[jira] [Comment Edited] (FLINK-14807) Add Table#collect api for fetching data to client
[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033373#comment-17033373 ] Caizhi Weng edited comment on FLINK-14807 at 2/26/20 3:55 AM: -- Hi dear Flink community, I'm digging this up because I have some thoughts on implementing this. I'll post my thoughts below and I'd love to hear your opinions. >From my understanding, this problem has two key points. One is that where to >store the (possibly) never-ending results, and the other is that task managers >can not directly communicate with clients under some environments like k8s or >yarn. For the never-ending results, back pressuring will work to limit the >size of data in the whole cluster. For the communication between task managers >and clients, job manager must be the man in the middle as clients and task >managers are guaranteed to directly communicate with job manager. So I come up with the following design. The new sink will only have 1 parallelism. The class names and API names below are just placeholders. !table-collect.png|width=600! # When the sink is initialized in task managers, it will create a socket server for providing the query results. The IP and port of the socket server will be given to the job master by the existing GlobalAggregateManager class. # When client want to fetch a portion of the query result, it will contact the JobManager. # The JobManager receives the RPC call and ask the socket server for results with a maximum size. # The socket server returns some results to the JobMaster. # JobMaster forwards the result to the client. Some Q for the design above: * TM memories are protected by backpressuring. Do we have to introduce a new config option to set the maximum memory usage of the sink? >> Yes * What if the client disconnects / does not connect? >> The job will not finish as the sink isn't finished. 
The sink blocks on the >> invoke method if its memory is full or blocks on the close method if not >> all results have been read by the client. We can also add a timeout check >> to the sink and let the sink fail if client goes away for a certain period >> of time. * How to deal with retract / upsert streams? >> The return type will be Tuple2 where the first boolean value >> indicates this row is an appending row or a retracting row. * Is the 1st step necessary? >> Yes, because the port of the socket server is unknown before created. Some problems to discuss * Is the whole design necessary? Why don't we use accumulators to store and send results? >> Accumulators cannot deal with large results. But apart from this >> accumulators seem to be OK. We can limit the maximum number of rows >> provided by Table#collect and use accumulators to send results to the >> JobMaster with each TM heartbeat. After collecting enough results the >> client can cancel the job. The biggest problem is that for streaming jobs >> we might have to wait until the next heartbeat (which is 10s by default) to >> get the results and decide whether to cancel the job. * What if the job restarts? >> This is a problem about what kind of API we want to provide. ** If we can tolerate an at least once API this does not seem to be a problem. We can attach the index and the version (increased with each job restart) of each results when sending them back to the client and let the client deal with all these version things. ** If we want an exactly once API I'm afraid this is very difficult. For batch jobs we have to use external storage and for streaming jobs we might have to force users to use checkpoints. Backpressures from sink may also affect checkpoints. was (Author: tsreaper): Hi dear Flink community, I'm digging this up because I have some thoughts on implementing this. I'll post my thoughts below and I'd love to hear your opinions. >From my understanding, this problem has two key points. 
One is that where to >store the (possibly) never-ending results, and the other is that task managers >can not directly communicate with clients under some environments like k8s or >yarn. For the never-ending results, back pressuring will work to limit the >size of data in the whole cluster. For the communication between task managers >and clients, job manager must be the man in the middle as clients and task >managers are guaranteed to directly communicate with job manager. So I come up with the following design. The new sink will only have 1 parallelism. The class names and API names below are just placeholders. !table-collect.png|width=600! # When the sink is initialized in task managers, it will create a socket server for providing the query results. The IP and port of the socket server will be given to the job master by the existing GlobalAggregateManager class. # When client want to fetch a portion of the query result, it will contact the
[jira] [Issue Comment Deleted] (FLINK-14807) Add Table#collect api for fetching data to client
[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caizhi Weng updated FLINK-14807: Comment: was deleted (was: Thanks for the comment [~godfreyhe]. I'll list more details about my design below. * How can sink tell the REST server its address and port? >> This is the hardest part of the design. I actually haven't come up with a >> very good solution. I now have three options listed below: ** Option A. [FLIP-27|https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface] will introduce Operator Coordinator which can perfectly solve this issue. But according to [~jqin] we won't have this feature until 1.11. ** Option B. We can use accumulators to tell the job manager the address and port of the socket server. But as accumulators are sent with heartbeats and the default interval between heartbeats are 10s, this will greatly impact small jobs. ** Option C. Extract TaskConfig from JobGraph in job manager and insert server information into it. This requires the socket server to start in the job manager instead of sink, and it seems to be quite hacky... * What if a job without ordering restarts? >> As streaming jobs are backed by checkpoints this is not a problem. For >> batch jobs I'm afraid we'll have to introduce a special element in the >> resulting iterator indicating that the previous results provided are now >> invalid.) > Add Table#collect api for fetching data to client > - > > Key: FLINK-14807 > URL: https://issues.apache.org/jira/browse/FLINK-14807 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.9.1 >Reporter: Jeff Zhang >Priority: Major > Labels: usability > Fix For: 1.11.0 > > Attachments: table-collect.png > > > Currently, it is very unconvinient for user to fetch data of flink job unless > specify sink expclitly and then fetch data from this sink via its api (e.g. > write to hdfs sink, then read data from hdfs). 
However, most of time user > just want to get the data and do whatever processing he want. So it is very > necessary for flink to provide api Table#collect for this purpose. > > Other apis such as Table#head, Table#print is also helpful. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-14807) Add Table#collect api for fetching data to client
[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033373#comment-17033373 ] Caizhi Weng edited comment on FLINK-14807 at 2/26/20 3:51 AM: -- Hi dear Flink community, I'm digging this up because I have some thoughts on implementing this. I'll post my thoughts below and I'd love to hear your opinions. >From my understanding, this problem has two key points. One is that where to >store the (possibly) never-ending results, and the other is that task managers >can not directly communicate with clients under some environments like k8s or >yarn. For the never-ending results, back pressuring will work to limit the >size of data in the whole cluster. For the communication between task managers >and clients, job manager must be the man in the middle as clients and task >managers are guaranteed to directly communicate with job manager. So I come up with the following design. The new sink will only have 1 parallelism. The class names and API names below are just placeholders. !table-collect.png|width=600! # When the sink is initialized in task managers, it will create a socket server for providing the query results. The IP and port of the socket server will be given to the job master by the existing GlobalAggregateManager class. # When client want to fetch a portion of the query result, it will contact the JobManager. # The JobManager receives the RPC call and ask the socket server for results with a maximum size. # The socket server returns some results to the JobMaster. # JobMaster forwards the result to the client. Some Q for the design above: * TM memories are protected by backpressuring. Do we have to introduce a new config option to set the maximum memory usage of the sink? >> Yes * What if the client disconnects / does not connect? >> The job will not finish as the sink isn't finished. 
The sink blocks on the >> invoke method if its memory is full or blocks on the close method if not >> all results have been read by the client. * How to deal with retract / upsert streams? >> The return type will be Tuple2 where the first boolean value >> indicates this row is an appending row or a retracting row. * Is the 1st step necessary? >> Yes, because the port of the socket server is unknown before created. Some problems to discuss * Is the whole design necessary? Why don't we use accumulators to store and send results? >> Accumulators cannot deal with large results. But apart from this >> accumulators seem to be OK. We can limit the maximum number of rows >> provided by Table#collect and use accumulators to send results to the >> JobMaster with each TM heartbeat. After collecting enough results the >> client can cancel the job. The biggest problem is that for streaming jobs >> we might have to wait until the next heartbeat (which is 10s by default) to >> get the results and decide whether to cancel the job. * What if the job restarts? >> This is a problem about what kind of API we want to provide. ** If we can tolerate an at least once API this does not seem to be a problem. We can attach the index and the version (increased with each job restart) of each results when sending them back to the client and let the client deal with all these version things. ** If we want an exactly once API I'm afraid this is very difficult. For batch jobs we have to use external storage and for streaming jobs we might have to force users to use checkpoints. Backpressures from sink may also affect checkpoints. was (Author: tsreaper): Hi dear Flink community, I'm digging this up because I have some thoughts on implementing this. I'll post my thoughts below and I'd love to hear your opinions. >From my understanding, this problem has two key points. 
One is that where to >store the (possibly) never-ending results, and the other is that task managers >can directly communicate with clients under some environments like k8s or >yarn. For the never-ending results, back pressuring will work to limit the >size of data in the whole cluster. For the communication between task managers >and clients, job manager must be the man in the middle as clients and task >managers are guaranteed to directly communicate with job manager. As REST API is the main communication method between clients and job managers, I'm going to introduce two new internal REST API and a new sink to do the job. The new sink will only have 1 parallelism. The class names and API names below are just placeholders. !table-collect.png|width=600! # When the sink is initialized in task managers, it will create a socket server for providing the query results. The IP and port of the socket server will be given to the REST server by a REST API call. # When client want to fetch a portion of the query result, it will provide the job id and a token to the REST server. Here
[GitHub] [flink] dianfu commented on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation
dianfu commented on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation URL: https://github.com/apache/flink/pull/11203#issuecomment-591225209 @HuangXingBo Thanks for the PR. LGTM.
[GitHub] [flink] zhuzhurk commented on issue #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order
zhuzhurk commented on issue #11187: [FLINK-16234]Use LinkedHashSet for a deterministic iteration order URL: https://github.com/apache/flink/pull/11187#issuecomment-591225130 Thanks for reporting this issue and trying to fix it @cpugputpu . I agree with @StephanEwen that we should fix the tests instead (`testSlotSharingOnAllVerticesInSameSlotSharingGroupByDefaultEnabled` and `testSlotSharingOnAllVerticesInSameSlotSharingGroupByDefaultDisabled`). We should not assume that the topological order of the sources is the same as the order in which they are added to the StreamGraph. To fix those problematic tests, one option is to find the wanted vertex via its name rather than via its index. @cpugputpu do you want to fix it?
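The name-based lookup suggested above can be sketched in isolation. This is plain Java; `Vertex` is a stand-in for Flink's JobVertex/StreamNode, not the real API:

```java
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

// Sketch of the suggested test fix: look a graph vertex up by its name
// instead of by a positional index, so the assertion no longer depends on
// the (non-deterministic) order in which sources are added to the graph.
public class FindByName {

    static final class Vertex {
        final String name;
        final int parallelism;

        Vertex(String name, int parallelism) {
            this.name = name;
            this.parallelism = parallelism;
        }
    }

    // Returns the first vertex with the given name, failing loudly if absent.
    static Vertex byName(List<Vertex> vertices, String name) {
        return vertices.stream()
                .filter(v -> v.name.equals(name))
                .findFirst()
                .orElseThrow(() -> new NoSuchElementException("no vertex named " + name));
    }

    public static void main(String[] args) {
        // The list order may vary between runs in the real graph; a lookup
        // by name yields the same vertex either way.
        List<Vertex> graph = Arrays.asList(
                new Vertex("Source: kafka", 2),
                new Vertex("map", 4));
        System.out.println(byName(graph, "map").parallelism); // prints 4
    }
}
```

In the actual tests the same idea applies: assert on the vertex found by name, not on `vertices.get(0)`.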
[GitHub] [flink] flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
flinkbot edited a comment on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591220639 ## CI report: * 3b78c9b691fc40bd24f50103b704bf9e1c6396ea Travis: [PENDING](https://travis-ci.com/flink-ci/flink/builds/150586977)
[GitHub] [flink] flinkbot edited a comment on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation
flinkbot edited a comment on issue #11203: [FLINK-16251][python] Optimize the cost of function call in ScalarFunctionOpertation URL: https://github.com/apache/flink/pull/11203#issuecomment-590636739 ## CI report: * 66f60932087c6c77fff749c6670f64951db3a40f Travis: [FAILURE](https://travis-ci.com/flink-ci/flink/builds/150406289) * cad308b6fe40defa5b6b33a74aeb8bc6e99a6759 UNKNOWN
[jira] [Updated] (FLINK-14807) Add Table#collect api for fetching data to client
[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caizhi Weng updated FLINK-14807: Attachment: (was: table-collect.png)
[jira] [Updated] (FLINK-14807) Add Table#collect api for fetching data to client
[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caizhi Weng updated FLINK-14807: Attachment: table-collect.png
[jira] [Commented] (FLINK-14807) Add Table#collect api for fetching data to client
[ https://issues.apache.org/jira/browse/FLINK-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045113#comment-17045113 ] Jiangjie Qin commented on FLINK-14807: -- We should probably first agree on the right solution to this problem and then think about the path to achieve it. I believe the Operator Coordinator would be the right way to go, for the following reasons: # The Operator Coordinator was introduced to provide a bi-directional communication mechanism between JM and TM, so it is suitable in this case. And in the new Sink design, we will have something like {{SinkCoordinator}} and {{SinkWriter}}, just like what we have in the Source. # Personally I prefer to have the socket server brought up in the Sink Coordinator and then broadcast the port info to the TMs. This simplifies the architecture and can remove the parallelism restriction on the Sink. While the SinkCoordinator can solve the communication between JM and TM in both the control plane (via operator events) and the data plane (via a socket server in the coordinator), the communication problem between the clients and the JM still exists, i.e. how to send the records back from the JM to the client. It is true that we can introduce a REST API in the JM and let the clients fetch records through it, but it would be worth thinking about whether we can have a more generic solution that may potentially help in other cases as well. I think it might be useful to establish a way for the clients to get information from the Operator Coordinators. For example, the REST API in the JM could allow the client to query the information of a particular coordinator, and the coordinator could return its socket server port back to the client. It is a more flexible and more generic solution, as in the future people can have other communication with the coordinators without changing any code in the JM. I also have a question regarding failover. 
When the JM fails over, how can the clients connect to the JM after the failover?
[GitHub] [flink] flinkbot commented on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
flinkbot commented on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591220639 ## CI report: * 3b78c9b691fc40bd24f50103b704bf9e1c6396ea UNKNOWN
[GitHub] [flink] dianfu commented on issue #10995: [FLINK-15847][ml] Include flink-ml-api and flink-ml-lib in opt
dianfu commented on issue #10995: [FLINK-15847][ml] Include flink-ml-api and flink-ml-lib in opt URL: https://github.com/apache/flink/pull/10995#issuecomment-591220343 @becketqin Would you like to take a final look at this PR?
[GitHub] [flink] flinkbot commented on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function
flinkbot commented on issue #11217: [FLINK-16283][table-planner] Fix potential NullPointerException when invoking close() on generated function URL: https://github.com/apache/flink/pull/11217#issuecomment-591216599 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 3b78c9b691fc40bd24f50103b704bf9e1c6396ea (Wed Feb 26 03:14:15 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-16283) NullPointerException in GroupAggProcessFunction.close()
[ https://issues.apache.org/jira/browse/FLINK-16283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-16283: --- Labels: pull-request-available test-stability (was: test-stability) > NullPointerException in GroupAggProcessFunction.close() > --- > > Key: FLINK-16283 > URL: https://issues.apache.org/jira/browse/FLINK-16283 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.11.0 >Reporter: Robert Metzger >Assignee: Jark Wu >Priority: Major > Labels: pull-request-available, test-stability > > CI run: > https://dev.azure.com/rmetzger/Flink/_build/results?buildId=5586=logs=b1623ac9-0979-5b0d-2e5e-1377d695c991=48867695-c47f-5af3-2f21-7845611247b9 > {code} > java.lang.NullPointerException: null > at > org.apache.flink.table.runtime.aggregate.GroupAggProcessFunction.close(GroupAggProcessFunction.scala:182) > ~[flink-table_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:43) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:117) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:627) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:565) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:483) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:717) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:541) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at java.lang.Thread.run(Thread.java:748) 
[?:1.8.0_242] > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
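The stack trace above shows `close()` throwing a NullPointerException while the task is being cleaned up, which typically happens when disposal runs before `open()` ever initialized the generated function. Below is a minimal, self-contained sketch of the defensive null-guard pattern such a fix usually applies. All class and field names (`SafeCloseExample`, `GeneratedFunction`, `function`) are hypothetical placeholders for illustration, not Flink's actual code:

```java
// Hypothetical sketch: guard close() against a function field that was
// never initialized because open() did not run (e.g. the task failed
// during initialization and went straight to cleanup/dispose).
public class SafeCloseExample {

    // Stand-in for a code-generated function that is created in open().
    static class GeneratedFunction {
        boolean closed = false;

        void close() {
            closed = true;
        }
    }

    // Set in open(); may still be null when close() is invoked.
    GeneratedFunction function;

    public void open() {
        function = new GeneratedFunction();
    }

    public void close() {
        // Defensive guard: without this null check, calling close()
        // before open() would throw a NullPointerException, as in the
        // stack trace above.
        if (function != null) {
            function.close();
        }
    }

    public static void main(String[] args) {
        SafeCloseExample op = new SafeCloseExample();
        op.close(); // safe even though open() was never called
        op.open();
        op.close(); // now delegates to the generated function
        System.out.println("closed=" + op.function.closed);
    }
}
```

Running `main` exercises both paths: close-before-open (no exception thanks to the guard) and the normal open/close lifecycle.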