[jira] [Comment Edited] (FLINK-18068) Job scheduling stops but not exits after throwing non-fatal exception
[ https://issues.apache.org/jira/browse/FLINK-18068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124685#comment-17124685 ] Till Rohrmann edited comment on FLINK-18068 at 6/3/20, 7:31 AM: I think the problem is rather in {{AkkaRpcActor}} at various places. There we catch {{Throwables}} and only propagate them in case of fatal errors or OOMs. I think it would be better to handle them consistently and let framework unchecked exceptions bubble up. One thing which makes it a bit harder is that user-code {{Throwables}} should not cause the system to fail. I think the proper solution would be that by default we propagate all thrown exceptions. At places where user-code exceptions can occur and are tolerable, the higher-level code should catch and handle them properly. The thing we need to make sure of is that we don't use unchecked exceptions for the normal control flow. Put differently, whenever an exception reaches the {{AkkaRpcActor}} level, it must be unrecoverable. was (Author: till.rohrmann): I think the problem is rather in {{AkkaRpcActor}} at various places. There we catch {{Throwables}} and only propagate them in case of fatal errors or OOMs. I think it would be better to handle them consistently and let framework unchecked exceptions bubble up. One thing which makes it a bit harder is that user-code {{Throwables}} should not cause the system to fail. I think the proper solution would be that by default we propagate all thrown exceptions. At places where user-code exceptions can occur and are tolerable, the higher-level code should catch and handle them properly. The problem I see with this approach is that at some places we use exceptions for normal control flow (e.g. when requesting a job which does not exist, we throw a {{FlinkJobNotFoundException}}). I think this is a mistake and we should instead return a more meaningful value (e.g. {{Optional}}).
But this is a bigger effort to change.

> Job scheduling stops but not exits after throwing non-fatal exception
> ----------------------------------------------------------------------
>
> Key: FLINK-18068
> URL: https://issues.apache.org/jira/browse/FLINK-18068
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / YARN
> Affects Versions: 1.10.1
> Reporter: Jiayi Liao
> Priority: Major
>
> The batch job will stop but still be alive, doing nothing, for a long time (maybe forever?) if any non-fatal exception is thrown while interacting with YARN. Here is an example:
> {code:java}
> java.lang.IllegalStateException: The RMClient's and YarnResourceManagers internal state about the number of pending container requests for resource has diverged. Number client's pending container requests 40 != Number RM's pending container requests 0.
> 	at org.apache.flink.util.Preconditions.checkState(Preconditions.java:217) ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT]
> 	at org.apache.flink.yarn.YarnResourceManager.getPendingRequestsAndCheckConsistency(YarnResourceManager.java:518) ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT]
> 	at org.apache.flink.yarn.YarnResourceManager.onContainersOfResourceAllocated(YarnResourceManager.java:431) ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT]
> 	at org.apache.flink.yarn.YarnResourceManager.lambda$onContainersAllocated$1(YarnResourceManager.java:395) ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT]
> 	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:402) ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT]
> 	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:195) ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT]
> 	at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74) ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT]
> 	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152) ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT]
> 	at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT]
> 	at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT]
> 	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) [flink-ljy-1.0.jar:?]
> 	at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT]
> 	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) [flink-ljy-1.0.jar:?]
> 	at ...
> {code}
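Till's suggested direction (meaningful return values instead of control-flow exceptions) can be illustrated with a small self-contained Java sketch. Note that `JobStore`, `addJob`, and `requestJob` are hypothetical names used only for illustration, not actual Flink classes:

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch (not Flink code): a job lookup that signals
// "job not found" via a return value instead of an exception, so that
// any exception which does reach the RPC-actor level can be treated
// as unrecoverable.
class JobStore {
    private final Map<String, String> jobsById = new ConcurrentHashMap<>();

    void addJob(String jobId, String jobName) {
        jobsById.put(jobId, jobName);
    }

    // Exception-based style: a missing job is signalled via an exception,
    // i.e. an exception is used for normal control flow.
    String requestJobOrThrow(String jobId) throws Exception {
        String job = jobsById.get(jobId);
        if (job == null) {
            throw new Exception("Job " + jobId + " not found");
        }
        return job;
    }

    // Return-value style: "not found" is an ordinary, expected result.
    Optional<String> requestJob(String jobId) {
        return Optional.ofNullable(jobsById.get(jobId));
    }

    public static void main(String[] args) {
        JobStore store = new JobStore();
        store.addJob("j1", "WordCount");
        System.out.println(store.requestJob("j1").orElse("<none>"));      // prints WordCount
        System.out.println(store.requestJob("missing").orElse("<none>")); // prints <none>
    }
}
```

With the return-value style, callers are forced to handle the "not found" case explicitly, and the RPC layer no longer needs to guess which exceptions are benign.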
[GitHub] [flink] leonardBang commented on pull request #12436: [FLINK-17847][table sql / planner] ArrayIndexOutOfBoundsException happens in StreamExecCalc operator
leonardBang commented on pull request #12436: URL: https://github.com/apache/flink/pull/12436#issuecomment-638018884 I updated the PR @libenchao @wuchong, could you take a look? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-18085) Manually test application mode for standalone, Yarn, K8s
[ https://issues.apache.org/jira/browse/FLINK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang updated FLINK-18085: -- Fix Version/s: 1.11.0 > Manually test application mode for standalone, Yarn, K8s > > > Key: FLINK-18085 > URL: https://issues.apache.org/jira/browse/FLINK-18085 > Project: Flink > Issue Type: Sub-task > Reporter: Yang Wang > Priority: Critical > Fix For: 1.11.0 > > > We have introduced the application mode in 1.11. It needs to be tested in a real cluster with the following functionality checks: > * Application starts and finishes normally > * Application fails exceptionally > * With HA configured, kill the jobmanager; the job should recover from the latest checkpoint > > For Yarn deployment, we also need to verify that it works with provided lib and remote user jars. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on pull request #12456: [FLINK-17113][sql-cli] Refactor view support in SQL Client
flinkbot commented on pull request #12456: URL: https://github.com/apache/flink/pull/12456#issuecomment-638018167 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit b5b8172d4ccddf197a084d38e6dcef13ee27da8e (Wed Jun 03 07:36:46 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The bot is tracking the review progress through labels, which are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands: the @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-17113) Refactor view support in SQL Client
[ https://issues.apache.org/jira/browse/FLINK-17113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-17113: --- Labels: pull-request-available (was: ) > Refactor view support in SQL Client > --- > > Key: FLINK-17113 > URL: https://issues.apache.org/jira/browse/FLINK-17113 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Client >Affects Versions: 1.10.0 >Reporter: Zhenghua Gao >Assignee: Danny Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.11.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #11397: [FLINK-16217] [sql-client] catch SqlExecutionException for all callXX methods
flinkbot edited a comment on pull request #11397: URL: https://github.com/apache/flink/pull/11397#issuecomment-598575354 ## CI report: * 209b426a5818e8a230bb63499a921398170281fc Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2606) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] leonardBang commented on a change in pull request #12436: [FLINK-17847][table sql / planner] ArrayIndexOutOfBoundsException happens in StreamExecCalc operator
leonardBang commented on a change in pull request #12436: URL: https://github.com/apache/flink/pull/12436#discussion_r434381251

## File path: flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table/planner/codegen/calls/ScalarOperatorGens.scala

## @@ -1643,19 +1648,28 @@ object ScalarOperatorGens {
 val resultTypeTerm = primitiveTypeTermForType(componentInfo)
 val defaultTerm = primitiveDefaultValue(componentInfo)
+index.literalValue match {
+  case Some(v: Int) if v < 1 =>
+    throw new ValidationException(
+      s"Array element access needs an index starting at 1 but was $v.")
+  case _ => // nothing
+}
 val idxStr = s"${index.resultTerm} - 1"
 val arrayIsNull = s"${array.resultTerm}.isNullAt($idxStr)"
 val arrayGet = rowFieldReadAccess(ctx, idxStr, array.resultTerm, componentInfo)
 val arrayAccessCode =
-  s"""
-    |${array.code}
-    |${index.code}
-    |boolean $nullTerm = ${array.nullTerm} || ${index.nullTerm} || $arrayIsNull;
-    |$resultTypeTerm $resultTerm = $nullTerm ? $defaultTerm : $arrayGet;
-    |""".stripMargin
+  s"""

Review comment: the generated code will be too long. If not, I'm OK to keep them in shorter lines. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zhuzhurk commented on a change in pull request #12321: Add document for writing Avro files with StreamingFileSink
zhuzhurk commented on a change in pull request #12321: URL: https://github.com/apache/flink/pull/12321#discussion_r434380806 ## File path: docs/dev/connectors/streamfile_sink.md ## @@ -204,6 +205,65 @@ input.addSink(sink) + Avro format + +Flink also provides built-in support for writing data into Avro files. A list of convenience methods to create +Avro writer factories and their associated documentation can be found in the +[AvroWriters]({{ site.javadocs_baseurl }}/api/java/org/apache/flink/formats/avro/AvroWriters.html) class. + +For creating customized Avro writers like enabling compression, users need to create the `AvroWriterFactory` +with a custom implementation of the [AvroBuilder]({{ site.javadocs_baseurl }}/api/java/org/apache/flink/formats/avro/AvroBuilder.html) interface. Review comment: Would it make sense to add an example on creating customized Avro writers? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
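Following up on the review question, a customized Avro writer with compression enabled might look roughly like the sketch below. This assumes the `AvroWriterFactory`/`AvroBuilder` API described in the documentation text above; `Address` is a hypothetical reflect-mapped POJO, and the snippet is not taken from the PR itself:

```java
import org.apache.avro.Schema;
import org.apache.avro.file.CodecFactory;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.reflect.ReflectData;
import org.apache.avro.reflect.ReflectDatumWriter;
import org.apache.flink.formats.avro.AvroBuilder;
import org.apache.flink.formats.avro.AvroWriterFactory;

public class CompressedAvroWriters {

    // Sketch: build an AvroWriterFactory that enables Snappy compression
    // by constructing the underlying Avro DataFileWriter manually.
    // Address is a hypothetical POJO mapped via Avro reflection.
    public static AvroWriterFactory<Address> snappyFactory() {
        return new AvroWriterFactory<>((AvroBuilder<Address>) out -> {
            Schema schema = ReflectData.get().getSchema(Address.class);
            DatumWriter<Address> datumWriter = new ReflectDatumWriter<>(schema);
            DataFileWriter<Address> dataFileWriter = new DataFileWriter<>(datumWriter);
            dataFileWriter.setCodec(CodecFactory.snappyCodec()); // must be set before create()
            dataFileWriter.create(schema, out);
            return dataFileWriter;
        });
    }
}
```

The resulting factory can then be passed to `StreamingFileSink.forBulkFormat(...)` like the built-in `AvroWriters` factories.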
[jira] [Assigned] (FLINK-18026) E2E tests manually for new SQL connectors and formats
[ https://issues.apache.org/jira/browse/FLINK-18026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Young reassigned FLINK-18026: -- Assignee: Shengkai Fang > E2E tests manually for new SQL connectors and formats > - > > Key: FLINK-18026 > URL: https://issues.apache.org/jira/browse/FLINK-18026 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Kafka, Tests >Affects Versions: 1.11.0 >Reporter: Danny Chen >Assignee: Shengkai Fang >Priority: Blocker > Fix For: 1.11.0 > > > Use the SQL-CLI to test all kinds of new formats with the new Kafka source. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-18028) E2E tests manually for Kafka 2 all kinds of other connectors
[ https://issues.apache.org/jira/browse/FLINK-18028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Young reassigned FLINK-18028: -- Assignee: Shengkai Fang > E2E tests manually for Kafka 2 all kinds of other connectors > > > Key: FLINK-18028 > URL: https://issues.apache.org/jira/browse/FLINK-18028 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Kafka, Tests >Affects Versions: 1.11.0 >Reporter: Danny Chen >Assignee: Shengkai Fang >Priority: Blocker > Fix For: 1.11.0 > > > - test Kafka 2 MySQL > - test Kafka 2 ES > - test Kafka ES temporal join > - test Kafka MySQL temporal join > - test Kafka Hbase temporal join -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-18033) Improve e2e test execution time
[ https://issues.apache.org/jira/browse/FLINK-18033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124737#comment-17124737 ] Robert Metzger edited comment on FLINK-18033 at 6/3/20, 8:13 AM: - Thanks a lot for opening this ticket. I agree that the e2e tests are running for quite a while, and I'm pretty sure that there's a lot of room for improvement in the scripts. I see two ways forward (which are orthogonal to each other):

A) Investigate the slowest e2e tests and potential inefficiencies in the common scripts (cluster startup and shutdown etc.). The goal should not be to optimize the last few seconds, but rather the cases where we are wasting minutes of time waiting.

B) Improve the tooling for executing the e2e tests by adding an "e2e test scheduler". We have the following CI environments:
1. Flink: servers providing an environment to run tests on Docker; VMs from Azure to run tests on 'bare metal'
2. personal Azure accounts: only VMs from Azure

At the moment, we cannot execute the e2e tests requiring Docker or Kubernetes in the Docker environment (this needs more research). In the past, we've manually split the e2e tests into separate scripts. This has led to "code duplication" and a perceived complexity in the e2e tests. Given these constraints, I imagine the following design of an "e2e test scheduler".

1. Registration of tests. Tests can have the following properties:
- bare_metal = require a bare-metal environment (Azure VM)
- light = have a resource footprint below 1GB of memory and low CPU usage (< 10%)
- heavy = have a resource footprint up to 8GB of memory and high CPU usage

{code:bash}
register_test properties=bare_metal test_docker_embedded_job.sh dummy-fs
register_test properties=bare_metal test_kubernetes_embedded_job.sh
register_test properties=light test_quickstarts.sh java
register_test properties=heavy test_tpch.sh
{code}

(Note: we could determine the lightness/heaviness of tests using a tool like https://github.com/astrofrog/psrecord)

2. Execution. In each environment, we run tests through a call like:

{code:bash}
execute_test properties=bare_metal           # on the Azure VMs
execute_test properties=light parallelism=8  # on the dockerized environment for Flink, otherwise on an Azure VM. Run 8 tests concurrently
execute_test properties=heavy parallelism=1  # on the dockerized environment for Flink, otherwise on an Azure VM. Run 1 test at a time
{code}

For now, the Java e2e tests are executed outside of the scheduler.

was (Author: rmetzger): Thanks a lot for opening this ticket. I agree that the e2e tests are running for quite a while, and I'm pretty sure that there's a lot of room for improvement in the scripts. I see two ways forward (which are orthogonal to each other) A) Investigate the slowest e2e tests and potential inefficiencies in the common scripts (cluster startup and shutdown etc.). The goal should not be to optimize the last few seconds, but rather the cases where we are wasting minutes of time waiting. B) Improve the tooling to execute the e2e tests by adding a "e2e test scheduler" We have the following CI environments: 1. Flink: Servers providing an environment to run tests on Docker; VMs from Azure to run tests on 'bare metal' 2.
personal Azure accounts: Only VMs from Azure At the moment, we can not execute the e2e tests requiring Docker or Kubernetes in the Docker environment (need more research into that) In the past, we've manually split the e2e tests into separate scripts. This has lead to "code duplication" and a perceived complexity in the e2e tests. Given these constraints, I imagine the following design of a "e2e test scheduler". 1. Registration of tests: Tests can have the following properties: - bare_metal = require a bare metal environment (azure vm) - light = have a resource footprint below 1GB of memory and low CPU usage (< 10%) - heavy = have a resource footprint up to 8GB of memory and high CPU usage {code:bash} register_test properties=bare_metal test_docker_embedded_job.sh dummy-fs register_test properties=bare_metal test_kubernetes_embedded_job.sh register_test properties=light test_quickstarts.sh java register_test properties=heavy test_tpch.sh {code} (Note: we could determine the light / heavyness of tests using a tool like: https://github.com/astrofrog/psrecord) 2. Execution: In each environment, we run tests through a call like: {code:bash} execute_test properties=bare_metal# on the azure VMs execute_test properties=light paralleism=8# on the dockerized environment for Flink, otherwise on Azure VM. Run 8 tests concurrently execute_test properties=heavy paralleism=1 # on the dockerized environment for Flink, otherwise on Azure VM. Run 8 tests concurrently {code} For now, the Java e2e tests are executed outside of the scheduler. > Improve e2e test execution time > --- > >
[GitHub] [flink] rmetzger opened a new pull request #12458: [FLINK-17404] Make sure netty 3.10.6 is used in flink-runtime
rmetzger opened a new pull request #12458: URL: https://github.com/apache/flink/pull/12458 Due to the recent changes in https://issues.apache.org/jira/browse/FLINK-11086 ("Add support for Hadoop 3"), the shaded netty dependency in flink-runtime changed depending on the Hadoop dependency version. Concretely, the Netty version of flink-runtime depends on the Hadoop version you are compiling Flink with:
- our akka expects netty 3.10.6
- with Hadoop 2.4.1 and Hadoop 2.8.3, flink-runtime shades netty 3.6.2 (the e2e test passes)
- with Hadoop 3.1.3, netty is at 3.10.5 (the e2e test fails reliably)

We add Netty 3.10.6 as a dependency to flink-runtime to make sure it is not overridden by any other dependencies. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
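A sketch of how such a version pin might look in `flink-runtime/pom.xml` (hypothetical snippet, not the actual diff of #12458; Netty 3's Maven coordinates are `io.netty:netty`):

```xml
<!-- Pin netty 3 explicitly so the shaded version is not dictated by
     whichever Hadoop version Flink is compiled against.
     Sketch only; not the actual change in the PR. -->
<dependency>
    <groupId>io.netty</groupId>
    <artifactId>netty</artifactId>
    <version>3.10.6.Final</version>
</dependency>
```

Declaring the dependency directly in the module gives it precedence over versions pulled in transitively through the Hadoop dependency tree.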
[jira] [Updated] (FLINK-16175) Add config option to switch case sensitive for column names in SQL
[ https://issues.apache.org/jira/browse/FLINK-16175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Young updated FLINK-16175: --- Fix Version/s: (was: 1.11.0) 1.12.0 > Add config option to switch case sensitive for column names in SQL > -- > > Key: FLINK-16175 > URL: https://issues.apache.org/jira/browse/FLINK-16175 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API, Table SQL / Planner > Affects Versions: 1.11.0 > Reporter: Leonard Xu > Assignee: Leonard Xu > Priority: Major > Labels: pull-request-available, usability > Fix For: 1.12.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Flink SQL is case-sensitive by default and has no option to configure this. This issue aims to support a config option so that users can set case sensitivity for their SQL. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17948) Sql client lost precision for Timestamp and Decimal Data type
[ https://issues.apache.org/jira/browse/FLINK-17948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Young updated FLINK-17948: --- Fix Version/s: (was: 1.11.0)

> Sql client lost precision for Timestamp and Decimal Data type
> -------------------------------------------------------------
>
> Key: FLINK-17948
> URL: https://issues.apache.org/jira/browse/FLINK-17948
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Client
> Affects Versions: 1.11.0
> Environment: mysql:
>   image: mysql:8.0
>   volumes:
>     - ./mysql/mktable.sql:/docker-entrypoint-initdb.d/mktable.sql
>   environment:
>     MYSQL_ROOT_PASSWORD: 123456
>   ports:
>     - "3306:3306"
> Reporter: Shengkai Fang
> Assignee: godfrey he
> Priority: Major
> Attachments: image-2020-05-26-22-56-43-835.png, image-2020-05-26-22-58-02-326.png
>
> My job is the following:
> {code:java}
> CREATE TABLE currency (
>   currency_id BIGINT,
>   currency_name STRING,
>   rate DOUBLE,
>   currency_timestamp TIMESTAMP,
>   country STRING,
>   precise_timestamp TIMESTAMP(6),
>   precise_time TIME(6),
>   gdp DECIMAL(10, 6)
> ) WITH (
>   'connector' = 'jdbc',
>   'url' = 'jdbc:mysql://localhost:3306/flink',
>   'username' = 'root',
>   'password' = '123456',
>   'table-name' = 'currency',
>   'driver' = 'com.mysql.jdbc.Driver',
>   'lookup.cache.max-rows' = '500',
>   'lookup.cache.ttl' = '10s',
>   'lookup.max-retries' = '3')
> {code}
> When running "select * from currency", the precision of the results is not as expected: the precision of the field precise_timestamp is 3, not 6, and the field gdp has more digits than expected.
> !image-2020-05-26-22-56-43-835.png!
> The data in mysql is the following:
> !image-2020-05-26-22-58-02-326.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17895) Default value of rows-per-second in datagen can be limited
[ https://issues.apache.org/jira/browse/FLINK-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Young updated FLINK-17895: --- Fix Version/s: (was: 1.11.0) > Default value of rows-per-second in datagen can be limited > -- > > Key: FLINK-17895 > URL: https://issues.apache.org/jira/browse/FLINK-17895 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Reporter: Jingsong Lee >Priority: Minor > Labels: pull-request-available, starter > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-17959) Exception: "CANCELLED: call already cancelled" is thrown when run python udf
[ https://issues.apache.org/jira/browse/FLINK-17959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu reassigned FLINK-17959: --- Assignee: Dian Fu

> Exception: "CANCELLED: call already cancelled" is thrown when run python udf
> ----------------------------------------------------------------------------
>
> Key: FLINK-17959
> URL: https://issues.apache.org/jira/browse/FLINK-17959
> Project: Flink
> Issue Type: Bug
> Components: API / Python
> Affects Versions: 1.10.1, 1.11.0
> Reporter: Hequn Cheng
> Assignee: Dian Fu
> Priority: Major
>
> The exception is thrown when running a Python UDF:
> {code:java}
> May 27, 2020 3:20:49 PM org.apache.beam.vendor.grpc.v1p21p0.io.grpc.internal.SerializingExecutor run
> SEVERE: Exception while executing runnable org.apache.beam.vendor.grpc.v1p21p0.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1Closed@3960b30e
> org.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException: CANCELLED: call already cancelled
> 	at org.apache.beam.vendor.grpc.v1p21p0.io.grpc.Status.asRuntimeException(Status.java:524)
> 	at org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.ServerCalls$ServerCallStreamObserverImpl.onCompleted(ServerCalls.java:366)
> 	at org.apache.beam.runners.fnexecution.state.GrpcStateService$Inbound.onError(GrpcStateService.java:145)
> 	at org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onCancel(ServerCalls.java:270)
> 	at org.apache.beam.vendor.grpc.v1p21p0.io.grpc.PartialForwardingServerCallListener.onCancel(PartialForwardingServerCallListener.java:40)
> 	at org.apache.beam.vendor.grpc.v1p21p0.io.grpc.ForwardingServerCallListener.onCancel(ForwardingServerCallListener.java:23)
> 	at org.apache.beam.vendor.grpc.v1p21p0.io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onCancel(ForwardingServerCallListener.java:40)
> 	at org.apache.beam.vendor.grpc.v1p21p0.io.grpc.Contexts$ContextualizedServerCallListener.onCancel(Contexts.java:96)
> 	at org.apache.beam.vendor.grpc.v1p21p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.closed(ServerCallImpl.java:337)
> 	at org.apache.beam.vendor.grpc.v1p21p0.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1Closed.runInContext(ServerImpl.java:793)
> 	at org.apache.beam.vendor.grpc.v1p21p0.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
> 	at org.apache.beam.vendor.grpc.v1p21p0.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> {code}
> The job can output the right results, but it seems something goes wrong during the shutdown procedure. You can reproduce the exception with the following code (note: the exception happens occasionally):
> {code}
> from pyflink.datastream import StreamExecutionEnvironment
> from pyflink.table import StreamTableEnvironment, DataTypes
> from pyflink.table.descriptors import Schema, OldCsv, FileSystem
> from pyflink.table.udf import udf
>
> env = StreamExecutionEnvironment.get_execution_environment()
> env.set_parallelism(1)
> t_env = StreamTableEnvironment.create(env)
>
> add = udf(lambda i, j: i + j, [DataTypes.BIGINT(), DataTypes.BIGINT()], DataTypes.BIGINT())
> t_env.register_function("add", add)
>
> t_env.connect(FileSystem().path('/tmp/input')) \
>     .with_format(OldCsv()
>                  .field('a', DataTypes.BIGINT())
>                  .field('b', DataTypes.BIGINT())) \
>     .with_schema(Schema()
>                  .field('a', DataTypes.BIGINT())
>                  .field('b', DataTypes.BIGINT())) \
>     .create_temporary_table('mySource')
>
> t_env.connect(FileSystem().path('/tmp/output')) \
>     .with_format(OldCsv()
>                  .field('sum', DataTypes.BIGINT())) \
>     .with_schema(Schema()
>                  .field('sum', DataTypes.BIGINT())) \
>     .create_temporary_table('mySink')
>
> t_env.from_path('mySource') \
>     .select("add(a, b)") \
>     .insert_into('mySink')
>
> t_env.execute("tutorial_job")
> {code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-18061) TableResult#collect should return closeable iterator to avoid resource leak
[ https://issues.apache.org/jira/browse/FLINK-18061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Young reassigned FLINK-18061: -- Assignee: godfrey he > TableResult#collect should return closeable iterator to avoid resource leak > --- > > Key: FLINK-18061 > URL: https://issues.apache.org/jira/browse/FLINK-18061 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Reporter: godfrey he >Assignee: godfrey he >Priority: Blocker > Labels: pull-request-available > Fix For: 1.11.0 > > > as discussed in ML: > http://mail-archives.apache.org/mod_mbox/flink-dev/202005.mbox/%3cd4ee47e1-0214-aa2f-f5ac-c9daf708e...@apache.org%3e, > we should return a closeable iterator for TableResult#collect method *to > avoid resource leak*. The suggested change is: > {code:java} > public interface TableResult { > CloseableIterator collect(); > } > {code} > we use existing {{org.apache.flink.util.CloseableIterator}} instead of > additional api class, and {{CollectResultIterator}} can also inherit from > {{CloseableIterator}}. > This change does not break current api. -- This message was sent by Atlassian Jira (v8.3.4#803005)
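The benefit of the proposed signature is that callers can release resources with try-with-resources. A minimal self-contained sketch of the pattern (mock classes for illustration; not Flink's actual `TableResult` or `CloseableIterator`, whose `close()` declares `throws Exception`):

```java
import java.util.Iterator;
import java.util.List;

class CollectExample {

    // Minimal stand-in for org.apache.flink.util.CloseableIterator:
    // an Iterator that is also AutoCloseable (close() is narrowed here
    // to not throw checked exceptions, for brevity).
    interface CloseableIterator<T> extends Iterator<T>, AutoCloseable {
        @Override
        void close();
    }

    static boolean closed = false;

    // Mimics the proposed TableResult#collect(): the returned iterator
    // owns resources (e.g. the running job's result fetcher) that are
    // released in close().
    static CloseableIterator<String> collect() {
        final Iterator<String> rows = List.of("row1", "row2").iterator();
        return new CloseableIterator<String>() {
            @Override public boolean hasNext() { return rows.hasNext(); }
            @Override public String next() { return rows.next(); }
            @Override public void close() { closed = true; }
        };
    }

    public static void main(String[] args) {
        // try-with-resources guarantees close() runs even if iteration throws,
        // which is exactly how the leak described in the ticket is avoided.
        try (CloseableIterator<String> it = collect()) {
            while (it.hasNext()) {
                System.out.println(it.next()); // prints row1, then row2
            }
        }
    }
}
```

Because the existing `org.apache.flink.util.CloseableIterator` already extends both `Iterator` and `AutoCloseable`, returning it from `collect()` adds this capability without breaking existing callers that only iterate.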
[GitHub] [flink] zhijiangW commented on pull request #12461: [FLINK-17994][checkpointing] Fix the race condition between CheckpointBarrierUnaligner#processBarrier and #notifyBarrierReceived
zhijiangW commented on pull request #12461: URL: https://github.com/apache/flink/pull/12461#issuecomment-638089208 Just cherry-picked this fix to master from release-1.11; it was already approved in #12406. I will merge it after Azure passes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-15467) Should wait for the end of the source thread during the Task cancellation
[ https://issues.apache.org/jira/browse/FLINK-15467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124822#comment-17124822 ] Piotr Nowojski commented on FLINK-15467: Thanks for coming back and providing a way to reproduce the problem. I haven't run your code, but it looks like you are right. I'm not entirely sure how to fix this off the top of my head; someone would have to investigate this further.

> Should wait for the end of the source thread during the Task cancellation
> --------------------------------------------------------------------------
>
> Key: FLINK-15467
> URL: https://issues.apache.org/jira/browse/FLINK-15467
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Task
> Affects Versions: 1.9.0, 1.9.1, 1.10.1
> Reporter: ming li
> Priority: Critical
> Fix For: 1.12.0
>
> In the new mailbox model, SourceStreamTask starts a source thread to run the user methods, and the current execution thread blocks on mailbox.takeMail(). When a task is cancelled, the TaskCanceler thread cancels the task and interrupts the execution thread. Therefore, the execution thread of SourceStreamTask will throw an InterruptedException, cancel the task again, and throw an exception.
> {code:java}
> @Override
> protected void performDefaultAction(ActionContext context) throws Exception {
>    // Against the usual contract of this method, this implementation is not step-wise but blocking instead for
>    // compatibility reasons with the current source interface (source functions run as a loop, not in steps).
>    sourceThread.start();
>
>    // We run an alternative mailbox loop that does not involve default actions and synchronizes around actions.
>    try {
>       runAlternativeMailboxLoop();
>    } catch (Exception mailboxEx) {
>       // We cancel the source function if some runtime exception escaped the mailbox.
>       if (!isCanceled()) {
>          cancelTask();
>       }
>       throw mailboxEx;
>    }
>
>    sourceThread.join();
>    if (!isFinished) {
>       sourceThread.checkThrowSourceExecutionException();
>    }
>
>    context.allActionsCompleted();
> }
> {code}
> When all tasks of this TaskExecutor are cancelled, the blob file will be cleaned up. But the real source thread has not finished at this time, which will cause a ClassNotFoundException when loading a new class. In this case, the source thread may not be able to properly clean up and release resources (such as closing child threads, cleaning up local files, etc.). Therefore, I think we should mark this task as cancelled or finished only after the execution of the source thread has completed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] wuchong closed pull request #12386: [FLINK-17995][docs][connectors] Redesign Table & SQL Connectors page and add HBase connector documentation
wuchong closed pull request #12386: URL: https://github.com/apache/flink/pull/12386 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-web] MarkSfik commented on a change in pull request #344: [blog] flink on zeppelin - part2
MarkSfik commented on a change in pull request #344: URL: https://github.com/apache/flink-web/pull/344#discussion_r434475743 ## File path: _posts/2020-05-25-flink-on-zeppelin-part2.md ## @@ -0,0 +1,107 @@ +--- +layout: post +title: "Flink on Zeppelin Notebooks for Interactive Data Analysis - Part 2" +date: 2020-05-25T08:00:00.000Z +categories: ecosystem +authors: +- zjffdu: + name: "Jeff Zhang" + twitter: "zjffdu" +--- + +In the last post, I introduce the basics of Flink on Zeppelin and how to do Streaming ETL. This is part-2 where I would talk about how to +do streaming data visualization via Flink on Zeppelin and how to use flink UDF in Zeppelin. + +# Streaming Data Visualization + +In Zeppelin, you can build a realtime streaming dashboard without writing any line of javascript/html/css code. +Overall Zeppelin supports 3 kinds of streaming data analytics: +* Single +* Update +* Append + +### Single Mode +Single mode is for the case when the result of sql statement is always one row, such as the following example. +The output format is HTML, and you can specify paragraph local property template for the final output content template. +And you can use {i} as placeholder for the ith column of result. + + + + + +### Update Mode +Update mode is suitable for the case when the output is more than one rows, Review comment: ```suggestion Update mode is suitable for the cases when the output format is more than one row, ```
[GitHub] [flink-web] MarkSfik commented on a change in pull request #344: [blog] flink on zeppelin - part2
MarkSfik commented on a change in pull request #344: URL: https://github.com/apache/flink-web/pull/344#discussion_r434475238 ## File path: _posts/2020-05-25-flink-on-zeppelin-part2.md ## @@ -0,0 +1,107 @@ +--- +layout: post +title: "Flink on Zeppelin Notebooks for Interactive Data Analysis - Part 2" +date: 2020-05-25T08:00:00.000Z +categories: ecosystem +authors: +- zjffdu: + name: "Jeff Zhang" + twitter: "zjffdu" +--- + +In the last post, I introduce the basics of Flink on Zeppelin and how to do Streaming ETL. This is part-2 where I would talk about how to +do streaming data visualization via Flink on Zeppelin and how to use flink UDF in Zeppelin. + +# Streaming Data Visualization + +In Zeppelin, you can build a realtime streaming dashboard without writing any line of javascript/html/css code. +Overall Zeppelin supports 3 kinds of streaming data analytics: +* Single +* Update +* Append + +### Single Mode +Single mode is for the case when the result of sql statement is always one row, such as the following example. +The output format is HTML, and you can specify paragraph local property template for the final output content template. +And you can use {i} as placeholder for the ith column of result. Review comment: I am not sure what "ith" means; maybe you want to add {i}th if you want to say 9th or 8th? It should also be "column of **the** result".
[jira] [Assigned] (FLINK-18087) Uploading user artifact for Yarn job cluster could not work
[ https://issues.apache.org/jira/browse/FLINK-18087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kostas Kloudas reassigned FLINK-18087: -- Assignee: Yang Wang > Uploading user artifact for Yarn job cluster could not work > --- > > Key: FLINK-18087 > URL: https://issues.apache.org/jira/browse/FLINK-18087 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Affects Versions: 1.11.0, 1.12.0 >Reporter: Yang Wang >Assignee: Yang Wang >Priority: Blocker > Fix For: 1.11.0, 1.12.0 > > > In FLINK-17632, we added support for remote user jars. However, uploading user > artifacts for a Yarn job cluster is broken. In the following code, > we should only upload local files. Now it has the opposite behavior. > {code:java} > // only upload local files > if (Utils.isRemotePath(entry.getValue().filePath)) { >Path localPath = new Path(entry.getValue().filePath); >Tuple2 remoteFileInfo = > fileUploader.uploadLocalFileToRemote(localPath, entry.getKey()); >jobGraph.setUserArtifactRemotePath(entry.getKey(), > remoteFileInfo.f0.toString()); > } > {code} > > Another problem is that the related test {{testPerJobModeWithDistributedCache}} > does not fail, because we do not fetch the artifact from the blob server. We > directly get it from the local file. It also needs to be enhanced.
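The fix implied by the report above is to negate the condition so that only local artifacts are uploaded. A minimal sketch, where `isRemotePath` is a naive stand-in for Flink's `Utils.isRemotePath`:

```java
// Sketch of the intended logic (simplified; isRemotePath here is only a
// placeholder for the real scheme check): only *local* files should be uploaded.
public class ArtifactUpload {
    static boolean isRemotePath(String path) {
        // naive placeholder: treat explicit remote schemes as remote
        return path.startsWith("hdfs://") || path.startsWith("s3://");
    }

    static boolean shouldUpload(String path) {
        // FLINK-18087: the buggy code uploaded when isRemotePath(path) was true;
        // the condition must be negated so local files are uploaded instead.
        return !isRemotePath(path);
    }
}
```

With this inversion, a local jar like `/tmp/job.jar` is uploaded while an `hdfs://` artifact is left where it is, matching the "only upload local files" comment.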
[GitHub] [flink] zentol commented on pull request #12447: [FLINK-18069][CI] Test if Java/Scaladocs builds are passing in the compile stage
zentol commented on pull request #12447: URL: https://github.com/apache/flink/pull/12447#issuecomment-638130993 Time-wise this seems fine; virtually no change for the javadocs, and scaladoc only takes 20 seconds. It would be nice if we could write the scaladoc warnings into a file though.
[GitHub] [flink] AHeise commented on pull request #12441: [FLINK-17918][table-blink] Fix AppendOnlyTopNFunction shouldn't mutate list value of MapState
AHeise commented on pull request #12441: URL: https://github.com/apache/flink/pull/12441#issuecomment-638137845 Okay, I investigated `org.apache.flink.table.planner.runtime.stream.sql.OverWindowITCase#testRowTimeUnBoundedPartitionedRowsOver2`. The issue in general is that the watermark comes not from the source but from `EventTimeProcessOperator`. Since it does not store the watermark, the first watermark after recovery will be emitted quite late, after the first element has been processed. Some late elements may then be processed before that and actually be emitted even though they shouldn't be (since the watermark is effectively 0). Since this is more related to the test setup than anything else, I'd recommend storing the last watermark in `EventTimeProcessOperator` during checkpointing and emitting it on recovery.
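The recommended test-side fix, remembering the last watermark in a checkpoint and re-emitting it after recovery, can be sketched as follows. `WatermarkRestoringOperator` and its hooks are hypothetical stand-ins, not the actual `EventTimeProcessOperator`:

```java
// Rough sketch (hypothetical names): the last seen watermark becomes part of
// operator state, so after recovery late elements are dropped again instead of
// slipping through while the watermark is effectively 0.
public class WatermarkRestoringOperator {
    private long currentWatermark = Long.MIN_VALUE;

    public void processWatermark(long watermark) {
        currentWatermark = Math.max(currentWatermark, watermark);
    }

    /** Called on checkpoint: the watermark is stored with the operator state. */
    public long snapshotState() {
        return currentWatermark;
    }

    /** Called on recovery: restore the watermark and re-emit it immediately. */
    public void initializeState(long restoredWatermark) {
        currentWatermark = restoredWatermark;
    }

    /** An element is late if its timestamp is at or before the watermark. */
    public boolean isLate(long elementTimestamp) {
        return elementTimestamp <= currentWatermark;
    }
}
```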
[GitHub] [flink] flinkbot edited a comment on pull request #12459: [FLINK-17959][python] Fix the 'call already cancelled' exception when executing Python UDF
flinkbot edited a comment on pull request #12459: URL: https://github.com/apache/flink/pull/12459#issuecomment-638086087 ## CI report: * 819a3d585cce9c827212ea3cd33eb05ad84ce006 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2628) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[jira] [Updated] (FLINK-17974) Test new Flink Docker image
[ https://issues.apache.org/jira/browse/FLINK-17974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-17974: -- Parent: FLINK-18088 Issue Type: Sub-task (was: Task) > Test new Flink Docker image > --- > > Key: FLINK-17974 > URL: https://issues.apache.org/jira/browse/FLINK-17974 > Project: Flink > Issue Type: Sub-task > Components: Deployment / Docker >Affects Versions: 1.11.0 >Reporter: Till Rohrmann >Priority: Critical > Labels: release-testing > Fix For: 1.11.0 > > > Test Flink's new Docker image and the corresponding Dockerfile: > * Try to build custom image > * Try to run different Flink processes (Master (session, per-job), > TaskManager) > * Try custom configuration and log properties
[jira] [Updated] (FLINK-17978) Test Hadoop dependency change
[ https://issues.apache.org/jira/browse/FLINK-17978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-17978: -- Parent: FLINK-18088 Issue Type: Sub-task (was: Task) > Test Hadoop dependency change > - > > Key: FLINK-17978 > URL: https://issues.apache.org/jira/browse/FLINK-17978 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: Till Rohrmann >Priority: Critical > Labels: release-testing > Fix For: 1.11.0 > > > Test the Hadoop dependency change: > * Run Flink with HBase/ORC (maybe add e2e test) > * Validate meaningful exception message if Hadoop dependency is missing
[jira] [Updated] (FLINK-17976) Test native K8s integration
[ https://issues.apache.org/jira/browse/FLINK-17976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-17976: -- Parent: FLINK-18088 Issue Type: Sub-task (was: Task) > Test native K8s integration > --- > > Key: FLINK-17976 > URL: https://issues.apache.org/jira/browse/FLINK-17976 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: Till Rohrmann >Priority: Critical > Labels: release-testing > Fix For: 1.11.0 > > > Test Flink's native K8s integration: > * session mode > * application mode > * custom Flink image > * custom configuration and log properties
[jira] [Updated] (FLINK-17977) Check log sanity
[ https://issues.apache.org/jira/browse/FLINK-17977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-17977: -- Parent: FLINK-18088 Issue Type: Sub-task (was: Task) > Check log sanity > > > Key: FLINK-17977 > URL: https://issues.apache.org/jira/browse/FLINK-17977 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: Till Rohrmann >Priority: Critical > Labels: release-testing > Fix For: 1.11.0 > > > Run a normal Flink workload (e.g. job with fixed number of failures on > session cluster) and check that the produced Flink logs make sense and don't > contain confusing statements.
[jira] [Updated] (FLINK-17975) Test ZooKeeper 3.5 support
[ https://issues.apache.org/jira/browse/FLINK-17975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-17975: -- Parent: FLINK-18088 Issue Type: Sub-task (was: Task) > Test ZooKeeper 3.5 support > -- > > Key: FLINK-17975 > URL: https://issues.apache.org/jira/browse/FLINK-17975 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.11.0 >Reporter: Till Rohrmann >Priority: Critical > Labels: release-testing > Fix For: 1.11.0 > > > Setup a Flink cluster with ZooKeeper 3.5 and run some HA tests on it (killing > processes, failing jobs).
[jira] [Comment Edited] (FLINK-18068) Job scheduling stops but not exits after throwing non-fatal exception
[ https://issues.apache.org/jira/browse/FLINK-18068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124685#comment-17124685 ] Till Rohrmann edited comment on FLINK-18068 at 6/3/20, 7:25 AM: I think the problem is rather in {{AkkaRpcActor}} at various places. There we catch {{Throwables}} and only propagate them in case of fatal errors or OOMs. I think it would be better to handle them consistently and let framework unchecked exceptions bubble up. One thing which makes it a bit harder is that user code {{Throwables}} should not cause the system to fail. I think the proper solution would be that by default we propagate all thrown exceptions. At places where user code exceptions can occur and are tolerable, the higher-level code should catch and handle them properly. was (Author: till.rohrmann): I think the problem is rather in {{AkkaRpcActor}} at various places. There we catch {{Throwables}} and only propagate them in case of fatal errors or OOMs. I think it would be better to handle them consistently and let framework unchecked exceptions bubble up. One thing which makes it a bit harder is that user code {{Throwables}} should not cause the system to fail. > Job scheduling stops but not exits after throwing non-fatal exception > - > > Key: FLINK-18068 > URL: https://issues.apache.org/jira/browse/FLINK-18068 > Project: Flink > Issue Type: Improvement > Components: Deployment / YARN >Affects Versions: 1.10.1 >Reporter: Jiayi Liao >Priority: Major > > The batch job will stop but still be alive, doing nothing, for a long time > (maybe forever?) if any non-fatal exception is thrown from interacting with > YARN. Here is an example: > {code:java} > java.lang.IllegalStateException: The RMClient's and YarnResourceManagers > internal state about the number of pending container requests for resource > has diverged. Number client's pending container > requests 40 != Number RM's pending container requests 0. 
> at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:217) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.yarn.YarnResourceManager.getPendingRequestsAndCheckConsistency(YarnResourceManager.java:518) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.yarn.YarnResourceManager.onContainersOfResourceAllocated(YarnResourceManager.java:431) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.yarn.YarnResourceManager.lambda$onContainersAllocated$1(YarnResourceManager.java:395) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:402) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:195) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) > [flink-ljy-1.0.jar:?] > at > akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) > [flink-ljy-1.0.jar:?] > at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) > [flink-ljy-1.0.jar:?] > at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) > [flink-ljy-1.0.jar:?] 
> at akka.actor.Actor$class.aroundReceive(Actor.scala:517) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at akka.actor.ActorCell.invoke(ActorCell.scala:561) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at
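The propagation policy described in the comment above can be sketched as follows. The classification helper and names are hypothetical, not Flink's actual `AkkaRpcActor` code:

```java
// Sketch of the proposed default (hypothetical helper): every Throwable that
// reaches the RPC layer is escalated as fatal, unless it is explicitly marked
// as a tolerable user-code exception that higher-level code should handle.
public class RpcExceptionPolicy {
    /** Marker for exceptions that higher-level code is expected to tolerate. */
    public static class UserCodeException extends RuntimeException {
        public UserCodeException(String msg) { super(msg); }
    }

    public static void handle(Throwable t) {
        if (t instanceof UserCodeException) {
            // tolerable user-code failure: log it and keep the endpoint alive
            return;
        }
        // everything else is a framework error and is escalated, instead of
        // being swallowed unless it happens to be an Error/OOM
        throw new IllegalStateException("Fatal error in RPC endpoint", t);
    }
}
```

Under this policy, any unchecked exception that reaches the RPC layer unclassified is treated as unrecoverable, which is exactly the invariant the comment asks for.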
[GitHub] [flink] wuchong commented on a change in pull request #12427: [FLINK-16681][jdbc] Fix the bug that jdbc lost connection after a lon…
wuchong commented on a change in pull request #12427: URL: https://github.com/apache/flink/pull/12427#discussion_r434371368 ## File path: flink-connectors/flink-connector-jdbc/src/main/java/org/apache/flink/connector/jdbc/internal/executor/InsertOrUpdateJdbcExecutor.java ## @@ -83,6 +88,21 @@ public void open(Connection connection) throws SQLException { updateStatement = connection.prepareStatement(updateSQL); } + @Override + public void reopen(Connection connection) throws SQLException { + try { + existStatement.close(); + insertStatement.close(); + updateStatement.close(); + } catch (SQLException e) { + LOG.info("PreparedStatement close failed.", e); + } + + existStatement = connection.prepareStatement(existSQL); + insertStatement = connection.prepareStatement(insertSQL); + updateStatement = connection.prepareStatement(updateSQL); Review comment: I think we don't need to introduce a new `reopen` interface; the implementation is always the combination of `close` and `open`. You can just call `close` and then `open`. But maybe we need to move the construction of the `batch` map into the constructor. This is also better because we can then mark `batch` as `final`. We can also rename `open` to `openConnection` and `close` to `closeConnection` to make the behavior more explicit. 
## File path: flink-connectors/flink-connector-jdbc/src/main/java/org/apache/flink/connector/jdbc/internal/JdbcBatchingOutputFormat.java ## @@ -175,6 +176,15 @@ public synchronized void flush() throws IOException { if (i >= executionOptions.getMaxRetries()) { throw new IOException(e); } + try { + if (!connection.isValid(JdbcLookupFunction.CONNECTION_CHECK_TIMEOUT)) { + connection = connectionProvider.reestablishConnection(); + jdbcStatementExecutor.reopen(connection); + } + } catch (Exception excpetion) { + LOG.error("JDBC connection is not valid, and reestablish connection failed.", excpetion); + throw new RuntimeException("Reestablish JDBC connection failed", excpetion); Review comment: Use `IOException` instead.
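The reviewer's suggestion, expressing reconnection as `closeConnection()` followed by `openConnection()` with the batch buffer built once in the constructor, can be sketched like this. The interface is a simplified hypothetical; the real code works with `java.sql.Connection` and `PreparedStatement`:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch (hypothetical, simplified): no dedicated `reopen` method; a broken
// connection is handled by closeConnection() + openConnection(), and the batch
// map is created once in the constructor so it can be `final` and survives
// reconnection with its buffered records intact.
public class StatementExecutor {
    private final Map<String, String> batch = new HashMap<>(); // built once, hence final
    private boolean open;

    public void buffer(String key, String value) {
        batch.put(key, value); // buffered records are independent of the connection
    }

    public void openConnection(Object connection) {
        // in the real executor: prepare statements against the (new) connection
        open = true;
    }

    public void closeConnection() {
        // in the real executor: close prepared statements, tolerating close errors
        open = false;
    }

    public void reestablish(Object newConnection) {
        closeConnection();
        openConnection(newConnection); // reopen == close + open, no extra interface
    }

    public boolean isOpen() {
        return open;
    }

    public int batchSize() {
        return batch.size();
    }
}
```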
[GitHub] [flink] AHeise commented on pull request #12441: [FLINK-17918][table-blink] Fix AppendOnlyTopNFunction shouldn't mutate list value of MapState
AHeise commented on pull request #12441: URL: https://github.com/apache/flink/pull/12441#issuecomment-638038993 The actual changes LGTM. However, Azure showed an error that was probably found by the different test setup. I'll investigate what's going on (I have a good test setup for these issues now).
[GitHub] [flink] danny0405 commented on a change in pull request #12355: [FLINK-17893] [sql-client] SQL CLI should print the root cause if the statement is invalid
danny0405 commented on a change in pull request #12355: URL: https://github.com/apache/flink/pull/12355#discussion_r434415772 ## File path: flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/cli/SqlCommandParser.java ## @@ -253,8 +247,10 @@ private SqlCommandParser() { // supports both `explain xx` and `explain plan for xx` now // TODO should keep `explain xx` ? EXPLAIN( - "EXPLAIN\\s+(.*)", - SINGLE_OPERAND), + "EXPLAIN\\s+(SELECT|INSERT)\\s+(.*)", + (operands) -> { + return Optional.of(new String[] { operands[0], operands[1] }); Review comment: Add a comment to explain why we match the `SELECT` and `INSERT` keywords explicitly.
[GitHub] [flink] flinkbot edited a comment on pull request #12369: [FLINK-17678][Connectors/HBase]Support fink-sql-connector-hbase
flinkbot edited a comment on pull request #12369: URL: https://github.com/apache/flink/pull/12369#issuecomment-635090403 ## CI report: * 08539af1e82e9ef40d9814fc4c787c9f66d3f33e Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2423) * 923273e1eed813d62522fb270c250d207ce2a389 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] danny0405 commented on a change in pull request #12355: [FLINK-17893] [sql-client] SQL CLI should print the root cause if the statement is invalid
danny0405 commented on a change in pull request #12355: URL: https://github.com/apache/flink/pull/12355#discussion_r434415304 ## File path: flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/cli/CliClient.java ## @@ -253,7 +253,13 @@ public boolean submitUpdate(String statement) { // private Optional parseCommand(String line) { - final Optional parsedLine = SqlCommandParser.parse(executor.getSqlParser(sessionId), line); + final Optional parsedLine; + try { + parsedLine = SqlCommandParser.parse(executor.getSqlParser(sessionId), line); + } catch (SqlExecutionException e) { + printExecutionException(e); + return Optional.empty(); + } if (!parsedLine.isPresent()) { printError(CliStrings.MESSAGE_UNKNOWN_SQL); Review comment: We may never run into this code branch now ~
[GitHub] [flink] zjuwangg commented on a change in pull request #12028: [FLINK-17553][table]fix plan error when constant exists in group window key
zjuwangg commented on a change in pull request #12028: URL: https://github.com/apache/flink/pull/12028#discussion_r434415210 ## File path: flink-table/flink-table-planner-blink/src/test/scala/org/apache/flink/table/planner/runtime/stream/sql/WindowAggregateITCase.scala ## @@ -319,6 +319,48 @@ class WindowAggregateITCase(mode: StateBackendMode) assertEquals(expected.sorted, sink.getRetractResults.sorted) } + // used to verify compile works normally when constants exists in group window key (FLINK-17553) + @Test + def testWindowAggregateOnConstantValue(): Unit = { +val ddl1 = + """ +|CREATE TABLE src ( +| log_ts STRING, +| ts TIMESTAMP(3), +| a INT, +| b DOUBLE, +| rowtime AS CAST(log_ts AS TIMESTAMP(3)), +| WATERMARK FOR rowtime AS rowtime - INTERVAL '0.001' SECOND +|) WITH ( +| 'connector' = 'COLLECTION', +| 'is-bounded' = 'false' +|) + """.stripMargin +val ddl2 = + """ +|CREATE TABLE dst ( +| ts TIMESTAMP(3), +| a BIGINT, +| b DOUBLE +|) WITH ( +| 'connector.type' = 'filesystem', +| 'connector.path' = '/tmp/1', +| 'format.type' = 'csv' +|) + """.stripMargin +val query = + """ +|INSERT INTO dst +|SELECT TUMBLE_END(rowtime, INTERVAL '0.003' SECOND), COUNT(ts), SUM(b) +|FROM src +| GROUP BY 'a', TUMBLE(rowtime, INTERVAL '0.003' SECOND) + """.stripMargin +tEnv.sqlUpdate(ddl1) +tEnv.sqlUpdate(ddl2) +tEnv.sqlUpdate(query) +tEnv.explain(true) Review comment: Agreed, thanks for your suggestion. Here I just want to verify that explain works normally instead of checking the result, but I can't find a proper test file to place it in. Which file would you suggest I place it in?
[GitHub] [flink-web] klion26 commented on pull request #342: [FLINK-17926] Fix the build problem of docker image
klion26 commented on pull request #342: URL: https://github.com/apache/flink-web/pull/342#issuecomment-638081635 Deleting `.rubydeps` works for me, thanks for the fix. I'm fine with removing the `./docker` directory and using the following command as the only official way to build the Flink website. I'll push another commit to remove the `./docker` directory. `docker run --rm --volume="$PWD:/srv/flink-web" --expose=4000 -p 4000:4000 -it ruby:2.5 bash -c 'cd /srv/flink-web && ./build.sh -p'`
[GitHub] [flink] klion26 commented on a change in pull request #12143: [FLINK-17632][yarn] Support to specify a remote path for job jar
klion26 commented on a change in pull request #12143: URL: https://github.com/apache/flink/pull/12143#discussion_r434449857 ## File path: flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java ## @@ -733,67 +737,49 @@ private ApplicationReport startAppMaster( 1)); } - final Set userJarFiles = new HashSet<>(); + final Set userJarFiles = new HashSet<>(); if (jobGraph != null) { - userJarFiles.addAll(jobGraph.getUserJars().stream().map(f -> f.toUri()).map(File::new).collect(Collectors.toSet())); + userJarFiles.addAll(jobGraph.getUserJars().stream().map(f -> f.toUri()).map(Path::new).collect(Collectors.toSet())); } final List jarUrls = ConfigUtils.decodeListFromConfig(configuration, PipelineOptions.JARS, URI::create); if (jarUrls != null && YarnApplicationClusterEntryPoint.class.getName().equals(yarnClusterEntrypoint)) { - userJarFiles.addAll(jarUrls.stream().map(File::new).collect(Collectors.toSet())); + userJarFiles.addAll(jarUrls.stream().map(Path::new).collect(Collectors.toSet())); } - int yarnFileReplication = yarnConfiguration.getInt(DFSConfigKeys.DFS_REPLICATION_KEY, DFSConfigKeys.DFS_REPLICATION_DEFAULT); - int fileReplication = configuration.getInteger(YarnConfigOptions.FILE_REPLICATION); - fileReplication = fileReplication > 0 ? fileReplication : yarnFileReplication; - // only for per job mode if (jobGraph != null) { for (Map.Entry entry : jobGraph.getUserArtifacts().entrySet()) { - org.apache.flink.core.fs.Path path = new org.apache.flink.core.fs.Path(entry.getValue().filePath); // only upload local files - if (!path.getFileSystem().isDistributedFS()) { - Path localPath = new Path(path.getPath()); + if (Utils.isRemotePath(entry.getValue().filePath)) { Review comment: Here the comment said that `only upload local files`, but the logic in `if` will be true only for `RemotePath`, so maybe this is a bug? @wangyang0918 This is an automated message from the Apache Git Service. 
[GitHub] [flink] krasinski opened a new pull request #12462: [FLINK-18045] Fix Kerberos credentials checking
krasinski opened a new pull request #12462: URL: https://github.com/apache/flink/pull/12462 ## What is the purpose of the change This pull request brings back checking if the enabled security method is Kerberos, that's an old bug reintroduced during some refactoring. Code assuming that the enabled security method is Kerberos caused problems on MapR secured clusters when deploying on YARN. ## Brief change log - Checking if the auth method is Kerberos was added to `org.apache.flink.runtime.util.HadoopUtils.isCredentialsConfigured` method. ## Verifying this change This change added tests and can be verified as follows: - Added a test that validates that the modified method doesn't check for Kerberos credentials when auth method is other than Kerberos ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: yes - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? not applicable
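The fix described in the PR can be sketched as follows. The constants and signature are hypothetical simplifications of `org.apache.flink.runtime.util.HadoopUtils.isCredentialsConfigured`, which in reality inspects Hadoop's `UserGroupInformation`:

```java
// Sketch (simplified, hypothetical signature): Kerberos credentials are only
// required when the configured auth method actually *is* Kerberos, so e.g.
// MapR-secured clusters no longer fail this check when deploying on YARN.
public class CredentialsCheck {
    public static boolean isCredentialsConfigured(String authMethod, boolean hasKerberosCredentials) {
        if (!"KERBEROS".equals(authMethod)) {
            // non-Kerberos security (e.g. MapR ticket auth): nothing to verify here
            return true;
        }
        // Kerberos is enabled, so a ticket/keytab must actually be present
        return hasKerberosCredentials;
    }
}
```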
[jira] [Created] (FLINK-18086) Migrate SQLClientKafkaITCase to use DDL and new options to create tables
Jark Wu created FLINK-18086: --- Summary: Migrate SQLClientKafkaITCase to use DDL and new options to create tables Key: FLINK-18086 URL: https://issues.apache.org/jira/browse/FLINK-18086 Project: Flink Issue Type: Test Components: Connectors / Kafka, Table SQL / Ecosystem Reporter: Jark Wu The existing SQLClientKafkaITCase uses YAML to register old connectors, which can't cover new connector and format implementations. However, we have many reported bugs related to classloading and class shading in the new connectors. Currently, SQL CLI doesn't support multi-line statements, which means it's not easy to migrate to DDL. One idea is that we can introduce a persisted catalog for this.
[GitHub] [flink] flinkbot edited a comment on pull request #12462: [FLINK-18045] Fix Kerberos credentials checking
flinkbot edited a comment on pull request #12462: URL: https://github.com/apache/flink/pull/12462#issuecomment-638102486 ## CI report: * e5fa4096bd14b96de1cedffa35b74ac31a585f6e Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2634) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink-web] MarkSfik commented on a change in pull request #344: [blog] flink on zeppelin - part2
MarkSfik commented on a change in pull request #344: URL: https://github.com/apache/flink-web/pull/344#discussion_r434473101 ## File path: _posts/2020-05-25-flink-on-zeppelin-part2.md ## @@ -0,0 +1,107 @@ +--- +layout: post +title: "Flink on Zeppelin Notebooks for Interactive Data Analysis - Part 2" +date: 2020-05-25T08:00:00.000Z +categories: ecosystem +authors: +- zjffdu: + name: "Jeff Zhang" + twitter: "zjffdu" +--- + +In the last post, I introduce the basics of Flink on Zeppelin and how to do Streaming ETL. This is part-2 where I would talk about how to +do streaming data visualization via Flink on Zeppelin and how to use flink UDF in Zeppelin. Review comment: ```suggestion perform streaming data visualization via Flink on Zeppelin and how to use Apache Flink UDFs in Zeppelin. ```
[GitHub] [flink] flinkbot edited a comment on pull request #12454: [FLINK-17091][Arvo] Adapt Avro record conversion to new timestamp bridged classes
flinkbot edited a comment on pull request #12454: URL: https://github.com/apache/flink/pull/12454#issuecomment-637997814 ## CI report: * 498d42db35e412ad834d8b0c70d6404d0de4c339 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2605)
[GitHub] [flink-web] MarkSfik commented on a change in pull request #344: [blog] flink on zeppelin - part2
MarkSfik commented on a change in pull request #344: URL: https://github.com/apache/flink-web/pull/344#discussion_r434473674 ## File path: _posts/2020-05-25-flink-on-zeppelin-part2.md ## @@ -0,0 +1,107 @@ +--- +layout: post +title: "Flink on Zeppelin Notebooks for Interactive Data Analysis - Part 2" +date: 2020-05-25T08:00:00.000Z +categories: ecosystem +authors: +- zjffdu: + name: "Jeff Zhang" + twitter: "zjffdu" +--- + +In the last post, I introduce the basics of Flink on Zeppelin and how to do Streaming ETL. This is part-2 where I would talk about how to +do streaming data visualization via Flink on Zeppelin and how to use flink UDF in Zeppelin. + +# Streaming Data Visualization + +In Zeppelin, you can build a realtime streaming dashboard without writing any line of javascript/html/css code. Review comment: ```suggestion With [Zeppelin](https://zeppelin.apache.org/), you can build a real time streaming dashboard without writing any line of javascript/html/css code. ```
[GitHub] [flink] flinkbot edited a comment on pull request #12460: [FLINK-18063][checkpointing] Fix the race condition for aborting current checkpoint in CheckpointBarrierUnaligner
flinkbot edited a comment on pull request #12460: URL: https://github.com/apache/flink/pull/12460#issuecomment-638102290 ## CI report: * c3f3aca669da2619d58190e684479c3fadd7026f Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2632)
[GitHub] [flink] flinkbot edited a comment on pull request #12461: [FLINK-17994][checkpointing] Fix the race condition between CheckpointBarrierUnaligner#processBarrier and #notifyBarrierReceived
flinkbot edited a comment on pull request #12461: URL: https://github.com/apache/flink/pull/12461#issuecomment-638102391 ## CI report: * 471535fc849c7dbf5bb9b532b5848a43ccdedafb Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2633)
[GitHub] [flink] flinkbot edited a comment on pull request #12456: [FLINK-17113][sql-cli] Refactor view support in SQL Client
flinkbot edited a comment on pull request #12456: URL: https://github.com/apache/flink/pull/12456#issuecomment-638032908 ## CI report: * b5b8172d4ccddf197a084d38e6dcef13ee27da8e Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2614)
[GitHub] [flink] flinkbot edited a comment on pull request #12455: [FLINK-17935] Move setting yarn.log-config-file to YarnClusterClientFactory
flinkbot edited a comment on pull request #12455: URL: https://github.com/apache/flink/pull/12455#issuecomment-638007072 ## CI report: * bc3ca72ad94196d72570aa7a58e9aa5c411e2733 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2609)
[GitHub] [flink] zentol commented on a change in pull request #12447: [FLINK-18069][CI] Test if Java/Scaladocs builds are passing in the compile stage
zentol commented on a change in pull request #12447: URL: https://github.com/apache/flink/pull/12447#discussion_r434491456 ## File path: tools/ci/compile.sh ## @@ -43,39 +43,39 @@ echo "== EXIT_CODE=0 run_mvn clean install $MAVEN_OPTS -Dflink.convergence.phase=install -Pcheck-convergence -Dflink.forkCount=2 \ --Dflink.forkCountTestPackage=2 -Dmaven.javadoc.skip=true -U -DskipTests Review comment: it may make sense to instead build the javadocs exactly like we do on buildbot: `mvn javadoc:aggregate -Paggregate-scaladoc -DadditionalJOption="-Xdoclint:none" -Dmaven.javadoc.failOnError=false -Dcheckstyle.skip=true -Denforcer.skip=true -Dheader="<a href=\"http://flink.apache.org/\" target=\"_top\">Back to Flink Website</a>"`
[jira] [Updated] (FLINK-18087) Uploading user artifact for Yarn job cluster could not work
[ https://issues.apache.org/jira/browse/FLINK-18087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang updated FLINK-18087: -- Description: In FLINK-17632, we added support for remote user jars. However, uploading user artifacts for a Yarn job cluster is broken. In the following code, we should only upload local files, but the current check does the opposite. {code:java} // only upload local files if (Utils.isRemotePath(entry.getValue().filePath)) { Path localPath = new Path(entry.getValue().filePath); Tuple2 remoteFileInfo = fileUploader.uploadLocalFileToRemote(localPath, entry.getKey()); jobGraph.setUserArtifactRemotePath(entry.getKey(), remoteFileInfo.f0.toString()); } {code} Another problem is that the related test {{testPerJobModeWithDistributedCache}} does not fail, because we do not fetch the artifact from the remote filesystem (i.e. HDFS); we get it directly from a local file. It also needs to be enhanced. was: In FLINK-17632, we added support for remote user jars. However, uploading user artifacts for a Yarn job cluster is broken. In the following code, we should only upload local files, but the current check does the opposite. {code:java} // only upload local files if (Utils.isRemotePath(entry.getValue().filePath)) { Path localPath = new Path(entry.getValue().filePath); Tuple2 remoteFileInfo = fileUploader.uploadLocalFileToRemote(localPath, entry.getKey()); jobGraph.setUserArtifactRemotePath(entry.getKey(), remoteFileInfo.f0.toString()); } {code} Another problem is that the related test {{testPerJobModeWithDistributedCache}} does not fail, because we do not fetch the artifact from the blob server; we get it directly from a local file. It also needs to be enhanced. 
> Uploading user artifact for Yarn job cluster could not work > --- > > Key: FLINK-18087 > URL: https://issues.apache.org/jira/browse/FLINK-18087 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Affects Versions: 1.11.0, 1.12.0 >Reporter: Yang Wang >Assignee: Yang Wang >Priority: Blocker > Labels: pull-request-available > Fix For: 1.11.0, 1.12.0 > > > In FLINK-17632, we added support for remote user jars. However, uploading user > artifacts for a Yarn job cluster is broken. In the following code, > we should only upload local files, but the current check does the opposite. > {code:java} > // only upload local files > if (Utils.isRemotePath(entry.getValue().filePath)) { >Path localPath = new Path(entry.getValue().filePath); >Tuple2 remoteFileInfo = > fileUploader.uploadLocalFileToRemote(localPath, entry.getKey()); >jobGraph.setUserArtifactRemotePath(entry.getKey(), > remoteFileInfo.f0.toString()); > } > {code} > > Another problem is that the related test {{testPerJobModeWithDistributedCache}} > does not fail, because we do not fetch the artifact from the remote > filesystem (i.e. HDFS); we get it directly from a local file. It also needs to > be enhanced. -- This message was sent by Atlassian Jira (v8.3.4#803005)
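The fix implied by the description is to negate the check, i.e. upload only when the artifact is not already remote. A minimal self-contained sketch of the corrected guard — the `isRemotePath` helper below is a stand-in for Flink's `Utils.isRemotePath`, not the real implementation:

```java
import java.net.URI;

public class UploadCheckSketch {
    // Stand-in for Flink's Utils.isRemotePath: treat any path with a
    // non-local scheme (e.g. hdfs://, s3://) as remote.
    static boolean isRemotePath(String path) {
        String scheme = URI.create(path).getScheme();
        return scheme != null && !scheme.equals("file");
    }

    // Corrected guard: upload only when the artifact is NOT remote yet.
    static boolean shouldUpload(String artifactPath) {
        return !isRemotePath(artifactPath);
    }

    public static void main(String[] args) {
        System.out.println(shouldUpload("file:///tmp/udf.jar"));        // local path: upload it
        System.out.println(shouldUpload("hdfs:///user/flink/udf.jar")); // already remote: skip
    }
}
```

The original code, as quoted in the ticket, took the `isRemotePath(...) == true` branch and then tried to upload the remote path as if it were local, which is the inversion being described.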
[GitHub] [flink] twalthr opened a new pull request #12464: [FLINK-13784][table] Implement type inference for basic math functions
twalthr opened a new pull request #12464: URL: https://github.com/apache/flink/pull/12464 ## What is the purpose of the change This adds the new type inference to PLUS, MINUS, TIMES, DIVIDE, UNARY_MINUS, MOD, ROUND, FLOOR, CEIL. It fixes issues at different locations. We might need to investigate the plan changes of `ValuesTest`. ## Brief change log See commit messages. ## Verifying this change This change is already covered by existing tests. But also adds `MathFunctionsITCase`. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: yes - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? JavaDocs
[GitHub] [flink] dawidwys closed pull request #12434: [FLINK-18052] Increase timeout for ES Search API in IT Cases
dawidwys closed pull request #12434: URL: https://github.com/apache/flink/pull/12434
[jira] [Updated] (FLINK-18050) Fix the bug of recycling buffer twice once exception in ChannelStateWriteRequestDispatcher#dispatch
[ https://issues.apache.org/jira/browse/FLINK-18050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-18050: --- Labels: pull-request-available (was: ) > Fix the bug of recycling buffer twice once exception in > ChannelStateWriteRequestDispatcher#dispatch > --- > > Key: FLINK-18050 > URL: https://issues.apache.org/jira/browse/FLINK-18050 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.11.0 >Reporter: Zhijiang >Assignee: Roman Khachatryan >Priority: Blocker > Labels: pull-request-available > Fix For: 1.11.0, 1.12.0 > > > When a task finishes, the `CheckpointBarrierUnaligner` will decline the current > checkpoint, which writes an abort request into the `ChannelStateWriter`. > The abort request is executed before the other write-output requests in the > queue and closes the underlying `CheckpointStateOutputStream`. When the > dispatcher then executes the next write-output request and accesses the stream, it > throws a ClosedByInterruptException, making the dispatcher thread exit. > In this process, the underlying buffers for the current write-output request are > recycled twice: > * ChannelStateCheckpointWriter#write recycles all the buffers in the finally > block, which covers both the exception and normal cases. > * ChannelStateWriteRequestDispatcherImpl#dispatch calls > `request.cancel(e)`, which recycles the underlying buffers again in the > exception case. > This bug can cause a further exception in the network shuffle > process, which references the same buffer as above; that exception is then > sent to the downstream side and makes it fail. > > This bug can be reproduced easily by running > UnalignedCheckpointITCase#shouldPerformUnalignedCheckpointOnParallelRemoteChannel. -- This message was sent by Atlassian Jira (v8.3.4#803005)
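A generic pattern that prevents this class of bug is to make the recycle path idempotent, e.g. by guarding it with an atomic flag, so that both the `finally` block and the cancel path may call it safely. The classes below are a hand-rolled sketch, not Flink's actual buffer implementation:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class RecycleOnceSketch {
    // Stand-in for a network buffer whose recycle() must run exactly once.
    static class Buffer {
        private final AtomicBoolean recycled = new AtomicBoolean(false);
        int recycleCount = 0;

        void recycle() {
            // compareAndSet makes a second call (e.g. from cancel()) a no-op.
            if (recycled.compareAndSet(false, true)) {
                recycleCount++;
            }
        }
    }

    public static void main(String[] args) {
        Buffer b = new Buffer();
        b.recycle(); // normal path (finally block in write)
        b.recycle(); // exceptional path (request.cancel)
        System.out.println(b.recycleCount);
    }
}
```

Whether the actual fix guards the buffer or the request is a design choice of the PR; the sketch only shows why double invocation becomes harmless once the operation is idempotent.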
[GitHub] [flink] flinkbot edited a comment on pull request #11397: [FLINK-16217] [sql-client] catch SqlExecutionException for all callXX methods
flinkbot edited a comment on pull request #11397: URL: https://github.com/apache/flink/pull/11397#issuecomment-598575354 ## CI report: * cfb1c83e34488d65854df603907e72b7de784564 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2238) * 209b426a5818e8a230bb63499a921398170281fc Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2606)
[GitHub] [flink] rkhachatryan opened a new pull request #12457: [FLINK-18050][task][checkpointing] Fix double buffer recycling
rkhachatryan opened a new pull request #12457: URL: https://github.com/apache/flink/pull/12457 ## What is the purpose of the change Fix double buffer recycling in `ChannelStateWriteRequest.execute` and then in `cancel`. ## Verifying this change Added unit test `ChannelStateWriteRequestDispatcherImplTest` (`testPartialInputChannelStateWrite`, `testPartialResultSubpartitionStateWrite`). ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? no
[GitHub] [flink] azagrebin commented on a change in pull request #12446: [FLINK-16225] Implement user class loading exception handler
azagrebin commented on a change in pull request #12446: URL: https://github.com/apache/flink/pull/12446#discussion_r434377343 ## File path: azure-pipelines.yml ## @@ -61,7 +61,107 @@ variables: jobs: - template: tools/azure-pipelines/jobs-template.yml parameters: # see template file for a definition of the parameters. - stage_name: ci_build Review comment: Thanks, I adjusted it as well
[jira] [Comment Edited] (FLINK-18033) Improve e2e test execution time
[ https://issues.apache.org/jira/browse/FLINK-18033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124737#comment-17124737 ] Robert Metzger edited comment on FLINK-18033 at 6/3/20, 8:14 AM: - Thanks a lot for opening this ticket. I agree that the e2e tests are running for quite a while, and I'm pretty sure that there's a lot of room for improvement in the scripts. I see two ways forward (which are orthogonal to each other): A) Investigate the slowest e2e tests and potential inefficiencies in the common scripts (cluster startup and shutdown etc.). The goal should not be to optimize the last few seconds, but rather the cases where we are wasting minutes of time waiting. B) Improve the tooling to execute the e2e tests by adding an "e2e test scheduler". We have the following CI environments: 1. Flink: Servers providing an environment to run tests on Docker; VMs from Azure to run tests on 'bare metal' 2. personal Azure accounts: Only VMs from Azure At the moment, we cannot execute the e2e tests requiring Docker or Kubernetes in the Docker environment (need more research into that). In the past, we've manually split the e2e tests into separate scripts. This has led to "code duplication" and a perceived complexity in the e2e tests. Given these constraints, I imagine the following design of an "e2e test scheduler". 1. 
Registration of tests: Tests can have the following properties: - bare_metal = require a bare metal environment (azure vm) - light = have a resource footprint below 1GB of memory and low CPU usage (< 10%) - heavy = have a resource footprint up to 8GB of memory and high CPU usage {code:bash} register_test properties=bare_metal test_docker_embedded_job.sh dummy-fs register_test properties=bare_metal test_kubernetes_embedded_job.sh register_test properties=light test_quickstarts.sh java register_test properties=heavy test_tpch.sh {code} (Note: we could determine the light/heavy classification of tests using a tool like: https://github.com/astrofrog/psrecord) 2. Execution: In each environment, we run tests through a call like: {code:bash} execute_test properties=bare_metal # on the azure VMs execute_test properties=light parallelism=8 # on the dockerized environment for Flink, otherwise on Azure VM. Run 8 tests concurrently execute_test properties=heavy parallelism=1 # on the dockerized environment for Flink, otherwise on Azure VM. Run 1 test concurrently {code} For now, the Java e2e tests are executed outside of the scheduler. was (Author: rmetzger): Thanks a lot for opening this ticket. I agree that the e2e tests are running for quite a while, and I'm pretty sure that there's a lot of room for improvement in the scripts. I see two ways forward (which are orthogonal to each other) A) Investigate the slowest e2e tests and potential inefficiencies in the common scripts (cluster startup and shutdown etc.). The goal should not be to optimize the last few seconds, but rather the cases where we are wasting minutes of time waiting. B) Improve the tooling to execute the e2e tests by adding a "e2e test scheduler" We have the following CI environments: 1. Flink: Servers providing an environment to run tests on Docker; VMs from Azure to run tests on 'bare metal' 2. 
personal Azure accounts: Only VMs from Azure At the moment, we can not execute the e2e tests requiring Docker or Kubernetes in the Docker environment (need more research into that) In the past, we've manually split the e2e tests into separate scripts. This has lead to "code duplication" and a perceived complexity in the e2e tests. Given these constraints, I imagine the following design of a "e2e test scheduler". 1. Registration of tests: Tests can have the following properties: - bare_metal = require a bare metal environment (azure vm) - light = have a resource footprint below 1GB of memory and low CPU usage (< 10%) - heavy = have a resource footprint up to 8GB of memory and high CPU usage {code:bash} register_test properties=bare_metal test_docker_embedded_job.sh dummy-fs register_test properties=bare_metal test_kubernetes_embedded_job.sh register_test properties=light test_quickstarts.sh java register_test properties=heavy test_tpch.sh {code} (Note: we could determine the light / heavyness of tests using a tool like: https://github.com/astrofrog/psrecord) 2. Execution: In each environment, we run tests through a call like: {code:bash} execute_test properties=bare_metal # on the azure VMs execute_test properties=light paralleism=8. # on the dockerized environment for Flink, otherwise on Azure VM. Run 8 tests concurrently execute_test properties=heavy paralleism=1. # on the dockerized environment for Flink, otherwise on Azure VM. Run 8 tests concurrently {code} For now, the Java e2e tests are executed outside of the scheduler. > Improve e2e test execution time > --- > >
[jira] [Commented] (FLINK-18033) Improve e2e test execution time
[ https://issues.apache.org/jira/browse/FLINK-18033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124737#comment-17124737 ] Robert Metzger commented on FLINK-18033: Thanks a lot for opening this ticket. I agree that the e2e tests are running for quite a while, and I'm pretty sure that there's a lot of room for improvement in the scripts. I see two ways forward (which are orthogonal to each other): A) Investigate the slowest e2e tests and potential inefficiencies in the common scripts (cluster startup and shutdown etc.). The goal should not be to optimize the last few seconds, but rather the cases where we are wasting minutes of time waiting. B) Improve the tooling to execute the e2e tests by adding an "e2e test scheduler". We have the following CI environments: 1. Flink: Servers providing an environment to run tests on Docker; VMs from Azure to run tests on 'bare metal' 2. personal Azure accounts: Only VMs from Azure At the moment, we cannot execute the e2e tests requiring Docker or Kubernetes in the Docker environment (need more research into that). In the past, we've manually split the e2e tests into separate scripts. This has led to "code duplication" and a perceived complexity in the e2e tests. Given these constraints, I imagine the following design of an "e2e test scheduler". 1. Registration of tests: Tests can have the following properties: - bare_metal = require a bare metal environment (azure vm) - light = have a resource footprint below 1GB of memory and low CPU usage (< 10%) - heavy = have a resource footprint up to 8GB of memory and high CPU usage {code:bash} register_test properties=bare_metal test_docker_embedded_job.sh dummy-fs register_test properties=bare_metal test_kubernetes_embedded_job.sh register_test properties=light test_quickstarts.sh java register_test properties=heavy test_tpch.sh {code} (Note: we could determine the light/heavy classification of tests using a tool like: https://github.com/astrofrog/psrecord) 2. 
Execution: In each environment, we run tests through a call like: {code:bash} execute_test properties=bare_metal # on the azure VMs execute_test properties=light parallelism=8 # on the dockerized environment for Flink, otherwise on Azure VM. Run 8 tests concurrently execute_test properties=heavy parallelism=1 # on the dockerized environment for Flink, otherwise on Azure VM. Run 1 test concurrently {code} For now, the Java e2e tests are executed outside of the scheduler. > Improve e2e test execution time > --- > > Key: FLINK-18033 > URL: https://issues.apache.org/jira/browse/FLINK-18033 > Project: Flink > Issue Type: Improvement > Components: Build System / Azure Pipelines, Test Infrastructure, > Tests >Reporter: Chesnay Schepler >Priority: Major > > Running all e2e tests currently requires ~3.5h, and this time is growing. > We should look into ways to bring this time down to improve feedback times. -- This message was sent by Atlassian Jira (v8.3.4#803005)
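The register/execute pair proposed in the comment can be sketched in bash roughly as follows. The property names (bare_metal/light/heavy) come from the comment; everything else — the array-based registry and the echo placeholder standing in for actually launching a test — is an illustrative assumption:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the proposed e2e "test scheduler" registry.

declare -a TEST_PROPS=()
declare -a TEST_CMDS=()

register_test() {
  # e.g. register_test properties=light test_quickstarts.sh java
  local props="${1#properties=}"; shift
  TEST_PROPS+=("$props")
  TEST_CMDS+=("$*")
}

execute_test() {
  # e.g. execute_test properties=light parallelism=8
  local props="${1#properties=}"
  local parallelism="${2#parallelism=}"
  : "${parallelism:=1}"
  local i
  for i in "${!TEST_PROPS[@]}"; do
    if [ "${TEST_PROPS[$i]}" = "$props" ]; then
      # A real scheduler would fan these out $parallelism at a time.
      echo "running (${props}, parallelism=${parallelism}): ${TEST_CMDS[$i]}"
    fi
  done
}

register_test properties=bare_metal test_docker_embedded_job.sh dummy-fs
register_test properties=light test_quickstarts.sh java
execute_test properties=light parallelism=8
```

Keeping registration declarative like this is what lets each CI environment pick only the subset of tests it can actually run.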
[jira] [Updated] (FLINK-16627) Support only generate non-null values when serializing into JSON
[ https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Young updated FLINK-16627: --- Issue Type: New Feature (was: Improvement) > Support only generate non-null values when serializing into JSON > > > Key: FLINK-16627 > URL: https://issues.apache.org/jira/browse/FLINK-16627 > Project: Flink > Issue Type: New Feature > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table > SQL / Planner >Affects Versions: 1.10.0 >Reporter: jackray wang >Assignee: jackray wang >Priority: Major > Fix For: 1.11.0 > > > {code:java} > //sql > CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……) > {code} > > {code:java} > //sql > CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……) > {code} > > {code:java} > //scala udf > class ScalaUpper extends ScalarFunction { > def eval(str: String): String = { >if (str == null) { >return "" >} else { >return str >} > } > } > btenv.registerFunction("scala_upper", new ScalaUpper()) > {code} > > {code:java} > //sql > insert into sink_kafka select subtype, scala_upper(svt) from source_kafka > {code} > > Sometimes svt's value is null, and the JSON inserted into Kafka then looks like > \{"subtype":"qin","svt":null} > If the amount of data is small, this is acceptable, but we process 10 TB of data > every day, and there may be many nulls in the JSON, which affects efficiency. If a > parameter could be added to remove null keys when defining a sink table, > performance would be greatly improved. -- This message was sent by Atlassian Jira (v8.3.4#803005)
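The behavior the ticket asks for can be demonstrated independently of Flink: drop null-valued fields before emitting JSON. The tiny hand-rolled serializer below is purely illustrative (Flink's JSON format is Jackson-based; this just shows the before/after output):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class NonNullJsonSketch {
    // Serialize a flat string map to JSON, optionally dropping null values.
    static String toJson(Map<String, String> row, boolean dropNulls) {
        return row.entrySet().stream()
                .filter(e -> !dropNulls || e.getValue() != null)
                .map(e -> "\"" + e.getKey() + "\":" +
                        (e.getValue() == null ? "null" : "\"" + e.getValue() + "\""))
                .collect(Collectors.joining(",", "{", "}"));
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("subtype", "qin");
        row.put("svt", null);
        System.out.println(toJson(row, false)); // {"subtype":"qin","svt":null}
        System.out.println(toJson(row, true));  // {"subtype":"qin"}
    }
}
```

At 10 TB/day, skipping the null entries shrinks every such record by the width of the key plus `:null`, which is where the claimed efficiency gain comes from.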
[jira] [Updated] (FLINK-7151) Support function DDL
[ https://issues.apache.org/jira/browse/FLINK-7151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Young updated FLINK-7151: -- Fix Version/s: (was: 1.11.0) > Support function DDL > > > Key: FLINK-7151 > URL: https://issues.apache.org/jira/browse/FLINK-7151 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Reporter: yuemeng >Assignee: Zhenqiu Huang >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Based on create function and table.we can register a udf,udaf,udtf use sql: > {code} > CREATE FUNCTION [IF NOT EXISTS] [catalog_name.db_name.]function_name AS > class_name; > DROP FUNCTION [IF EXISTS] [catalog_name.db_name.]function_name; > ALTER FUNCTION [IF EXISTS] [catalog_name.db_name.]function_name RENAME TO > new_name; > {code} > {code} > CREATE function 'TOPK' AS > 'com..aggregate.udaf.distinctUdaf.topk.ITopKUDAF'; > INSERT INTO db_sink SELECT id, TOPK(price, 5, 'DESC') FROM kafka_source GROUP > BY id; > {code} > This ticket can assume that the function class is already loaded in classpath > by users. Advanced syntax like to how to dynamically load udf libraries from > external locations can be on a separate ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17774) supports all kinds of changes for select result
[ https://issues.apache.org/jira/browse/FLINK-17774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] godfrey he updated FLINK-17774: --- Fix Version/s: (was: 1.11.0) 1.12.0 > supports all kinds of changes for select result > --- > > Key: FLINK-17774 > URL: https://issues.apache.org/jira/browse/FLINK-17774 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Legacy Planner, Table SQL / Planner >Reporter: godfrey he >Assignee: godfrey he >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0 > > > [FLINK-17252|https://issues.apache.org/jira/browse/FLINK-17252] has supported > select query, however only append change is supported. because > [FLINK-16998|https://issues.apache.org/jira/browse/FLINK-16998] is not > finished. This issue aims to support all kinds of changes based on > FLINK-16998. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-15585) Improve function identifier string in plan digest
[ https://issues.apache.org/jira/browse/FLINK-15585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Young updated FLINK-15585: --- Fix Version/s: (was: 1.11.0) 1.12.0 > Improve function identifier string in plan digest > - > > Key: FLINK-15585 > URL: https://issues.apache.org/jira/browse/FLINK-15585 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Reporter: Jark Wu >Assignee: godfrey he >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Currently, we are using {{UserDefinedFunction#functionIdentifier}} as the > identifier string of UDFs in plan digest, for example: > {code:java} > LogicalTableFunctionScan(invocation=[org$apache$flink$table$planner$utils$TableFunc1$8050927803993624f40152a838c98018($2)], > rowType=...) > {code} > However, the result of {{UserDefinedFunction#functionIdentifier}} will change > if we just add a method in UserDefinedFunction, because it uses Java > serialization. Then we have to update 60 plan tests which is very annoying. > In the other hand, displaying the function identifier string in operator name > in Web UI is verbose to users. > In order to improve this situation, there are something we can do: > 1) If the UDF has a catalog function name, we can just use the catalog name > as the digest. Otherwise, fallback to (2). > 2) If the UDF doesn't contain fields, we just use the full calss name as the > digest. Otherwise, fallback to (3). > 3) Use identifier string which will do the full serialization. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-16175) Add config option to switch case sensitive for column names in SQL
[ https://issues.apache.org/jira/browse/FLINK-16175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Young updated FLINK-16175: --- Issue Type: New Feature (was: Improvement) > Add config option to switch case sensitive for column names in SQL > -- > > Key: FLINK-16175 > URL: https://issues.apache.org/jira/browse/FLINK-16175 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API, Table SQL / Planner >Affects Versions: 1.11.0 >Reporter: Leonard Xu >Assignee: Leonard Xu >Priority: Major > Labels: pull-request-available, usability > Fix For: 1.11.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Flink SQL is default CaseSensitive and have no option to config. This issue > aims to support > a configOption so that user can set CaseSensitive for their SQL. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-16627) Support only generate non-null values when serializing into JSON
[ https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Young updated FLINK-16627: --- Fix Version/s: (was: 1.11.0) 1.12.0 > Support only generate non-null values when serializing into JSON > > > Key: FLINK-16627 > URL: https://issues.apache.org/jira/browse/FLINK-16627 > Project: Flink > Issue Type: New Feature > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table > SQL / Planner >Affects Versions: 1.10.0 >Reporter: jackray wang >Assignee: jackray wang >Priority: Major > Fix For: 1.12.0 > > > {code:java} > //sql > CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……) > {code} > > {code:java} > //sql > CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……) > {code} > > {code:java} > //scala udf > class ScalaUpper extends ScalarFunction { > def eval(str: String): String = { >if (str == null) { >return "" >} else { >return str >} > } > } > btenv.registerFunction("scala_upper", new ScalaUpper()) > {code} > > {code:java} > //sql > insert into sink_kafka select subtype, scala_upper(svt) from source_kafka > {code} > > Sometimes svt's value is null, and the JSON inserted into Kafka then looks like > \{"subtype":"qin","svt":null} > If the amount of data is small, this is acceptable, but we process 10 TB of data > every day, and there may be many nulls in the JSON, which affects efficiency. If a > parameter could be added to remove null keys when defining a sink table, > performance would be greatly improved. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18079) KafkaShuffle Manual Tests
[ https://issues.apache.org/jira/browse/FLINK-18079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuan Mei updated FLINK-18079: - Labels: release-testing (was: ) > KafkaShuffle Manual Tests > - > > Key: FLINK-18079 > URL: https://issues.apache.org/jira/browse/FLINK-18079 > Project: Flink > Issue Type: Improvement > Components: API / DataStream, Runtime / Checkpointing >Affects Versions: 1.11.0 >Reporter: Yuan Mei >Priority: Major > Labels: release-testing > Fix For: 1.11.0 > > > Manual Tests and Results to demonstrate KafkaShuffle working as expected. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on pull request #12459: [FLINK-17959][python] Fix the 'call already cancelled' exception when executing Python UDF
flinkbot commented on pull request #12459: URL: https://github.com/apache/flink/pull/12459#issuecomment-638086087 ## CI report: * 819a3d585cce9c827212ea3cd33eb05ad84ce006 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #12435: [FLINK-18059] [sql-client] Fix create/drop catalog statement can not be executed in sql client
flinkbot edited a comment on pull request #12435: URL: https://github.com/apache/flink/pull/12435#issuecomment-637372360 ## CI report: * 93c39c72dea164002f43ad015b55aa200b3c8d7d Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2549) * b5ef005c90fb95a63c97d06c488adf5e4abe2972 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2626) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #12436: [FLINK-17847][table sql / planner] ArrayIndexOutOfBoundsException happens in StreamExecCalc operator
flinkbot edited a comment on pull request #12436: URL: https://github.com/apache/flink/pull/12436#issuecomment-637399261 ## CI report: * e98155d0d16e4c8aff16f018a349587cdadadede Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2558) * ebc9048a44859813911595912641e087d8ce4448 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2612) * d553feca300e2668e3e5c16c1c524d73acde141c Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2627) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-17830) Add documentation for the new HBase connector
[ https://issues.apache.org/jira/browse/FLINK-17830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu closed FLINK-17830. --- Resolution: Fixed - master (1.12.0): 42c0837fab961fe2c8ed9ab2da7b78cbda66b811 - 1.11.0: 5c1579fc0a801c8ec92212bc8c037581267ce281 > Add documentation for the new HBase connector > - > > Key: FLINK-17830 > URL: https://issues.apache.org/jira/browse/FLINK-17830 > Project: Flink > Issue Type: Sub-task > Components: Connectors / HBase, Documentation >Reporter: Jark Wu >Assignee: Jark Wu >Priority: Blocker > Fix For: 1.11.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-15467) Should wait for the end of the source thread during the Task cancellation
[ https://issues.apache.org/jira/browse/FLINK-15467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124822#comment-17124822 ] Piotr Nowojski edited comment on FLINK-15467 at 6/3/20, 10:15 AM: -- Thanks for coming back and providing a way to reproduce the problem. I haven't run your code, but it looks like you are right. I'm not entirely sure how to fix this off the top of my head; someone would have to investigate this further. Also, the problem might be a bit more profound, and the solution should take into account operators spawning custom threads as well. was (Author: pnowojski): Thanks for coming back and providing a way to reproduce the problem. I haven't run your code, but it looks like you are right. I'm not entirely sure how to fix this off the top of my head; someone would have to investigate this further. > Should wait for the end of the source thread during the Task cancellation > - > > Key: FLINK-15467 > URL: https://issues.apache.org/jira/browse/FLINK-15467 > Project: Flink > Issue Type: Bug > Components: Runtime / Task >Affects Versions: 1.9.0, 1.9.1, 1.10.1 >Reporter: ming li >Priority: Critical > Fix For: 1.12.0 > > > In the new mailbox model, SourceStreamTask starts a source thread to run > user methods, and the current execution thread will block on mailbox.takeMail > (). When a task is cancelled, the TaskCanceler thread will cancel the task and > interrupt the execution thread. Therefore, the execution thread of > SourceStreamTask will throw InterruptedException, then cancel the task again, > and throw an exception. > {code:java} > // code placeholder > @Override > protected void performDefaultAction(ActionContext context) throws Exception { >// Against the usual contract of this method, this implementation is not > step-wise but blocking instead for >// compatibility reasons with the current source interface (source > functions run as a loop, not in steps). 
>sourceThread.start(); >// We run an alternative mailbox loop that does not involve default > actions and synchronizes around actions. >try { > runAlternativeMailboxLoop(); >} catch (Exception mailboxEx) { > // We cancel the source function if some runtime exception escaped the > mailbox. > if (!isCanceled()) { > cancelTask(); > } > throw mailboxEx; >} >sourceThread.join(); >if (!isFinished) { > sourceThread.checkThrowSourceExecutionException(); >} >context.allActionsCompleted(); > } > {code} > When all tasks of this TaskExecutor are canceled, the blob file will be > cleaned up. But the real source thread is not finished at this time, which > will cause a ClassNotFoundException when loading a new class. In this case, > the source thread may not be able to properly clean up and release resources > (such as closing child threads, cleaning up local files, etc.). Therefore, I > think we should mark this task canceled or finished after the execution of > the source thread is completed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
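The fix direction suggested in the issue — treat the task as canceled or finished only after the source thread has actually exited — can be sketched with plain threads (all names here are illustrative, not Flink's API):

```java
public class CancelWithJoin {
    // Cancels a stand-in "source thread" and waits for it to finish its own
    // cleanup before returning -- the ordering the issue asks for, so that
    // shared resources (e.g. blob files) are not torn down under its feet.
    static boolean cancelAndAwait() {
        final boolean[] cleanedUp = {false};
        Thread sourceThread = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    Thread.sleep(10); // emulate the legacy source loop
                }
            } catch (InterruptedException ignored) {
                // cancellation: fall through to cleanup
            } finally {
                cleanedUp[0] = true; // close child threads, delete local files, ...
            }
        });
        sourceThread.start();
        sourceThread.interrupt(); // TaskCanceler interrupts the task
        try {
            sourceThread.join();  // the crucial step: wait for the source thread to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // Only after join() returns is it safe to release shared resources.
        return cleanedUp[0];
    }

    public static void main(String[] args) {
        System.out.println("source cleaned up before teardown: " + cancelAndAwait());
    }
}
```

Without the `join()`, the caller could proceed to class-unloading and file cleanup while the source is still running, which is exactly the ClassNotFoundException scenario described above.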
[jira] [Closed] (FLINK-17995) Redesign Table & SQL Connectors pages
[ https://issues.apache.org/jira/browse/FLINK-17995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu closed FLINK-17995. --- Resolution: Fixed - master (1.12.0): ae8e0f5b370190df2058a8d5cd85e9aa47da1eeb - 1.11.0: 2c977bb19a967ea5abe38959e5d10c678b20b966 > Redesign Table & SQL Connectors pages > - > > Key: FLINK-17995 > URL: https://issues.apache.org/jira/browse/FLINK-17995 > Project: Flink > Issue Type: Sub-task > Components: Documentation, Table SQL / API >Reporter: Jark Wu >Assignee: Jark Wu >Priority: Blocker > Labels: pull-request-available > Fix For: 1.11.0 > > > A lot of the content in > https://ci.apache.org/projects/flink/flink-docs-master/dev/table/connect.html#overview > is outdated. There are also many friction points in the Descriptor API and YAML > file. I would propose to remove them from the new Overview page; we should > encourage users to use DDL for now. We can add them back once the Descriptor API > and YAML API are ready again. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink-web] MarkSfik commented on a change in pull request #344: [blog] flink on zeppelin - part2
MarkSfik commented on a change in pull request #344: URL: https://github.com/apache/flink-web/pull/344#discussion_r434476802 ## File path: _posts/2020-05-25-flink-on-zeppelin-part2.md ## @@ -0,0 +1,107 @@ +--- +layout: post +title: "Flink on Zeppelin Notebooks for Interactive Data Analysis - Part 2" +date: 2020-05-25T08:00:00.000Z +categories: ecosystem +authors: +- zjffdu: + name: "Jeff Zhang" + twitter: "zjffdu" +--- + +In the last post, I introduce the basics of Flink on Zeppelin and how to do Streaming ETL. This is part-2 where I would talk about how to +do streaming data visualization via Flink on Zeppelin and how to use flink UDF in Zeppelin. + +# Streaming Data Visualization + +In Zeppelin, you can build a realtime streaming dashboard without writing any line of javascript/html/css code. +Overall Zeppelin supports 3 kinds of streaming data analytics: +* Single +* Update +* Append + +### Single Mode +Single mode is for the case when the result of sql statement is always one row, such as the following example. +The output format is HTML, and you can specify paragraph local property template for the final output content template. +And you can use {i} as placeholder for the ith column of result. + + + + + +### Update Mode +Update mode is suitable for the case when the output is more than one rows, +and always will be updated continuously. Here’s one example where we use group by. + + + + + +### Append Mode +Append mode is suitable for the scenario where output data is always appended. +E.g. the following example which use tumble window. Review comment: ```suggestion For instance, the example below uses a tumble window. ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-web] MarkSfik commented on a change in pull request #344: [blog] flink on zeppelin - part2
MarkSfik commented on a change in pull request #344: URL: https://github.com/apache/flink-web/pull/344#discussion_r434476268 ## File path: _posts/2020-05-25-flink-on-zeppelin-part2.md ## @@ -0,0 +1,107 @@ +--- +layout: post +title: "Flink on Zeppelin Notebooks for Interactive Data Analysis - Part 2" +date: 2020-05-25T08:00:00.000Z +categories: ecosystem +authors: +- zjffdu: + name: "Jeff Zhang" + twitter: "zjffdu" +--- + +In the last post, I introduce the basics of Flink on Zeppelin and how to do Streaming ETL. This is part-2 where I would talk about how to +do streaming data visualization via Flink on Zeppelin and how to use flink UDF in Zeppelin. + +# Streaming Data Visualization + +In Zeppelin, you can build a realtime streaming dashboard without writing any line of javascript/html/css code. +Overall Zeppelin supports 3 kinds of streaming data analytics: +* Single +* Update +* Append + +### Single Mode +Single mode is for the case when the result of sql statement is always one row, such as the following example. +The output format is HTML, and you can specify paragraph local property template for the final output content template. +And you can use {i} as placeholder for the ith column of result. + + + + + +### Update Mode +Update mode is suitable for the case when the output is more than one rows, +and always will be updated continuously. Here’s one example where we use group by. + + + + + +### Append Mode +Append mode is suitable for the scenario where output data is always appended. Review comment: ```suggestion Append mode is suitable for the cases when the output data is always appended. ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-web] MarkSfik commented on a change in pull request #344: [blog] flink on zeppelin - part2
MarkSfik commented on a change in pull request #344: URL: https://github.com/apache/flink-web/pull/344#discussion_r434476009 ## File path: _posts/2020-05-25-flink-on-zeppelin-part2.md ## @@ -0,0 +1,107 @@ +--- +layout: post +title: "Flink on Zeppelin Notebooks for Interactive Data Analysis - Part 2" +date: 2020-05-25T08:00:00.000Z +categories: ecosystem +authors: +- zjffdu: + name: "Jeff Zhang" + twitter: "zjffdu" +--- + +In the last post, I introduce the basics of Flink on Zeppelin and how to do Streaming ETL. This is part-2 where I would talk about how to +do streaming data visualization via Flink on Zeppelin and how to use flink UDF in Zeppelin. + +# Streaming Data Visualization + +In Zeppelin, you can build a realtime streaming dashboard without writing any line of javascript/html/css code. +Overall Zeppelin supports 3 kinds of streaming data analytics: +* Single +* Update +* Append + +### Single Mode +Single mode is for the case when the result of sql statement is always one row, such as the following example. +The output format is HTML, and you can specify paragraph local property template for the final output content template. +And you can use {i} as placeholder for the ith column of result. + + + + + +### Update Mode +Update mode is suitable for the case when the output is more than one rows, +and always will be updated continuously. Here’s one example where we use group by. Review comment: ```suggestion and will always be continuously updated. Here’s one example where we use group by. ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-18079) KafkaShuffle Manual Tests
[ https://issues.apache.org/jira/browse/FLINK-18079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijiang reassigned FLINK-18079: Assignee: Yuan Mei > KafkaShuffle Manual Tests > - > > Key: FLINK-18079 > URL: https://issues.apache.org/jira/browse/FLINK-18079 > Project: Flink > Issue Type: Sub-task > Components: API / DataStream, Runtime / Checkpointing >Affects Versions: 1.11.0 >Reporter: Yuan Mei >Assignee: Yuan Mei >Priority: Major > Labels: release-testing > Fix For: 1.11.0 > > > Manual Tests and Results to demonstrate KafkaShuffle working as expected. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18079) KafkaShuffle Manual Tests
[ https://issues.apache.org/jira/browse/FLINK-18079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijiang updated FLINK-18079: - Parent: FLINK-18088 Issue Type: Sub-task (was: Improvement) > KafkaShuffle Manual Tests > - > > Key: FLINK-18079 > URL: https://issues.apache.org/jira/browse/FLINK-18079 > Project: Flink > Issue Type: Sub-task > Components: API / DataStream, Runtime / Checkpointing >Affects Versions: 1.11.0 >Reporter: Yuan Mei >Priority: Major > Labels: release-testing > Fix For: 1.11.0 > > > Manual Tests and Results to demonstrate KafkaShuffle working as expected. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-17813) Manually test unaligned checkpoints on a cluster
[ https://issues.apache.org/jira/browse/FLINK-17813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijiang updated FLINK-17813: - Labels: release-testing (was: ) > Manually test unaligned checkpoints on a cluster > > > Key: FLINK-17813 > URL: https://issues.apache.org/jira/browse/FLINK-17813 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Checkpointing, Runtime / Network >Reporter: Piotr Nowojski >Assignee: Roman Khachatryan >Priority: Blocker > Labels: release-testing > Fix For: 1.11.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18089) Add the zero-copy test into the azure E2E pipeline
[ https://issues.apache.org/jira/browse/FLINK-18089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijiang updated FLINK-18089: - Parent: FLINK-18088 Issue Type: Sub-task (was: Test) > Add the zero-copy test into the azure E2E pipeline > -- > > Key: FLINK-18089 > URL: https://issues.apache.org/jira/browse/FLINK-18089 > Project: Flink > Issue Type: Sub-task >Reporter: Yun Gao >Assignee: Yun Gao >Priority: Major > Labels: release-testing > Fix For: 1.11.0 > > > The zero-copy E2E test added in > [Flink-10742|https://issues.apache.org/jira/browse/FLINK-10742] is only added > to the deprecated travis pipeline previously. It should be added into the > Azure test pipeline. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-16998) Add a changeflag to Row type
[ https://issues.apache.org/jira/browse/FLINK-16998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124885#comment-17124885 ] Timo Walther commented on FLINK-16998: -- Updating the `Row.toString` method will be postponed to 1.12. It would require updating roughly 600 tests. We should perform this change after we have dropped the legacy planner. I will create a follow-up issue and mark this as closed for 1.11. > Add a changeflag to Row type > > > Key: FLINK-16998 > URL: https://issues.apache.org/jira/browse/FLINK-16998 > Project: Flink > Issue Type: Sub-task > Components: API / Core >Reporter: Timo Walther >Assignee: Timo Walther >Priority: Major > Labels: pull-request-available > Fix For: 1.11.0 > > > In the Blink planner, the change flag of records travelling through the pipeline > is part of the record itself but not part of the logical schema. This > simplifies the architecture and API in many cases. > This is why we aim to adopt the same mechanism for > {{org.apache.flink.types.Row}}. > Take {{tableEnv.toRetractStream()}} as an example that returns either Scala > or Java {{Tuple2}}. For FLIP-95 we need to support more update > kinds than just a binary boolean. > This means: > - Add a changeflag {{RowKind}} to {{Row}} > - Update the {{Row.toString()}} method > - Update serializers in a backwards-compatible way -- This message was sent by Atlassian Jira (v8.3.4#803005)
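For illustration, here is a stripped-down sketch of a row carrying a change flag that also shows up in its string form (toy classes, not Flink's actual Row/RowKind):

```java
import java.util.Arrays;

public class FlaggedRow {
    // Toy stand-in for a change-flag enum with the usual four update kinds.
    enum Kind {
        INSERT("+I"), UPDATE_BEFORE("-U"), UPDATE_AFTER("+U"), DELETE("-D");
        final String shortString;
        Kind(String shortString) { this.shortString = shortString; }
    }

    private final Kind kind;
    private final Object[] fields;

    FlaggedRow(Kind kind, Object... fields) {
        this.kind = kind;
        this.fields = fields;
    }

    // The change flag is part of the record and leads its string
    // representation, e.g. "+I[Alice, 12]" instead of just the field values.
    @Override
    public String toString() {
        return kind.shortString + Arrays.toString(fields);
    }

    public static void main(String[] args) {
        System.out.println(new FlaggedRow(Kind.INSERT, "Alice", 12)); // +I[Alice, 12]
        System.out.println(new FlaggedRow(Kind.DELETE, "Alice", 12)); // -D[Alice, 12]
    }
}
```

This also shows why changing `toString()` touches so many tests: every expected string in an assertion gains a flag prefix.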
[jira] [Updated] (FLINK-18088) Umbrella for testing features in release-1.11.0
[ https://issues.apache.org/jira/browse/FLINK-18088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijiang updated FLINK-18088: - Labels: release-testing (was: ) > Umbrella for testing features in release-1.11.0 > > > Key: FLINK-18088 > URL: https://issues.apache.org/jira/browse/FLINK-18088 > Project: Flink > Issue Type: Test >Affects Versions: 1.11.0 >Reporter: Zhijiang >Priority: Blocker > Labels: release-testing > Fix For: 1.11.0 > > > This is the umbrella issue for tracing the testing progress of all the > related features in release-1.11.0, either the way of e2e or manually testing > in cluster, to confirm the features work in practice with good quality. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-18089) Add the zero-copy test into the azure E2E pipeline
[ https://issues.apache.org/jira/browse/FLINK-18089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijiang reassigned FLINK-18089: Assignee: Yun Gao > Add the zero-copy test into the azure E2E pipeline > -- > > Key: FLINK-18089 > URL: https://issues.apache.org/jira/browse/FLINK-18089 > Project: Flink > Issue Type: Test >Reporter: Yun Gao >Assignee: Yun Gao >Priority: Major > Fix For: 1.11.0 > > > The zero-copy E2E test added in > [Flink-10742|https://issues.apache.org/jira/browse/FLINK-10742] is only added > to the deprecated travis pipeline previously. It should be added into the > Azure test pipeline. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18089) Add the zero-copy test into the azure E2E pipeline
[ https://issues.apache.org/jira/browse/FLINK-18089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijiang updated FLINK-18089: - Labels: release-testing (was: ) > Add the zero-copy test into the azure E2E pipeline > -- > > Key: FLINK-18089 > URL: https://issues.apache.org/jira/browse/FLINK-18089 > Project: Flink > Issue Type: Test >Reporter: Yun Gao >Assignee: Yun Gao >Priority: Major > Labels: release-testing > Fix For: 1.11.0 > > > The zero-copy E2E test added in > [Flink-10742|https://issues.apache.org/jira/browse/FLINK-10742] is only added > to the deprecated travis pipeline previously. It should be added into the > Azure test pipeline. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18087) Uploading user artifact for Yarn job cluster could not work
[ https://issues.apache.org/jira/browse/FLINK-18087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-18087: --- Labels: pull-request-available (was: ) > Uploading user artifact for Yarn job cluster could not work > --- > > Key: FLINK-18087 > URL: https://issues.apache.org/jira/browse/FLINK-18087 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Affects Versions: 1.11.0, 1.12.0 >Reporter: Yang Wang >Assignee: Yang Wang >Priority: Blocker > Labels: pull-request-available > Fix For: 1.11.0, 1.12.0 > > > In FLINK-17632, we added support for remote user jars. However, uploading user > artifacts for a Yarn job cluster is broken. In the following code, > we should only upload local files, but it currently does the opposite. > {code:java} > // only upload local files > if (Utils.isRemotePath(entry.getValue().filePath)) { >Path localPath = new Path(entry.getValue().filePath); >Tuple2 remoteFileInfo = > fileUploader.uploadLocalFileToRemote(localPath, entry.getKey()); >jobGraph.setUserArtifactRemotePath(entry.getKey(), > remoteFileInfo.f0.toString()); > } > {code} > > Another problem is that the related test {{testPerJobModeWithDistributedCache}} > does not fail, because we do not fetch the artifact from the blob server but > get it directly from the local file. It also needs to be enhanced. -- This message was sent by Atlassian Jira (v8.3.4#803005)
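The inverted guard is easy to see in a self-contained sketch (the isRemotePath check below is a simplified stand-in for Utils.isRemotePath):

```java
public class UploadGuard {
    // Simplified stand-in: treat a path as remote if it carries a scheme
    // other than file://, e.g. hdfs:// or s3://.
    static boolean isRemotePath(String path) {
        return path.contains("://") && !path.startsWith("file://");
    }

    // The bug uploaded when the path WAS remote. The fix negates the check,
    // so only local files get shipped to the remote file system.
    static boolean shouldUpload(String path) {
        return !isRemotePath(path); // only upload local files
    }

    public static void main(String[] args) {
        System.out.println(shouldUpload("file:///tmp/artifact.jar"));  // true  -> upload
        System.out.println(shouldUpload("hdfs:///user/artifact.jar")); // false -> use as-is
    }
}
```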
[jira] [Updated] (FLINK-18089) Add the zero-copy test into the azure E2E pipeline
[ https://issues.apache.org/jira/browse/FLINK-18089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijiang updated FLINK-18089: - Priority: Blocker (was: Major) > Add the zero-copy test into the azure E2E pipeline > -- > > Key: FLINK-18089 > URL: https://issues.apache.org/jira/browse/FLINK-18089 > Project: Flink > Issue Type: Sub-task >Reporter: Yun Gao >Assignee: Yun Gao >Priority: Blocker > Labels: release-testing > Fix For: 1.11.0 > > > The zero-copy E2E test added in > [Flink-10742|https://issues.apache.org/jira/browse/FLINK-10742] is only added > to the deprecated travis pipeline previously. It should be added into the > Azure test pipeline. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-18089) Add the network zero-copy test into the azure E2E pipeline
[ https://issues.apache.org/jira/browse/FLINK-18089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijiang updated FLINK-18089: - Summary: Add the network zero-copy test into the azure E2E pipeline (was: Add the zero-copy test into the azure E2E pipeline) > Add the network zero-copy test into the azure E2E pipeline > -- > > Key: FLINK-18089 > URL: https://issues.apache.org/jira/browse/FLINK-18089 > Project: Flink > Issue Type: Sub-task >Reporter: Yun Gao >Assignee: Yun Gao >Priority: Blocker > Labels: release-testing > Fix For: 1.11.0 > > > The zero-copy E2E test added in > [Flink-10742|https://issues.apache.org/jira/browse/FLINK-10742] is only added > to the deprecated travis pipeline previously. It should be added into the > Azure test pipeline. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] dawidwys commented on a change in pull request #12411: [FLINK-18005][table] Implement type inference for CAST
dawidwys commented on a change in pull request #12411: URL: https://github.com/apache/flink/pull/12411#discussion_r434513273 ## File path: flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/BuiltInFunctionDefinitions.java ## @@ -919,7 +920,8 @@ new BuiltInFunctionDefinition.Builder() .name("cast") .kind(SCALAR) - .outputTypeStrategy(TypeStrategies.MISSING) + .inputTypeStrategy(SPECIFIC_FOR_CAST) + .outputTypeStrategy(nullable(ConstantArgumentCount.to(0), TypeStrategies.argument(1))) Review comment: After an offline discussion we said it's okay as it has very limited scope. The array approach has the downside that it does not support natively open ended ranges. ## File path: flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/inference/strategies/CastInputTypeStrategy.java ## @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.types.inference.strategies; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.table.functions.BuiltInFunctionDefinitions; +import org.apache.flink.table.functions.FunctionDefinition; +import org.apache.flink.table.types.DataType; +import org.apache.flink.table.types.inference.ArgumentCount; +import org.apache.flink.table.types.inference.CallContext; +import org.apache.flink.table.types.inference.ConstantArgumentCount; +import org.apache.flink.table.types.inference.InputTypeStrategy; +import org.apache.flink.table.types.inference.Signature; +import org.apache.flink.table.types.logical.LegacyTypeInformationType; +import org.apache.flink.table.types.logical.LogicalType; + +import java.util.Collections; +import java.util.List; +import java.util.Optional; + +import static org.apache.flink.table.types.logical.utils.LogicalTypeCasts.supportsExplicitCast; + +/** + * {@link InputTypeStrategy} specific for {@link BuiltInFunctionDefinitions#CAST}. + * + * It expects two arguments where the type of first one must be castable to the type of the second + * one. The second one must be a type literal. + */ +@Internal +public final class CastInputTypeStrategy implements InputTypeStrategy { + + @Override + public ArgumentCount getArgumentCount() { + return ConstantArgumentCount.of(2); + } + + @Override + public Optional> inferInputTypes(CallContext callContext, boolean throwOnFailure) { + // check for type literal + if (!callContext.isArgumentLiteral(1) || !callContext.getArgumentValue(1, DataType.class).isPresent()) { + return Optional.empty(); Review comment: nit: throw an exception? ## File path: flink-table/flink-table-common/src/test/java/org/apache/flink/table/types/inference/TypeStrategiesTest.java ## @@ -113,37 +113,60 @@ .inputTypes() .expectErrorMessage("Could not infer an output type for the given arguments. 
Untyped NULL received."), - TestSpec.forStrategy( - "Infer a row type", - TypeStrategies.ROW) + TestSpec + .forStrategy( + "Infer a row type", + TypeStrategies.ROW) .inputTypes(DataTypes.BIGINT(), DataTypes.STRING()) .expectDataType(DataTypes.ROW( DataTypes.FIELD("f0", DataTypes.BIGINT()), DataTypes.FIELD("f1", DataTypes.STRING())).notNull() ), - TestSpec.forStrategy( - "Infer an array type", - TypeStrategies.ARRAY) + TestSpec + .forStrategy( + "Infer an array type", +
[GitHub] [flink] wuchong commented on a change in pull request #12445: [FLINK-18006] Always overwrite RestClientFactory in ElasticsearchXDynamicSink
wuchong commented on a change in pull request #12445: URL: https://github.com/apache/flink/pull/12445#discussion_r434364854 ## File path: flink-connectors/flink-connector-elasticsearch6/src/main/java/org/apache/flink/streaming/connectors/elasticsearch/table/Elasticsearch6DynamicSink.java ## @@ -136,8 +136,7 @@ public SinkFunctionProvider getSinkRuntimeProvider(Context context) { config.getBulkFlushBackoffRetries().ifPresent(builder::setBulkFlushBackoffRetries); config.getBulkFlushBackoffDelay().ifPresent(builder::setBulkFlushBackoffDelay); - config.getPathPrefix() - .ifPresent(pathPrefix -> builder.setRestClientFactory(new DefaultRestClientFactory(pathPrefix))); + builder.setRestClientFactory(new DefaultRestClientFactory(config.getPathPrefix().orElse(null))); Review comment: Could you add a comment on this? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
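The behavioral difference the diff introduces — an unconditional overwrite of the factory instead of a conditional one — can be sketched with plain Optionals (names are illustrative, not the Elasticsearch connector's actual types):

```java
import java.util.Optional;

public class AlwaysOverwrite {
    // Models which rest-client factory ends up on the builder.
    static String factoryFor(Optional<String> pathPrefix, boolean alwaysOverwrite) {
        String current = "preconfigured"; // whatever the builder started with
        if (alwaysOverwrite) {
            // new behavior: always install our factory; the prefix may be null
            current = "default(prefix=" + pathPrefix.orElse(null) + ")";
        } else if (pathPrefix.isPresent()) {
            // old behavior: only overwrite when a prefix was configured,
            // silently keeping a stale preconfigured factory otherwise
            current = "default(prefix=" + pathPrefix.get() + ")";
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(factoryFor(Optional.empty(), false)); // preconfigured
        System.out.println(factoryFor(Optional.empty(), true));  // default(prefix=null)
    }
}
```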
[jira] [Comment Edited] (FLINK-18068) Job scheduling stops but not exits after throwing non-fatal exception
[ https://issues.apache.org/jira/browse/FLINK-18068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124685#comment-17124685 ] Till Rohrmann edited comment on FLINK-18068 at 6/3/20, 7:27 AM: I think the problem is rather in {{AkkaRpcActor}} at various places. There we catch {{Throwables}} and only propagate them in case of fatal errors or OOMs. I think it would be better to handle them consistently and let framework unchecked exceptions bubble up. One thing which makes this a bit harder is that user code {{Throwables}} should not cause the system to fail. I think the proper solution would be that by default we propagate all thrown exceptions. At places where user code exceptions can occur and are tolerable, the higher-level code should catch and handle them properly. The problem I see with this approach is that we use exceptions in some places for normal control flow (e.g. when requesting a job which does not exist, we throw a {{FlinkJobNotFoundException}}). I think this is a mistake and we should instead return a more meaningful value (e.g. {{Optional}}). But this is a bigger effort to change. was (Author: till.rohrmann): I think the problem is rather in {{AkkaRpcActor}} at various places. There we catch {{Throwables}} and only propagate them in case of fatal errors or OOMs. I think it would be better to handle them consistently and let framework unchecked exceptions bubble up. One thing which makes this a bit harder is that user code {{Throwables}} should not cause the system to fail. I think the proper solution would be that by default we propagate all thrown exceptions. At places where user code exceptions can occur and are tolerable, the higher-level code should catch and handle them properly. 
> Job scheduling stops but not exits after throwing non-fatal exception > - > > Key: FLINK-18068 > URL: https://issues.apache.org/jira/browse/FLINK-18068 > Project: Flink > Issue Type: Improvement > Components: Deployment / YARN >Affects Versions: 1.10.1 >Reporter: Jiayi Liao >Priority: Major > > The batch job will stop but still be alive with doing nothing for a long time > (maybe forever?) if any non fatal exception is thrown from interacting with > YARN. Here is the example : > {code:java} > java.lang.IllegalStateException: The RMClient's and YarnResourceManagers > internal state about the number of pending container requests for resource > has diverged. Number client's pending container > requests 40 != Number RM's pending container requests 0. > at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:217) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.yarn.YarnResourceManager.getPendingRequestsAndCheckConsistency(YarnResourceManager.java:518) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.yarn.YarnResourceManager.onContainersOfResourceAllocated(YarnResourceManager.java:431) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.yarn.YarnResourceManager.lambda$onContainersAllocated$1(YarnResourceManager.java:395) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:402) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:195) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152) > ~[flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at 
akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) > [flink-ljy-1.0.jar:?] > at > akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) > [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] > at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) > [flink-ljy-1.0.jar:?] > at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) > [flink-ljy-1.0.jar:?] > at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) > [flink-ljy-1.0.jar:?] > at
[GitHub] [flink] flinkbot edited a comment on pull request #12455: [FLINK-17935] Move setting yarn.log-config-file to YarnClusterClientFactory
flinkbot edited a comment on pull request #12455: URL: https://github.com/apache/flink/pull/12455#issuecomment-638007072 ## CI report: * bc3ca72ad94196d72570aa7a58e9aa5c411e2733 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2609) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zjffdu commented on a change in pull request #12346: [FLINK-17944][sql-client] Wrong output in sql client's table mode
zjffdu commented on a change in pull request #12346: URL: https://github.com/apache/flink/pull/12346#discussion_r434371197 ## File path: flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/gateway/local/result/MaterializedCollectStreamResult.java ## @@ -224,7 +224,7 @@ private void processDelete(Row row) { } for (int i = startSearchPos; i >= validRowPosition; i--) { - if (materializedTable.get(i).equals(row)) { + if (materializedTable.get(i).fieldsEquals(row)) { materializedTable.remove(i); rowPositionCache.remove(row); Review comment: @godfreyhe Is #12199 going to be merged in 1.11? For me, this is an API change, so it would be better to stabilize it first. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
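The diff above switches delete-matching from {{equals}} to {{fieldsEquals}}. The idea behind that distinction can be sketched with a toy row type; note that {{SimpleRow}} below is invented for illustration and is not Flink's {{org.apache.flink.types.Row}}: a retraction row can carry a different change kind than the insert it retracts, so delete-matching should compare only the field values.

```java
import java.util.Arrays;

// Hypothetical sketch: SimpleRow is an invented stand-in, not Flink's Row.
// It shows why a delete path compares fields only, ignoring the change kind.
public class SimpleRow {
    final String kind;     // e.g. "+I" (insert) or "-D" (delete)
    final Object[] fields;

    SimpleRow(String kind, Object... fields) {
        this.kind = kind;
        this.fields = fields;
    }

    // equals-style comparison that also includes the change kind
    public boolean fullEquals(SimpleRow other) {
        return kind.equals(other.kind) && Arrays.equals(fields, other.fields);
    }

    // field-only comparison: what matching a delete against the
    // materialized table needs
    public boolean fieldsEquals(SimpleRow other) {
        return Arrays.equals(fields, other.fields);
    }

    public static void main(String[] args) {
        SimpleRow insert = new SimpleRow("+I", 1, "a");
        SimpleRow delete = new SimpleRow("-D", 1, "a");
        System.out.println(insert.fullEquals(delete));   // kinds differ
        System.out.println(insert.fieldsEquals(delete)); // fields match
    }
}
```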
[GitHub] [flink] flinkbot edited a comment on pull request #12199: [FLINK-17774] [table] supports all kinds of changes for select result
flinkbot edited a comment on pull request #12199: URL: https://github.com/apache/flink/pull/12199#issuecomment-629793563 ## CI report: * 5b62118449cdf8d0de8d5b98781fdff9c2d0c571 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2057) * f07d136e61bd7024ccc58c9221f14d37ae7fb4b5 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2608) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-15849) Update SQL-CLIENT document from type to data-type
[ https://issues.apache.org/jira/browse/FLINK-15849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Young updated FLINK-15849: --- Priority: Blocker (was: Major) > Update SQL-CLIENT document from type to data-type > - > > Key: FLINK-15849 > URL: https://issues.apache.org/jira/browse/FLINK-15849 > Project: Flink > Issue Type: Task > Components: Documentation, Table SQL / API >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Blocker > Fix For: 1.11.0, 1.10.2 > > > There is documentation using {{type}} instead of {{data-type}} in sql-client. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-16375) Remove references to registerTableSource/Sink methods from documentation
[ https://issues.apache.org/jira/browse/FLINK-16375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Young updated FLINK-16375: --- Priority: Blocker (was: Major) > Remove references to registerTableSource/Sink methods from documentation > > > Key: FLINK-16375 > URL: https://issues.apache.org/jira/browse/FLINK-16375 > Project: Flink > Issue Type: Sub-task > Components: Documentation, Table SQL / API >Reporter: Dawid Wysakowicz >Priority: Blocker > Fix For: 1.11.0 > > > We should remove mentions of the registerTableSource/Sink methods from > documentation and replace them with the suggested approach. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-18070) Time attribute been materialized after sub graph optimize
[ https://issues.apache.org/jira/browse/FLINK-18070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124735#comment-17124735 ] Jark Wu commented on FLINK-18070: - Yes, I think this is a bug in CommonSubGraphBasedOptimizer. > Time attribute been materialized after sub graph optimize > - > > Key: FLINK-18070 > URL: https://issues.apache.org/jira/browse/FLINK-18070 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.10.0 >Reporter: YufeiLiu >Priority: Major > Fix For: 1.11.0 > > > Hi, I want to use a window aggregate after creating a temporary view, and the job has > multiple sinks. But it throws an exception: > {code:java} > java.lang.AssertionError: type mismatch: > ref: > TIME ATTRIBUTE(PROCTIME) NOT NULL > input: > TIMESTAMP(3) NOT NULL > {code} > Looking into the optimizer logic, there is a comment in > {{CommonSubGraphBasedOptimizer}}: > "1. In general, for multi-sinks users tend to use VIEW which is a natural > common sub-graph." > After sub-graph optimization, the time attribute from the source has been converted to the > basic TIMESTAMP type by {{FlinkRelTimeIndicatorProgram}}. But my > CREATE VIEW SQL is a simple query, so in theory it should not need to materialize the time > attribute. 
> Here is my code: > {code:java} > // connector.type COLLECTION is for debug use > tableEnv.sqlUpdate("CREATE TABLE source (\n" + > "`ts` AS PROCTIME(),\n" + > "`order_type` INT\n" + > ") WITH (\n" + > "'connector.type' = 'COLLECTION',\n" + > "'format.type' = 'json'\n" + > ")\n"); > tableEnv.createTemporaryView("source_view", tableEnv.sqlQuery("SELECT * FROM > source")); > tableEnv.sqlUpdate("CREATE TABLE sink (\n" + > "`result` BIGINT\n" + > ") WITH (\n" + > "'connector.type' = 'COLLECTION',\n" + > "'format.type' = 'json'\n" + > ")\n"); > tableEnv.sqlUpdate("INSERT INTO sink \n" + > "SELECT\n" + > "COUNT(1)\n" + > "FROM\n" + > "`source_view`\n" + > "WHERE\n" + > " `order_type` = 33\n" + > "GROUP BY\n" + > "TUMBLE(`ts`, INTERVAL '5' SECOND)\n"); > tableEnv.sqlUpdate("INSERT INTO sink \n" + > "SELECT\n" + > "COUNT(1)\n" + > "FROM\n" + > "`source_view`\n" + > "WHERE\n" + > " `order_type` = 34\n" + > "GROUP BY\n" + > "TUMBLE(`ts`, INTERVAL '5' SECOND)\n"); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-17908) Vague document about Kafka config in SQL-CLI
[ https://issues.apache.org/jira/browse/FLINK-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Young closed FLINK-17908. -- Resolution: Duplicate will be taken care of by FLINK-17831 > Vague document about Kafka config in SQL-CLI > > > Key: FLINK-17908 > URL: https://issues.apache.org/jira/browse/FLINK-17908 > Project: Flink > Issue Type: Improvement > Components: Documentation, Table SQL / API >Affects Versions: 1.11.0 >Reporter: Shengkai Fang >Priority: Critical > Fix For: 1.11.0 > > > Currently Flink doesn't offer any default config value for Kafka and uses the > default config from Kafka. However, it uses a different config value when > describing how to use the Kafka Connector in sql-client. The connector document > uses the value 'earliest-offset' for 'connector.startup-mode', which is different > from Kafka's default behaviour. I think this vague document may mislead > users, especially newbies. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-18074) Confirm checkpoint completed on task side would not fail the task if exception thrown out
[ https://issues.apache.org/jira/browse/FLINK-18074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124759#comment-17124759 ] Piotr Nowojski edited comment on FLINK-18074 at 6/3/20, 8:37 AM: - Yes, I think you are right. I've missed that {{notifyCheckpointCompleteAsync}} is also using {{submit}} (clearly we are missing test coverage for that). I've assigned the ticket to you. was (Author: pnowojski): Yes, I think you are right. I've missed that {{notifyCheckpointCompleteAsync}} is also using {{submit}} (clearly we are missing test coverage for that). > Confirm checkpoint completed on task side would not fail the task if > exception thrown out > - > > Key: FLINK-18074 > URL: https://issues.apache.org/jira/browse/FLINK-18074 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing, Runtime / Task >Affects Versions: 1.11.0 >Reporter: Yun Tang >Assignee: Yun Tang >Priority: Blocker > Fix For: 1.11.0 > > > FLINK-17350 made the task fail immediately once the sync phase of a checkpoint > failed. However, the included commit ['Simplify checkpoint exception > handling'|https://github.com/apache/flink/pull/12101/commits/a2cd3daceca16ae841119d94a24328b4af37dcd8] > actually would not fail the task if the runnable of {{() -> > notifyCheckpointComplete}} throws an exception. > In a nutshell, this changes the previous checkpoint exception handling. > Moreover, that part of the code also affects the implementation of > {{notifyCheckpointAbortAsync}} when I introduce {{notifyCheckpointAborted}} > on the task side. -- This message was sent by Atlassian Jira (v8.3.4#803005)
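The {{submit}} detail mentioned above is the crux of the bug. A minimal illustration using plain {{java.util.concurrent}} (not Flink's mailbox executor, whose internals are not shown here): {{ExecutorService.submit()}} stores a Runnable's exception inside the returned {{Future}} instead of letting it propagate, so unless someone calls {{get()}}, the failure is silently swallowed, which is the mechanism by which a throwing checkpoint notification would not fail the task.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of why submit() swallows a Runnable's exception: the exception is
// captured in the Future and only resurfaces when get() is called.
public class SubmitSwallows {
    public static boolean exceptionSurfacesOnlyViaGet() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Explicit cast picks the submit(Runnable) overload.
            Future<?> future = pool.submit((Runnable) () -> {
                throw new IllegalStateException("checkpoint confirmation failed");
            });
            try {
                future.get(); // the stored exception surfaces only here
                return false;
            } catch (ExecutionException e) {
                return e.getCause() instanceof IllegalStateException;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(exceptionSurfacesOnlyViaGet());
    }
}
```

If nothing ever inspects the returned future, the exception never reaches an error handler, matching the "would not fail the task" behaviour described in the ticket.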
[jira] [Updated] (FLINK-17101) [Umbrella] Supports dynamic table options for Flink SQL
[ https://issues.apache.org/jira/browse/FLINK-17101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Young updated FLINK-17101: --- Summary: [Umbrella] Supports dynamic table options for Flink SQL (was: Supports dynamic table options for Flink SQL) > [Umbrella] Supports dynamic table options for Flink SQL > --- > > Key: FLINK-17101 > URL: https://issues.apache.org/jira/browse/FLINK-17101 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.11.0 >Reporter: Danny Chen >Assignee: Danny Chen >Priority: Major > Fix For: 1.11.0 > > > Supports syntax: > {code:sql} > ... table /*+ OPTIONS('k1' = 'v1', 'k2' = 'v2') */ > {code} > to specify dynamic options within the scope of the appended table. The > dynamic options would override the static options defined in the CREATE TABLE > DDL or connector API. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-18074) Confirm checkpoint completed on task side would not fail the task if exception thrown out
[ https://issues.apache.org/jira/browse/FLINK-18074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reassigned FLINK-18074: -- Assignee: Yun Tang > Confirm checkpoint completed on task side would not fail the task if > exception thrown out > - > > Key: FLINK-18074 > URL: https://issues.apache.org/jira/browse/FLINK-18074 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing, Runtime / Task >Affects Versions: 1.11.0 >Reporter: Yun Tang >Assignee: Yun Tang >Priority: Blocker > Fix For: 1.11.0 > > > FLINK-17350 made the task fail immediately once the sync phase of a checkpoint > failed. However, the included commit ['Simplify checkpoint exception > handling'|https://github.com/apache/flink/pull/12101/commits/a2cd3daceca16ae841119d94a24328b4af37dcd8] > actually would not fail the task if the runnable of {{() -> > notifyCheckpointComplete}} throws an exception. > In a nutshell, this changes the previous checkpoint exception handling. > Moreover, that part of the code also affects the implementation of > {{notifyCheckpointAbortAsync}} when I introduce {{notifyCheckpointAborted}} > on the task side. -- This message was sent by Atlassian Jira (v8.3.4#803005)