[GitHub] [flink] wuchong commented on a change in pull request #14029: [FLINK-20084][planner] Fix NPE when generating watermark for record w…
wuchong commented on a change in pull request #14029: URL: https://github.com/apache/flink/pull/14029#discussion_r521176325 ## File path: flink-table/flink-table-planner-blink/src/main/java/org/apache/flink/table/planner/plan/rules/logical/PushWatermarkIntoTableSourceScanRuleBase.java ## @@ -180,7 +180,10 @@ public DefaultWatermarkGenerator( @Override public void onEvent(RowData event, long eventTimestamp, WatermarkOutput output) { try { - currentWatermark = innerWatermarkGenerator.currentWatermark(event); + Long watermark = innerWatermarkGenerator.currentWatermark(event); + if (watermark != null) { + currentWatermark = innerWatermarkGenerator.currentWatermark(event); Review comment: Do not calculate it again. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
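Restated outside Flink (the class and interface names below are stand-ins, not the actual planner classes), the guard the reviewer asks for evaluates the generated watermark expression once into a local variable and only advances the current watermark when the result is non-null:

```java
// Minimal sketch of the null-guarded watermark update suggested in the review.
// "InnerGenerator" stands in for the code-generated watermark expression,
// which may return null when the rowtime field is null.
class NullGuardedWatermark {

    /** Stand-in for the generated watermark expression; may return null. */
    interface InnerGenerator {
        Long currentWatermark(Object event);
    }

    private Long currentWatermark = Long.MIN_VALUE;
    private final InnerGenerator inner;

    NullGuardedWatermark(InnerGenerator inner) {
        this.inner = inner;
    }

    void onEvent(Object event) {
        // Evaluate once and reuse the local, per the review comment
        // "Do not calculate it again".
        Long watermark = inner.currentWatermark(event);
        if (watermark != null) {
            currentWatermark = watermark;
        }
        // A null result (rowtime field was null) leaves the watermark unchanged
        // instead of throwing an NPE on unboxing.
    }

    Long current() {
        return currentWatermark;
    }

    public static void main(String[] args) {
        NullGuardedWatermark g =
                new NullGuardedWatermark(e -> e == null ? null : (Long) e);
        g.onEvent(100L);
        g.onEvent(null); // null rowtime: must neither NPE nor regress the watermark
        System.out.println(g.current()); // prints 100
    }
}
```

The original diff still assigned `innerWatermarkGenerator.currentWatermark(event)` a second time inside the guard; the point of the comment is to assign the already-computed local instead.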
[jira] [Commented] (FLINK-19630) Sink data in [ORC] format into Hive By using Legacy Table API caused unexpected Exception
[ https://issues.apache.org/jira/browse/FLINK-19630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229818#comment-17229818 ] Rui Li commented on FLINK-19630: Yeah I think documentation is all we can do at the moment. And BTW, instead of local patch, I think simply using the {{flink-sql-connector-hive-2.2.0}} uber jar as your dependency (download link can be found [here|https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/hive/#dependencies]) may also solve the problem. > Sink data in [ORC] format into Hive By using Legacy Table API caused > unexpected Exception > -- > > Key: FLINK-19630 > URL: https://issues.apache.org/jira/browse/FLINK-19630 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive, Table SQL / Ecosystem >Affects Versions: 1.11.2 >Reporter: Lsw_aka_laplace >Priority: Critical > Fix For: 1.12.0 > > Attachments: image-2020-10-14-11-36-48-086.png, > image-2020-10-14-11-41-53-379.png, image-2020-10-14-11-42-57-353.png, > image-2020-10-14-11-48-51-310.png > > > *ENV:* > *Flink version 1.11.2* > *Hive exec version: 2.0.1* > *Hive file storing type :ORC* > *SQL or Datastream: SQL API* > *Kafka Connector : custom Kafka connector which is based on Legacy API > (TableSource/`org.apache.flink.types.Row`)* > *Hive Connector : totally follows the Flink-Hive-connector (we only made some > encapsulation upon it)* > *Using StreamingFileCommitter:YES* > > > *Description:* > try to execute the following SQL: > """ > insert into hive_table (select * from kafka_table) > """ > HIVE Table SQL seems like: > """ > CREATE TABLE `hive_table`( > // some fields > PARTITIONED BY ( > `dt` string, > `hour` string) > STORED AS orc > TBLPROPERTIES ( > 'orc.compress'='SNAPPY', > 'type'='HIVE', > 'sink.partition-commit.trigger'='process-time', > 'sink.partition-commit.delay' = '1 h', > 'sink.partition-commit.policy.kind' = 'metastore,success-file', > ) > """ > When this job starts to process snapshot, here comes the weird 
exception:
> !image-2020-10-14-11-36-48-086.png|width=882,height=395!
> As we can see from the message: the owner thread should be the [Legacy Source Thread], but actually the StreamTask thread, which represents the whole first stage, is found.
> So I checked the thread dump at once.
> !image-2020-10-14-11-41-53-379.png|width=801,height=244!
> The legacy Source Thread
> !image-2020-10-14-11-42-57-353.png|width=846,height=226!
> The StreamTask Thread
> According to the thread dump info and the exception message, I searched and read certain source code and then *{color:#ffab00}DID A TEST{color}*
> {color:#172b4d}Since the Kafka connector is custom, I tried to make the KafkaSource a separate stage by changing the TaskChainStrategy to Never. The task topology is as follows:{color}
> {color:#172b4d}!image-2020-10-14-11-48-51-310.png|width=753,height=208!{color}
> {color:#505f79}*Fortunately, it did work! No exception is thrown and the checkpoint could be taken successfully!*{color}
> So, from my perspective, there must be something wrong when the HiveWritingTask and the LegacySourceTask are chained together. The legacy source task is a separate thread, which may be the cause of the exception mentioned above.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
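The reporter's experiment (breaking the chain between the legacy source and the Hive writer) can also be reproduced without modifying the connector. This is only a sketch of that workaround, not a verified fix, and the exact option availability depends on your Flink version:

```yaml
# flink-conf.yaml — sketch of the reporter's experiment: run the legacy
# source in its own task instead of chaining it with the Hive writer.
# Note: disabling chaining globally adds serialization overhead between
# operators, so treat this as a diagnostic workaround, not a tuning choice.
pipeline.operator-chaining: false
```

The same effect can be achieved programmatically with `env.disableOperatorChaining()` on the `StreamExecutionEnvironment`.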
[GitHub] [flink] flinkbot edited a comment on pull request #14029: [FLINK-20084][planner] Fix NPE when generating watermark for record w…
flinkbot edited a comment on pull request #14029: URL: https://github.com/apache/flink/pull/14029#issuecomment-725258730 ## CI report: * a128055b2b0d97ce6673363c7885c97b17ba6e07 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9447) * 820895764bfe9e83648c470a53f19fe232ce6518 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[jira] [Commented] (FLINK-15906) physical memory exceeded causing being killed by yarn
[ https://issues.apache.org/jira/browse/FLINK-15906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229817#comment-17229817 ] yang gang commented on FLINK-15906: --- {code:java} Closing TaskExecutor connection container_1597847003686_0079_01_000121. Because: Container [pid=4269,containerID=container_1597847003686_0079_01_000121] is running beyond physical memory limits. Current usage: 20.0 GB of 20 GB physical memory used; 24.9 GB of 100 GB virtual memory used. Killing container. Dump of the process-tree for container_1597847003686_0079_01_000121 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 4298 4269 4269 4269 (java) 104835705 33430931 26634625024 5242644 /usr/local/jdk1.8/bin/java -Xmx10871635848 -Xms10871635848 -XX:MaxDirectMemorySize=1207959552 -XX:MaxMetaspaceSize=268435456 -server -XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 -XX:ParallelGCThreads=4 -XX:+AlwaysPreTouch -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=3 -DjobName=ck_local_growthline_new-10 -Dlog.file=/data2/yarn/containers/application_1597847003686_0079/container_1597847003686_0079_01_000121/taskmanager.log -Dlog4j.configuration=file:./log4j.properties org.apache.flink.yarn.YarnTaskExecutorRunner -D taskmanager.memory.framework.off-heap.size=134217728b -D taskmanager.memory.network.max=1073741824b -D taskmanager.memory.network.min=1073741824b -D taskmanager.memory.framework.heap.size=134217728b -D taskmanager.memory.managed.size=8053063800b -D taskmanager.cpu.cores=10.0 -D taskmanager.memory.task.heap.size=10737418120b -D taskmanager.memory.task.off-heap.size=0b --configDir . 
-Djobmanager.rpc.address={address} -Dweb.port=0 -Dweb.tmpdir=/tmp/flink-web-0874be2a-720d-443c-a069-0bb1fad69433 -Djobmanager.rpc.port=36047 -Drest.address={address} |- 4269 4267 4269 4269 (bash) 0 0 115904512 359 /bin/bash -c /usr/local/jdk1.8/bin/java -Xmx10871635848 -Xms10871635848 -XX:MaxDirectMemorySize=1207959552 -XX:MaxMetaspaceSize=268435456 -server -XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 -XX:ParallelGCThreads=4 -XX:+AlwaysPreTouch -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=3 -DjobName=ck_local_growthline_new-10 -Dlog.file=/data2/yarn/containers/application_1597847003686_0079/container_1597847003686_0079_01_000121/taskmanager.log -Dlog4j.configuration=file:./log4j.properties org.apache.flink.yarn.YarnTaskExecutorRunner -D taskmanager.memory.framework.off-heap.size=134217728b -D taskmanager.memory.network.max=1073741824b -D taskmanager.memory.network.min=1073741824b -D taskmanager.memory.framework.heap.size=134217728b -D taskmanager.memory.managed.size=8053063800b -D taskmanager.cpu.cores=10.0 -D taskmanager.memory.task.heap.size=10737418120b -D taskmanager.memory.task.off-heap.size=0b --configDir . -Djobmanager.rpc.address={address} -Dweb.port='0' -Dweb.tmpdir='/tmp/flink-web-0874be2a-720d-443c-a069-0bb1fad69433' -Djobmanager.rpc.port='36047' -Drest.address={address} 1> /data2/yarn/containers/application_1597847003686_0079/container_1597847003686_0079_01_000121/taskmanager.out 2> /data2/yarn/containers/application_1597847003686_0079/container_1597847003686_0079_01_000121/taskmanager.err {code} [~xintongsong] I have also encountered this kind of problem. This is a task of calculating DAU indicators. But this exception does not happen frequently. I have observed the memory metrics and logs of this task, but have not found useful information, so I would like to ask you how to solve this problem? 
> physical memory exceeded causing being killed by yarn
> -
>
> Key: FLINK-15906
> URL: https://issues.apache.org/jira/browse/FLINK-15906
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Reporter: liupengcheng
> Priority: Major
>
> Recently, we encountered this issue when testing TPCDS queries with 100g of data. I first met this issue when I only set `taskmanager.memory.total-process.size` to `4g` with the `-tm` option. Then I tried to increase the jvmOverhead size with the following arguments, but it still failed.
> {code:java}
> taskmanager.memory.jvm-overhead.min: 640m
> taskmanager.memory.jvm-metaspace: 128m
> taskmanager.memory.task.heap.size: 1408m
> taskmanager.memory.framework.heap.size: 128m
> taskmanager.memory.framework.off-heap.size: 128m
> taskmanager.memory.managed.size: 1408m
> taskmanager.memory.shuffle.max: 256m
> {code}
> {code:java}
> java.lang.Exception: [2020-02-05 11:31:32.345]Container [pid=101677,containerID=container_e08_1578903621081_4785_01_51] is running 46342144B beyond the 'PHYSICAL' memory limit. Current usage: 4.04 GB of 4 GB
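For container kills like the ones above, the usual knob is the JVM overhead budget, which covers native allocations (RocksDB, glibc arenas, Netty) that the heap settings don't track. A hedged sketch of a flink-conf.yaml that trades some heap for overhead headroom — the sizes below are illustrative, not a recommendation for this job:

```yaml
# Illustrative values only: enlarge the untracked-native-memory budget so the
# JVM process stays under the YARN physical memory limit.
taskmanager.memory.process.size: 20g     # must match the requested container size
taskmanager.memory.jvm-overhead.min: 1g
taskmanager.memory.jvm-overhead.max: 2g
taskmanager.memory.jvm-overhead.fraction: 0.15
```

Because `process.size` is fixed, raising the overhead shrinks the heap and managed memory proportionally; the container limit itself is unchanged.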
[GitHub] [flink] Shawn-Hx commented on pull request #13664: [FLINK-19673] Translate "Standalone Cluster" page into Chinese
Shawn-Hx commented on pull request #13664: URL: https://github.com/apache/flink/pull/13664#issuecomment-725266494 @klion26 Thanks for your reminder. I have resolved the conflict. PTAL.
[GitHub] [flink] wuchong commented on pull request #14027: [FLINK-20074][table-planner-blink] Fix can't generate plan when joining on changelog source without updates
wuchong commented on pull request #14027: URL: https://github.com/apache/flink/pull/14027#issuecomment-725265512 The changes on `JoinTest.xml` are a fix of the UpdateKindTrait inference for the Join operator. The streaming join operator just forwards all input records and can't drop UPDATE_BEFORE messages; therefore, the plan has been wrong for a long time.
[jira] [Commented] (FLINK-19303) Disable WAL in RocksDB recovery
[ https://issues.apache.org/jira/browse/FLINK-19303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229815#comment-17229815 ] Juha Mynttinen commented on FLINK-19303: It's possible that a later RocksDB version (>5.17.2) has fixes related to this issue. Thus it makes sense to wait for the RocksDB bump before working on this issue. In FLINK-14482 there's an effort to bump RocksDB to 6.x.y.
> Disable WAL in RocksDB recovery
> ---
>
> Key: FLINK-19303
> URL: https://issues.apache.org/jira/browse/FLINK-19303
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / State Backends
> Reporter: Juha Mynttinen
> Assignee: Juha Mynttinen
> Priority: Major
>
> During recovery of {{RocksDBStateBackend}}, the recovery mechanism puts the key-value pairs into local RocksDB instance(s). To speed up the process, recovery uses the RocksDB write batch mechanism. The [RocksDB WAL|https://github.com/facebook/rocksdb/wiki/Write-Ahead-Log] is enabled during this process.
> During normal operations, i.e. when the state backend has been recovered and the Flink application is running (on the RocksDB state backend), the WAL is disabled.
> The recovery process doesn't need the WAL. In fact, recovery should be much faster without it. Thus, the WAL should be disabled in the recovery process.
> AFAIK the last thing that was done with the WAL during recovery was an attempt to remove it. Later that removal was reverted because it caused stability issues (https://issues.apache.org/jira/browse/FLINK-8922).
> Unfortunately, the root cause of why disabling the WAL causes a segfault during recovery is unknown. After all, the WAL is not used during normal operations. A potential explanation is some kind of bug in the RocksDB write batch when using the WAL. It is possible that later RocksDB versions have fixes / workarounds for the issue.
[jira] [Commented] (FLINK-14482) Bump up rocksdb version
[ https://issues.apache.org/jira/browse/FLINK-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229814#comment-17229814 ] Juha Mynttinen commented on FLINK-14482: Hey, I'm trying to understand this JIRA. The RocksDB state backend that uses managed memory does use the RocksDB write buffer manager (since Flink 1.10). The description of this JIRA is "Current rocksDB-5.17.2 does not support write buffer manager well". What's the actual issue in 5.17.2 with the write buffer manager? I can't find a more detailed explanation.
> Bump up rocksdb version
> ---
>
> Key: FLINK-14482
> URL: https://issues.apache.org/jira/browse/FLINK-14482
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / State Backends
> Reporter: Yun Tang
> Assignee: Yun Tang
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.13.0
>
> Current rocksDB-5.17.2 does not support the write buffer manager well; we need to bump the rocksdb version to support that feature.
[jira] [Assigned] (FLINK-20084) Fix NPE when generating watermark for record whose rowtime field is null after watermark push down
[ https://issues.apache.org/jira/browse/FLINK-20084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-20084: --- Assignee: Shengkai Fang
> Fix NPE when generating watermark for record whose rowtime field is null after watermark push down
> --
>
> Key: FLINK-20084
> URL: https://issues.apache.org/jira/browse/FLINK-20084
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.12.0
> Reporter: Shengkai Fang
> Assignee: Shengkai Fang
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.12.0
>
> The problem is in the class {{PushWatermarkIntoTableSourceScanRuleBase$DefaultWatermarkGeneratorSupplier$DefaultWatermarkGenerator#onEvent}}. It doesn't test whether the calculated watermark is null before setting it.
[jira] [Updated] (FLINK-20085) Remove RemoteFunctionStateMigrator code paths from StateFun
[ https://issues.apache.org/jira/browse/FLINK-20085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai updated FLINK-20085: Issue Type: Task (was: Bug)
> Remove RemoteFunctionStateMigrator code paths from StateFun
>
> Key: FLINK-20085
> URL: https://issues.apache.org/jira/browse/FLINK-20085
> Project: Flink
> Issue Type: Task
> Components: Stateful Functions
> Reporter: Tzu-Li (Gordon) Tai
> Assignee: Tzu-Li (Gordon) Tai
> Priority: Blocker
> Fix For: statefun-2.3.0
>
> The {{RemoteFunctionStateMigrator}} was added to give savepoints with versions <= 2.1.0 a migration path for upgrading to versions >= 2.2.0. The binary format of remote function state was changed due to demultiplexed remote state, introduced in 2.2.0.
> With 2.2.0 already released with the new formats, it is now safe to fully remove this migration path.
> For users, what this means is that it would not be possible to directly upgrade from 2.0.x / 2.1.x to 2.3.x+. They'd have to perform incremental upgrades via 2.2.x, by restoring first with 2.2.x and then taking another savepoint, before upgrading to 2.3.x.
[jira] [Created] (FLINK-20085) Remove RemoteFunctionStateMigrator code paths from StateFun
Tzu-Li (Gordon) Tai created FLINK-20085: --- Summary: Remove RemoteFunctionStateMigrator code paths from StateFun Key: FLINK-20085 URL: https://issues.apache.org/jira/browse/FLINK-20085 Project: Flink Issue Type: Bug Components: Stateful Functions Reporter: Tzu-Li (Gordon) Tai Assignee: Tzu-Li (Gordon) Tai Fix For: statefun-2.3.0 The {{RemoteFunctionStateMigrator}} was added to give savepoints with versions <= 2.1.0 a migration path for upgrading to versions >= 2.2.0. The binary format of remote function state was changed due to demultiplexed remote state, introduced in 2.2.0. With 2.2.0 already released with the new formats, it is now safe to fully remove this migration path. For users, what this means is that it would not be possible to directly upgrade from 2.0.x / 2.1.x to 2.3.x+. They'd have to perform incremental upgrades via 2.2.x, by restoring first with 2.2.x and then taking another savepoint, before upgrading to 2.3.x.
[jira] [Commented] (FLINK-19979) Sanity check after bash e2e tests for no leftover processes
[ https://issues.apache.org/jira/browse/FLINK-19979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229811#comment-17229811 ] Robert Metzger commented on FLINK-19979: This is making good progress. I can probably open a PR in the next 24 hours.
> Sanity check after bash e2e tests for no leftover processes
> ---
>
> Key: FLINK-19979
> URL: https://issues.apache.org/jira/browse/FLINK-19979
> Project: Flink
> Issue Type: New Feature
> Components: Test Infrastructure
> Affects Versions: 1.12.0
> Reporter: Robert Metzger
> Assignee: Robert Metzger
> Priority: Critical
> Fix For: 1.12.0
>
> As seen in FLINK-19974, if an e2e test does not clean up properly, other e2e tests might fail with difficult-to-diagnose issues.
> I propose to check that no leftover processes (including docker containers) are running after each bash e2e test.
[GitHub] [flink] klion26 commented on pull request #13664: [FLINK-19673] Translate "Standalone Cluster" page into Chinese
klion26 commented on pull request #13664: URL: https://github.com/apache/flink/pull/13664#issuecomment-725259657 @Shawn-Hx when I was about to merge this PR, I found that there are some conflicts. Could you please resolve them so that I can merge it? Thanks, and sorry for the inconvenience.
[GitHub] [flink] flinkbot commented on pull request #14029: [FLINK-20084][planner] Fix NPE when generating watermark for record w…
flinkbot commented on pull request #14029: URL: https://github.com/apache/flink/pull/14029#issuecomment-725258730 ## CI report: * a128055b2b0d97ce6673363c7885c97b17ba6e07 UNKNOWN
[jira] [Assigned] (FLINK-16947) ArtifactResolutionException: Could not transfer artifact. Entry [...] has not been leased from this pool
[ https://issues.apache.org/jira/browse/FLINK-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger reassigned FLINK-16947: -- Assignee: (was: Robert Metzger) > ArtifactResolutionException: Could not transfer artifact. Entry [...] has > not been leased from this pool > - > > Key: FLINK-16947 > URL: https://issues.apache.org/jira/browse/FLINK-16947 > Project: Flink > Issue Type: Bug > Components: Build System / Azure Pipelines >Reporter: Piotr Nowojski >Priority: Critical > Labels: test-stability > Fix For: 1.12.0 > > > https://dev.azure.com/rmetzger/Flink/_build/results?buildId=6982=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=1e2bbe5b-4657-50be-1f07-d84bfce5b1f5 > Build of flink-metrics-availability-test failed with: > {noformat} > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-surefire-plugin:2.22.1:test (end-to-end-tests) > on project flink-metrics-availability-test: Unable to generate classpath: > org.apache.maven.artifact.resolver.ArtifactResolutionException: Could not > transfer artifact org.apache.maven.surefire:surefire-grouper:jar:2.22.1 > from/to google-maven-central > (https://maven-central-eu.storage-download.googleapis.com/maven2/): Entry > [id:13][route:{s}->https://maven-central-eu.storage-download.googleapis.com:443][state:null] > has not been leased from this pool > [ERROR] org.apache.maven.surefire:surefire-grouper:jar:2.22.1 > [ERROR] > [ERROR] from the specified remote repositories: > [ERROR] google-maven-central > (https://maven-central-eu.storage-download.googleapis.com/maven2/, > releases=true, snapshots=false), > [ERROR] apache.snapshots (https://repository.apache.org/snapshots, > releases=false, snapshots=true) > [ERROR] Path to dependency: > [ERROR] 1) dummy:dummy:jar:1.0 > [ERROR] 2) org.apache.maven.surefire:surefire-junit47:jar:2.22.1 > [ERROR] 3) org.apache.maven.surefire:common-junit48:jar:2.22.1 > [ERROR] 4) org.apache.maven.surefire:surefire-grouper:jar:2.22.1 > [ERROR] -> [Help 1] > 
[ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException > [ERROR] > [ERROR] After correcting the problems, you can resume the build with the > command > [ERROR] mvn -rf :flink-metrics-availability-test > {noformat}
[jira] [Closed] (FLINK-20035) BlockingShuffleITCase unstable with "Could not start rest endpoint on any port in port range 8081"
[ https://issues.apache.org/jira/browse/FLINK-20035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger closed FLINK-20035. -- Resolution: Fixed > BlockingShuffleITCase unstable with "Could not start rest endpoint on any > port in port range 8081" > -- > > Key: FLINK-20035 > URL: https://issues.apache.org/jira/browse/FLINK-20035 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.12.0 >Reporter: Robert Metzger >Assignee: Yingjie Cao >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.12.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=9178=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=f508e270-48d6-5f1e-3138-42a17e0714f0 > {code} > 2020-11-06T13:52:56.6369221Z [ERROR] > testBoundedBlockingShuffle(org.apache.flink.test.runtime.BlockingShuffleITCase) > Time elapsed: 3.522 s <<< ERROR! > 2020-11-06T13:52:56.6370005Z org.apache.flink.util.FlinkException: Could not > create the DispatcherResourceManagerComponent. 
> 2020-11-06T13:52:56.6370649Z at > org.apache.flink.runtime.entrypoint.component.DefaultDispatcherResourceManagerComponentFactory.create(DefaultDispatcherResourceManagerComponentFactory.java:257) > 2020-11-06T13:52:56.6371371Z at > org.apache.flink.runtime.minicluster.MiniCluster.createDispatcherResourceManagerComponents(MiniCluster.java:412) > 2020-11-06T13:52:56.6372258Z at > org.apache.flink.runtime.minicluster.MiniCluster.setupDispatcherResourceManagerComponents(MiniCluster.java:378) > 2020-11-06T13:52:56.6373276Z at > org.apache.flink.runtime.minicluster.MiniCluster.start(MiniCluster.java:334) > 2020-11-06T13:52:56.6374182Z at > org.apache.flink.test.runtime.JobGraphRunningUtil.execute(JobGraphRunningUtil.java:50) > 2020-11-06T13:52:56.6375055Z at > org.apache.flink.test.runtime.BlockingShuffleITCase.testBoundedBlockingShuffle(BlockingShuffleITCase.java:53) > 2020-11-06T13:52:56.6375787Z at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2020-11-06T13:52:56.6376546Z at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2020-11-06T13:52:56.6377514Z at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2020-11-06T13:52:56.6378008Z at > java.lang.reflect.Method.invoke(Method.java:498) > 2020-11-06T13:52:56.6378774Z at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > 2020-11-06T13:52:56.6379350Z at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2020-11-06T13:52:56.6458094Z at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > 2020-11-06T13:52:56.6459047Z at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2020-11-06T13:52:56.6459678Z at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > 2020-11-06T13:52:56.6460182Z at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > 
2020-11-06T13:52:56.6460770Z at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > 2020-11-06T13:52:56.6461210Z at > org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > 2020-11-06T13:52:56.6461649Z at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > 2020-11-06T13:52:56.6462089Z at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > 2020-11-06T13:52:56.6462736Z at > org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > 2020-11-06T13:52:56.6463286Z at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > 2020-11-06T13:52:56.6463728Z at > org.junit.runners.ParentRunner.run(ParentRunner.java:363) > 2020-11-06T13:52:56.6464344Z at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > 2020-11-06T13:52:56.6464918Z at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > 2020-11-06T13:52:56.6465428Z at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > 2020-11-06T13:52:56.6465915Z at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > 2020-11-06T13:52:56.6466405Z at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > 2020-11-06T13:52:56.6467050Z at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > 2020-11-06T13:52:56.6468341Z at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > 2020-11-06T13:52:56.6468794Z at >
[jira] [Commented] (FLINK-20035) BlockingShuffleITCase unstable with "Could not start rest endpoint on any port in port range 8081"
[ https://issues.apache.org/jira/browse/FLINK-20035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229808#comment-17229808 ] Robert Metzger commented on FLINK-20035: Fixed in https://github.com/apache/flink/commit/0c36c842c3cb0c3aa75e7e94c56dce217ee548de. > BlockingShuffleITCase unstable with "Could not start rest endpoint on any > port in port range 8081" > -- > > Key: FLINK-20035 > URL: https://issues.apache.org/jira/browse/FLINK-20035 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.12.0 >Reporter: Robert Metzger >Assignee: Yingjie Cao >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.12.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=9178=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=f508e270-48d6-5f1e-3138-42a17e0714f0 > {code} > 2020-11-06T13:52:56.6369221Z [ERROR] > testBoundedBlockingShuffle(org.apache.flink.test.runtime.BlockingShuffleITCase) > Time elapsed: 3.522 s <<< ERROR! > 2020-11-06T13:52:56.6370005Z org.apache.flink.util.FlinkException: Could not > create the DispatcherResourceManagerComponent. 
> {code}
[GitHub] [flink] rmetzger merged pull request #13992: [FLINK-20035][tests] Use random port for rest endpoint in BlockingShuffleITCase and ShuffleCompressionITCase
rmetzger merged pull request #13992: URL: https://github.com/apache/flink/pull/13992 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rmetzger commented on pull request #13992: [FLINK-20035][tests] Use random port for rest endpoint in BlockingShuffleITCase and ShuffleCompressionITCase
rmetzger commented on pull request #13992: URL: https://github.com/apache/flink/pull/13992#issuecomment-725258225 Merging ...
[GitHub] [flink] godfreyhe commented on a change in pull request #14029: [FLINK-20084][planner] Fix NPE when generating watermark for record w…
godfreyhe commented on a change in pull request #14029: URL: https://github.com/apache/flink/pull/14029#discussion_r521164123 ## File path: flink-table/flink-table-planner-blink/src/test/scala/org/apache/flink/table/planner/runtime/stream/sql/SourceWatermarkITCase.scala ## @@ -39,7 +39,7 @@ class SourceWatermarkITCase extends StreamingTestBase{ def testWatermarkWithNestedRow(): Unit = { val data = Seq( row(1, 2L, row("i1", row("i2", LocalDateTime.parse("2020-11-21T19:00:05.23", - row(2, 3L, row("j1", row("j2", LocalDateTime.parse("2020-11-21T21:00:05.23" + row(2, 3L, row("j1", row("j2", null))) Review comment: instead of replacing the existing row, I tend to add a new row
[jira] [Commented] (FLINK-16947) ArtifactResolutionException: Could not transfer artifact. Entry [...] has not been leased from this pool
[ https://issues.apache.org/jira/browse/FLINK-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229806#comment-17229806 ] Robert Metzger commented on FLINK-16947: https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=9437=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529 > ArtifactResolutionException: Could not transfer artifact. Entry [...] has > not been leased from this pool > - > > Key: FLINK-16947 > URL: https://issues.apache.org/jira/browse/FLINK-16947 > Project: Flink > Issue Type: Bug > Components: Build System / Azure Pipelines >Reporter: Piotr Nowojski >Assignee: Robert Metzger >Priority: Critical > Labels: test-stability > Fix For: 1.12.0 > > > https://dev.azure.com/rmetzger/Flink/_build/results?buildId=6982=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=1e2bbe5b-4657-50be-1f07-d84bfce5b1f5 > Build of flink-metrics-availability-test failed with: > {noformat} > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-surefire-plugin:2.22.1:test (end-to-end-tests) > on project flink-metrics-availability-test: Unable to generate classpath: > org.apache.maven.artifact.resolver.ArtifactResolutionException: Could not > transfer artifact org.apache.maven.surefire:surefire-grouper:jar:2.22.1 > from/to google-maven-central > (https://maven-central-eu.storage-download.googleapis.com/maven2/): Entry > [id:13][route:{s}->https://maven-central-eu.storage-download.googleapis.com:443][state:null] > has not been leased from this pool > [ERROR] org.apache.maven.surefire:surefire-grouper:jar:2.22.1 > [ERROR] > [ERROR] from the specified remote repositories: > [ERROR] google-maven-central > (https://maven-central-eu.storage-download.googleapis.com/maven2/, > releases=true, snapshots=false), > [ERROR] apache.snapshots (https://repository.apache.org/snapshots, > releases=false, snapshots=true) > [ERROR] Path to dependency: > [ERROR] 1) dummy:dummy:jar:1.0 > [ERROR] 2) 
org.apache.maven.surefire:surefire-junit47:jar:2.22.1 > [ERROR] 3) org.apache.maven.surefire:common-junit48:jar:2.22.1 > [ERROR] 4) org.apache.maven.surefire:surefire-grouper:jar:2.22.1 > [ERROR] -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException > [ERROR] > [ERROR] After correcting the problems, you can resume the build with the > command > [ERROR] mvn -rf :flink-metrics-availability-test > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on pull request #14029: [FLINK-20084][planner] Fix NPE when generating watermark for record w…
flinkbot commented on pull request #14029: URL: https://github.com/apache/flink/pull/14029#issuecomment-725254026 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit a128055b2b0d97ce6673363c7885c97b17ba6e07 (Wed Nov 11 07:21:54 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! * **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-20084).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. 
For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Commented] (FLINK-19585) UnalignedCheckpointCompatibilityITCase.test:97->runAndTakeSavepoint: "Not all required tasks are currently running."
[ https://issues.apache.org/jira/browse/FLINK-19585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229805#comment-17229805 ] Arvid Heise commented on FLINK-19585: - The last two reports of master are FLINK-20065 (AskTimeout). I guess the reopened one is more about a backport to 1.11. I'm waiting for the other ticket to be resolved though as it may be related to the fix of this ticket. > UnalignedCheckpointCompatibilityITCase.test:97->runAndTakeSavepoint: "Not all > required tasks are currently running." > > > Key: FLINK-19585 > URL: https://issues.apache.org/jira/browse/FLINK-19585 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.12.0 >Reporter: Robert Metzger >Assignee: Arvid Heise >Priority: Critical > Labels: pull-request-available, test-stability > Fix For: 1.12.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=7419=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=f508e270-48d6-5f1e-3138-42a17e0714f0 > {code} > 2020-10-12T10:27:51.7667213Z [ERROR] Tests run: 4, Failures: 0, Errors: 1, > Skipped: 0, Time elapsed: 13.146 s <<< FAILURE! - in > org.apache.flink.test.checkpointing.UnalignedCheckpointCompatibilityITCase > 2020-10-12T10:27:51.7675454Z [ERROR] test[type: SAVEPOINT, startAligned: > false](org.apache.flink.test.checkpointing.UnalignedCheckpointCompatibilityITCase) > Time elapsed: 2.168 s <<< ERROR! > 2020-10-12T10:27:51.7676759Z java.util.concurrent.ExecutionException: > java.util.concurrent.CompletionException: > org.apache.flink.runtime.checkpoint.CheckpointException: Not all required > tasks are currently running. 
> 2020-10-12T10:27:51.7686572Z at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> 2020-10-12T10:27:51.7688239Z at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> 2020-10-12T10:27:51.7689543Z at org.apache.flink.test.checkpointing.UnalignedCheckpointCompatibilityITCase.runAndTakeSavepoint(UnalignedCheckpointCompatibilityITCase.java:113)
> 2020-10-12T10:27:51.7690681Z at org.apache.flink.test.checkpointing.UnalignedCheckpointCompatibilityITCase.test(UnalignedCheckpointCompatibilityITCase.java:97)
> 2020-10-12T10:27:51.7691513Z at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2020-10-12T10:27:51.7692182Z at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2020-10-12T10:27:51.7692964Z at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2020-10-12T10:27:51.7693655Z at java.lang.reflect.Method.invoke(Method.java:498)
> 2020-10-12T10:27:51.7694489Z at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2020-10-12T10:27:51.7707103Z at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2020-10-12T10:27:51.7729199Z at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2020-10-12T10:27:51.7730097Z at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2020-10-12T10:27:51.7730833Z at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> 2020-10-12T10:27:51.7731500Z at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 2020-10-12T10:27:51.7732086Z at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 2020-10-12T10:27:51.7732781Z at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 2020-10-12T10:27:51.7733563Z at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 2020-10-12T10:27:51.7734735Z at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 2020-10-12T10:27:51.7735400Z at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 2020-10-12T10:27:51.7736075Z at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 2020-10-12T10:27:51.7736757Z at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 2020-10-12T10:27:51.7737432Z at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 2020-10-12T10:27:51.7738081Z at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> 2020-10-12T10:27:51.7739008Z at org.junit.runners.Suite.runChild(Suite.java:128)
> 2020-10-12T10:27:51.7739583Z at org.junit.runners.Suite.runChild(Suite.java:27)
> 2020-10-12T10:27:51.7740173Z at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 2020-10-12T10:27:51.7740800Z at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 2020-10-12T10:27:51.7741470Z at >
[jira] [Updated] (FLINK-20084) Fix NPE when generating watermark for record whose rowtime field is null after watermark push down
[ https://issues.apache.org/jira/browse/FLINK-20084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-20084: --- Labels: pull-request-available (was: ) > Fix NPE when generating watermark for record whose rowtime field is null > after watermark push down > -- > > Key: FLINK-20084 > URL: https://issues.apache.org/jira/browse/FLINK-20084 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.12.0 >Reporter: Shengkai Fang >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0 > > > The problem is from the class > {{PushWatermarkIntoTableSourceScanRuleBase$DefaultWatermarkGeneratorSupplier$DefaultWatermarkGenerator#onEvent}}. > It doesn't test whether the calculated watermark is null before set. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
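The guarded update that the FLINK-20084 review converges on (compute the watermark once, and only advance the current watermark when the result is non-null) can be sketched as follows. This is a minimal illustration with hypothetical simplified types standing in for Flink's `WatermarkGenerator`/`WatermarkOutput`; it is not the actual planner code.

```java
// Minimal sketch of the null guard discussed in FLINK-20084, with hypothetical
// simplified types in place of Flink's watermark interfaces: the inner
// generator may return null when the record's rowtime field is null, so the
// result is computed once and only applied when it is non-null.
public class WatermarkGuardSketch {

    private long currentWatermark = Long.MIN_VALUE;

    // Stand-in for innerWatermarkGenerator.currentWatermark(event); returns
    // null when the record's rowtime field is null.
    private Long computeWatermark(Long rowtime) {
        return rowtime == null ? null : rowtime - 1; // e.g. a "rowtime - 1ms" strategy
    }

    // Stand-in for DefaultWatermarkGenerator#onEvent: compute once (not twice),
    // and skip the update on null instead of unboxing it into a primitive
    // long field, which is what raised the NullPointerException.
    public long onEvent(Long rowtime) {
        Long watermark = computeWatermark(rowtime);
        if (watermark != null) {
            currentWatermark = watermark;
        }
        return currentWatermark;
    }

    public static void main(String[] args) {
        WatermarkGuardSketch g = new WatermarkGuardSketch();
        System.out.println(g.onEvent(6L));   // watermark advances to 5
        System.out.println(g.onEvent(null)); // null rowtime: stays at 5, no NPE
    }
}
```

The original buggy line assigned the boxed result directly to a primitive `long` field, so a record with a null rowtime triggered an unboxing NPE; the guard also avoids calling the inner generator a second time, which was the point of the review comment.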
[GitHub] [flink] fsk119 opened a new pull request #14029: [FLINK-20084][planner] Fix NPE when generating watermark for record w…
fsk119 opened a new pull request #14029: URL: https://github.com/apache/flink/pull/14029 …hose rowtime field is null after watermark push down ## What is the purpose of the change *Fix a bug.* ## Verifying this change This change added tests and can be verified as follows: - add IT Case. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**)
[jira] [Commented] (FLINK-20068) KafkaSubscriberTest.testTopicPatternSubscriber failed with unexpected results
[ https://issues.apache.org/jira/browse/FLINK-20068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229803#comment-17229803 ] Robert Metzger commented on FLINK-20068: Thanks a lot for the explanation and fix! > KafkaSubscriberTest.testTopicPatternSubscriber failed with unexpected results > - > > Key: FLINK-20068 > URL: https://issues.apache.org/jira/browse/FLINK-20068 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.12.0 >Reporter: Dian Fu >Assignee: Jiangjie Qin >Priority: Blocker > Labels: pull-request-available, test-stability > Fix For: 1.12.0, 1.11.3 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=9365=logs=c5f0071e-1851-543e-9a45-9ac140befc32=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5 > {code} > 2020-11-10T00:14:22.7658242Z [ERROR] > testTopicPatternSubscriber(org.apache.flink.connector.kafka.source.enumerator.subscriber.KafkaSubscriberTest) > Time elapsed: 0.012 s <<< FAILURE! > 2020-11-10T00:14:22.7659838Z java.lang.AssertionError: > expected:<[pattern-topic-5, pattern-topic-4, pattern-topic-7, > pattern-topic-6, pattern-topic-9, pattern-topic-8, pattern-topic-1, > pattern-topic-0, pattern-topic-3]> but was:<[]> > 2020-11-10T00:14:22.7660740Z at org.junit.Assert.fail(Assert.java:88) > 2020-11-10T00:14:22.7661245Z at > org.junit.Assert.failNotEquals(Assert.java:834) > 2020-11-10T00:14:22.7661788Z at > org.junit.Assert.assertEquals(Assert.java:118) > 2020-11-10T00:14:22.7662312Z at > org.junit.Assert.assertEquals(Assert.java:144) > 2020-11-10T00:14:22.7663051Z at > org.apache.flink.connector.kafka.source.enumerator.subscriber.KafkaSubscriberTest.testTopicPatternSubscriber(KafkaSubscriberTest.java:94) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-20068) KafkaSubscriberTest.testTopicPatternSubscriber failed with unexpected results
[ https://issues.apache.org/jira/browse/FLINK-20068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiangjie Qin reassigned FLINK-20068: Assignee: Jiangjie Qin > KafkaSubscriberTest.testTopicPatternSubscriber failed with unexpected results > - > > Key: FLINK-20068 > URL: https://issues.apache.org/jira/browse/FLINK-20068 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.12.0 >Reporter: Dian Fu >Assignee: Jiangjie Qin >Priority: Blocker > Labels: pull-request-available, test-stability > Fix For: 1.12.0, 1.11.3 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=9365=logs=c5f0071e-1851-543e-9a45-9ac140befc32=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5 > {code} > 2020-11-10T00:14:22.7658242Z [ERROR] > testTopicPatternSubscriber(org.apache.flink.connector.kafka.source.enumerator.subscriber.KafkaSubscriberTest) > Time elapsed: 0.012 s <<< FAILURE! > 2020-11-10T00:14:22.7659838Z java.lang.AssertionError: > expected:<[pattern-topic-5, pattern-topic-4, pattern-topic-7, > pattern-topic-6, pattern-topic-9, pattern-topic-8, pattern-topic-1, > pattern-topic-0, pattern-topic-3]> but was:<[]> > 2020-11-10T00:14:22.7660740Z at org.junit.Assert.fail(Assert.java:88) > 2020-11-10T00:14:22.7661245Z at > org.junit.Assert.failNotEquals(Assert.java:834) > 2020-11-10T00:14:22.7661788Z at > org.junit.Assert.assertEquals(Assert.java:118) > 2020-11-10T00:14:22.7662312Z at > org.junit.Assert.assertEquals(Assert.java:144) > 2020-11-10T00:14:22.7663051Z at > org.apache.flink.connector.kafka.source.enumerator.subscriber.KafkaSubscriberTest.testTopicPatternSubscriber(KafkaSubscriberTest.java:94) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (FLINK-20068) KafkaSubscriberTest.testTopicPatternSubscriber failed with unexpected results
[ https://issues.apache.org/jira/browse/FLINK-20068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiangjie Qin resolved FLINK-20068. -- Fix Version/s: 1.11.3 Resolution: Fixed Merged to master. cacb4c1fb1d6123d0aeb93d550ffcafa6ad4a8fb Cherry-picked to 1.11: 18e4c1ba5b66c704b3ea7e5ba8fb7f3499207903 > KafkaSubscriberTest.testTopicPatternSubscriber failed with unexpected results > - > > Key: FLINK-20068 > URL: https://issues.apache.org/jira/browse/FLINK-20068 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.12.0 >Reporter: Dian Fu >Priority: Blocker > Labels: pull-request-available, test-stability > Fix For: 1.12.0, 1.11.3 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=9365=logs=c5f0071e-1851-543e-9a45-9ac140befc32=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5 > {code} > 2020-11-10T00:14:22.7658242Z [ERROR] > testTopicPatternSubscriber(org.apache.flink.connector.kafka.source.enumerator.subscriber.KafkaSubscriberTest) > Time elapsed: 0.012 s <<< FAILURE! > 2020-11-10T00:14:22.7659838Z java.lang.AssertionError: > expected:<[pattern-topic-5, pattern-topic-4, pattern-topic-7, > pattern-topic-6, pattern-topic-9, pattern-topic-8, pattern-topic-1, > pattern-topic-0, pattern-topic-3]> but was:<[]> > 2020-11-10T00:14:22.7660740Z at org.junit.Assert.fail(Assert.java:88) > 2020-11-10T00:14:22.7661245Z at > org.junit.Assert.failNotEquals(Assert.java:834) > 2020-11-10T00:14:22.7661788Z at > org.junit.Assert.assertEquals(Assert.java:118) > 2020-11-10T00:14:22.7662312Z at > org.junit.Assert.assertEquals(Assert.java:144) > 2020-11-10T00:14:22.7663051Z at > org.apache.flink.connector.kafka.source.enumerator.subscriber.KafkaSubscriberTest.testTopicPatternSubscriber(KafkaSubscriberTest.java:94) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-19964) Gelly ITCase stuck on Azure in HITSITCase.testPrintWithRMatGraph
[ https://issues.apache.org/jira/browse/FLINK-19964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229801#comment-17229801 ] Robert Metzger commented on FLINK-19964: Thanks a lot for the fix! > Gelly ITCase stuck on Azure in HITSITCase.testPrintWithRMatGraph > > > Key: FLINK-19964 > URL: https://issues.apache.org/jira/browse/FLINK-19964 > Project: Flink > Issue Type: Bug > Components: Library / Graph Processing (Gelly), Runtime / Network, > Tests >Affects Versions: 1.12.0 >Reporter: Chesnay Schepler >Assignee: Arvid Heise >Priority: Blocker > Labels: pull-request-available, test-stability > Fix For: 1.12.0 > > > The HITSITCase has gotten stuck on Azure. Chances are that something in the > scheduling or network has broken it. > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=8919=logs=c5f0071e-1851-543e-9a45-9ac140befc32=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] becketqin closed pull request #14022: [FLINK-20068] Enhance the topic creation guarantee to ensure all the …
becketqin closed pull request #14022: URL: https://github.com/apache/flink/pull/14022
[GitHub] [flink] becketqin commented on pull request #14022: [FLINK-20068] Enhance the topic creation guarantee to ensure all the …
becketqin commented on pull request #14022: URL: https://github.com/apache/flink/pull/14022#issuecomment-725251041 Merged to master. cacb4c1fb1d6123d0aeb93d550ffcafa6ad4a8fb Cherry-picked to 1.11: 18e4c1ba5b66c704b3ea7e5ba8fb7f3499207903
[jira] [Assigned] (FLINK-19300) Timer loss after restoring from savepoint
[ https://issues.apache.org/jira/browse/FLINK-19300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai reassigned FLINK-19300: --- Assignee: Xiang Gao > Timer loss after restoring from savepoint > - > > Key: FLINK-19300 > URL: https://issues.apache.org/jira/browse/FLINK-19300 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Affects Versions: 1.8.0 >Reporter: Xiang Gao >Assignee: Xiang Gao >Priority: Blocker > Fix For: 1.12.0, 1.11.3 > > > While using heap-based timers, we are seeing occasional timer loss after > restoring program from savepoint, especially when using a remote savepoint > storage (s3). > After some investigation, the issue seems to be related to [this line in > deserialization|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/io/PostVersionedIOReadableWritable.java#L65]. > When trying to check the VERSIONED_IDENTIFIER, the input stream may not > guarantee filling the byte array, causing timers to be dropped for the > affected key group. > Should keep reading until expected number of bytes are actually read or if > end of the stream has been reached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19300) Timer loss after restoring from savepoint
[ https://issues.apache.org/jira/browse/FLINK-19300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-19300: --- Affects Version/s: 1.8.0 > Timer loss after restoring from savepoint > - > > Key: FLINK-19300 > URL: https://issues.apache.org/jira/browse/FLINK-19300 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Affects Versions: 1.8.0 >Reporter: Xiang Gao >Priority: Blocker > Fix For: 1.12.0, 1.11.3 > > > While using heap-based timers, we are seeing occasional timer loss after > restoring program from savepoint, especially when using a remote savepoint > storage (s3). > After some investigation, the issue seems to be related to [this line in > deserialization|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/io/PostVersionedIOReadableWritable.java#L65]. > When trying to check the VERSIONED_IDENTIFIER, the input stream may not > guarantee filling the byte array, causing timers to be dropped for the > affected key group. > Should keep reading until expected number of bytes are actually read or if > end of the stream has been reached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19300) Timer loss after restoring from savepoint
[ https://issues.apache.org/jira/browse/FLINK-19300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-19300: --- Fix Version/s: 1.11.3 1.12.0 > Timer loss after restoring from savepoint > - > > Key: FLINK-19300 > URL: https://issues.apache.org/jira/browse/FLINK-19300 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: Xiang Gao >Priority: Critical > Fix For: 1.12.0, 1.11.3 > > > While using heap-based timers, we are seeing occasional timer loss after > restoring program from savepoint, especially when using a remote savepoint > storage (s3). > After some investigation, the issue seems to be related to [this line in > deserialization|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/io/PostVersionedIOReadableWritable.java#L65]. > When trying to check the VERSIONED_IDENTIFIER, the input stream may not > guarantee filling the byte array, causing timers to be dropped for the > affected key group. > Should keep reading until expected number of bytes are actually read or if > end of the stream has been reached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19300) Timer loss after restoring from savepoint
[ https://issues.apache.org/jira/browse/FLINK-19300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-19300: --- Priority: Blocker (was: Critical) > Timer loss after restoring from savepoint > - > > Key: FLINK-19300 > URL: https://issues.apache.org/jira/browse/FLINK-19300 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: Xiang Gao >Priority: Blocker > Fix For: 1.12.0, 1.11.3 > > > While using heap-based timers, we are seeing occasional timer loss after > restoring program from savepoint, especially when using a remote savepoint > storage (s3). > After some investigation, the issue seems to be related to [this line in > deserialization|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/io/PostVersionedIOReadableWritable.java#L65]. > When trying to check the VERSIONED_IDENTIFIER, the input stream may not > guarantee filling the byte array, causing timers to be dropped for the > affected key group. > Should keep reading until expected number of bytes are actually read or if > end of the stream has been reached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-19300) Timer loss after restoring from savepoint
[ https://issues.apache.org/jira/browse/FLINK-19300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229796#comment-17229796 ] Tzu-Li (Gordon) Tai edited comment on FLINK-19300 at 11/11/20, 7:09 AM: Just a comment on the severity of the issue: It looks like timer loss is only possible if somehow, the key groups contain a {{0}} at the very beginning of the stream. This seems to be the only possible case that would lead to the {{InternalTimerServiceSerializationProxy}} silently skipping the rest of the reads, instead of failing the restore with some {{IOException}}. was (Author: tzulitai): Just a comment on the severity of the issue: It looks like timer loss is only possible if somehow, the key groups contain a {{0}} at the very beginning of the stream. This seems to be the only possible case that would lead to the {{InternalTimerServiceSerializationProxy}} silently skipping the rest of the reads. > Timer loss after restoring from savepoint > - > > Key: FLINK-19300 > URL: https://issues.apache.org/jira/browse/FLINK-19300 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: Xiang Gao >Priority: Critical > > While using heap-based timers, we are seeing occasional timer loss after > restoring program from savepoint, especially when using a remote savepoint > storage (s3). > After some investigation, the issue seems to be related to [this line in > deserialization|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/io/PostVersionedIOReadableWritable.java#L65]. > When trying to check the VERSIONED_IDENTIFIER, the input stream may not > guarantee filling the byte array, causing timers to be dropped for the > affected key group. > Should keep reading until expected number of bytes are actually read or if > end of the stream has been reached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-19300) Timer loss after restoring from savepoint
[ https://issues.apache.org/jira/browse/FLINK-19300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229796#comment-17229796 ] Tzu-Li (Gordon) Tai commented on FLINK-19300: - Just a comment on the severity of the issue: It looks like timer loss is only possible if somehow, the key groups contain a {{0}} at the very beginning of the stream. This seems to be the only possible case that would lead to the {{InternalTimerServiceSerializationProxy}} silently skipping the rest of the reads. > Timer loss after restoring from savepoint > - > > Key: FLINK-19300 > URL: https://issues.apache.org/jira/browse/FLINK-19300 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: Xiang Gao >Priority: Critical > > While using heap-based timers, we are seeing occasional timer loss after > restoring program from savepoint, especially when using a remote savepoint > storage (s3). > After some investigation, the issue seems to be related to [this line in > deserialization|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/io/PostVersionedIOReadableWritable.java#L65]. > When trying to check the VERSIONED_IDENTIFIER, the input stream may not > guarantee filling the byte array, causing timers to be dropped for the > affected key group. > Should keep reading until expected number of bytes are actually read or if > end of the stream has been reached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20084) Fix NPE when generating watermark for record whose rowtime field is null after watermark push down
Shengkai Fang created FLINK-20084: - Summary: Fix NPE when generating watermark for record whose rowtime field is null after watermark push down Key: FLINK-20084 URL: https://issues.apache.org/jira/browse/FLINK-20084 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.12.0 Reporter: Shengkai Fang Fix For: 1.12.0 The problem is from the class {{PushWatermarkIntoTableSourceScanRuleBase$DefaultWatermarkGeneratorSupplier$DefaultWatermarkGenerator#onEvent}}. It doesn't test whether the calculated watermark is null before set. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-19300) Timer loss after restoring from savepoint
[ https://issues.apache.org/jira/browse/FLINK-19300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229790#comment-17229790 ] Tzu-Li (Gordon) Tai edited comment on FLINK-19300 at 11/11/20, 7:06 AM: This looks like a real issue, the typical {{read}} v.s. {{readFully}} mistake. This read path should only occur for the case where users are using RocksDB backends + heap-based timers (using the heap backend should be fine, would not bump into this). [~xianggao] to help me understand the full problem: in your scenarios, I'm assuming that the timer losses are caused by somehow the {{InternalTimerServiceSerializationProxy}} silently skipping reading the timers, instead of some {{IOException}} due to incorrect read attempts (and eventually fails the restore, instead of a timer loss). Could you clarify if my assumption is correct? As for the fix, I would suggest to try to reuse the {{java.io.DataInput#readFully}} method instead or re-implementing it: {code} byte[] tmp = new byte[VERSIONED_IDENTIFIER.length]; DataInputView inputView = new DataInputViewStreamWrapper(inputStream); inputView.readFully(tmp); {code} You can catch {{EOFException}} to determine if end of stream is reached / not enough bytes available. was (Author: tzulitai): This looks like a real issue, the typical {{read}} v.s. {{readFully}} mistake. This read path should only occur for the case where users are using RocksDB backends + heap-based timers (using the heap backend should be fine, would not bump into this). [~xianggao] to help me understand the full problem: in your scenarios, I'm assuming that the timer losses are caused by somehow the {{InternalTimerServiceSerializationProxy}} silently skipping reading the timers, instead of some {{IOException}} due to incorrect read attempts (and eventually fails the restore, instead of a timer loss). Could you clarify if my assumption is correct? 
As for the fix, I would suggest to try to reuse the {{java.io.DataInput#readFully}} method instead or re-implementing it: {code} byte[] tmp = new byte[VERSIONED_IDENTIFIER.length]; DataInputView inputView = new DataInputViewStreamWrapper(inputStream); inputView.readFully(tmp); {code} You can catch {{EOFException}} to determine if end of stream is reached. > Timer loss after restoring from savepoint > - > > Key: FLINK-19300 > URL: https://issues.apache.org/jira/browse/FLINK-19300 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: Xiang Gao >Priority: Critical > > While using heap-based timers, we are seeing occasional timer loss after > restoring program from savepoint, especially when using a remote savepoint > storage (s3). > After some investigation, the issue seems to be related to [this line in > deserialization|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/io/PostVersionedIOReadableWritable.java#L65]. > When trying to check the VERSIONED_IDENTIFIER, the input stream may not > guarantee filling the byte array, causing timers to be dropped for the > affected key group. > Should keep reading until expected number of bytes are actually read or if > end of the stream has been reached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
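To make the read-vs-readFully pitfall above concrete, here is a minimal, self-contained sketch. This is not Flink code — {{TrickleStream}} and the helper names are ours — but it shows the behavior the comment describes: a single {{InputStream#read}} call may fill only part of the buffer, while {{DataInput#readFully}} loops until the buffer is full or throws {{EOFException}}.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class ReadFullyDemo {
    // A stream that hands out at most one byte per read() call, mimicking a
    // remote stream (e.g. S3) that returns fewer bytes than requested.
    static class TrickleStream extends ByteArrayInputStream {
        TrickleStream(byte[] buf) { super(buf); }
        @Override
        public int read(byte[] b, int off, int len) {
            return super.read(b, off, Math.min(len, 1));
        }
    }

    // A single read(): may fill only part of dest; returns the bytes read.
    static int singleRead(InputStream in, byte[] dest) throws IOException {
        return in.read(dest);
    }

    // readFully(): loops internally until dest is full; EOFException signals
    // that the stream ended before enough bytes were available.
    static boolean readFully(InputStream in, byte[] dest) throws IOException {
        try {
            new DataInputStream(in).readFully(dest);
            return true;
        } catch (EOFException e) {
            return false;
        }
    }
}
```

With a 4-byte identifier, the single read() returns after one byte — exactly the partial-fill that silently corrupts the {{VERSIONED_IDENTIFIER}} check — while readFully() fills all four bytes.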
[jira] [Commented] (FLINK-19300) Timer loss after restoring from savepoint
[ https://issues.apache.org/jira/browse/FLINK-19300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229793#comment-17229793 ] Tzu-Li (Gordon) Tai commented on FLINK-19300: - As for priority of this issue: The bug looks like it's been there for quite a while already across many versions, but I'd suggest to list it as a blocker for 1.12.0 and 1.11.3. [~xianggao] Please do submit a PR for this, I'll try to review it as soon as possible. > Timer loss after restoring from savepoint > - > > Key: FLINK-19300 > URL: https://issues.apache.org/jira/browse/FLINK-19300 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: Xiang Gao >Priority: Critical > > While using heap-based timers, we are seeing occasional timer loss after > restoring program from savepoint, especially when using a remote savepoint > storage (s3). > After some investigation, the issue seems to be related to [this line in > deserialization|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/io/PostVersionedIOReadableWritable.java#L65]. > When trying to check the VERSIONED_IDENTIFIER, the input stream may not > guarantee filling the byte array, causing timers to be dropped for the > affected key group. > Should keep reading until expected number of bytes are actually read or if > end of the stream has been reached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-19300) Timer loss after restoring from savepoint
[ https://issues.apache.org/jira/browse/FLINK-19300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229790#comment-17229790 ] Tzu-Li (Gordon) Tai commented on FLINK-19300: - This looks like a real issue, the typical {{read}} vs. {{readFully}} problem. This read path should only occur for the case where users are using RocksDB backends + heap-based timers (using the heap backend should be fine, and would not bump into this). [~xianggao] to help me understand the full problem: in your scenarios, I'm assuming that the timer losses are caused by the {{InternalTimerServiceSerializationProxy}} silently skipping reading the timers, instead of some {{IOException}} due to incorrect read attempts (which would eventually fail the restore, instead of causing a timer loss). Could you clarify if my assumption is correct? As for the fix, I would suggest trying to reuse the {{java.io.DataInput#readFully}} method instead of re-implementing it: {code} byte[] tmp = new byte[VERSIONED_IDENTIFIER.length]; DataInputView inputView = new DataInputViewStreamWrapper(inputStream); inputView.readFully(tmp); {code} You can catch {{EOFException}} to determine if end of stream is reached. > Timer loss after restoring from savepoint > - > > Key: FLINK-19300 > URL: https://issues.apache.org/jira/browse/FLINK-19300 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: Xiang Gao >Priority: Critical > > While using heap-based timers, we are seeing occasional timer loss after > restoring program from savepoint, especially when using a remote savepoint > storage (s3). > After some investigation, the issue seems to be related to [this line in > deserialization|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/io/PostVersionedIOReadableWritable.java#L65]. > When trying to check the VERSIONED_IDENTIFIER, the input stream may not > guarantee filling the byte array, causing timers to be dropped for the > affected key group. 
> Should keep reading until expected number of bytes are actually read or if > end of the stream has been reached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (FLINK-20004) UpperLimitExceptionParameter description is misleading
[ https://issues.apache.org/jira/browse/FLINK-20004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Jiang updated FLINK-20004: --- Comment: was deleted (was: [~f.pompermaier], you could refer to MessageQueryParameter. In MessageQueryParameter, the maxExceptions is comma separated value. cc [~chesnay]) > UpperLimitExceptionParameter description is misleading > -- > > Key: FLINK-20004 > URL: https://issues.apache.org/jira/browse/FLINK-20004 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.11.2 >Reporter: Flavio Pompermaier >Priority: Trivial > > The maxExceptions query parameter of /jobs/:jobid/exceptions REST API is an > integer parameter, not a list of comma separated values..this is probably a > cut-and-paste error -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20083) OrcFsStreamingSinkITCase times out
[ https://issues.apache.org/jira/browse/FLINK-20083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-20083: --- Fix Version/s: 1.12.0 > OrcFsStreamingSinkITCase times out > -- > > Key: FLINK-20083 > URL: https://issues.apache.org/jira/browse/FLINK-20083 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem, Formats (JSON, Avro, Parquet, > ORC, SequenceFile) >Affects Versions: 1.12.0 >Reporter: Robert Metzger >Priority: Critical > Labels: test-stability > Fix For: 1.12.0 > > > https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8661=logs=66592496-52df-56bb-d03e-37509e1d9d0f=ae0269db-6796-5583-2e5f-d84757d711aa > {code} > [ERROR] > OrcFsStreamingSinkITCase>FsStreamingSinkITCaseBase.testPart:84->FsStreamingSinkITCaseBase.test:120->FsStreamingSinkITCaseBase.check:133 > » TestTimedOut > [ERROR] Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 26.871 s <<< FAILURE! - in org.apache.flink.orc.OrcFsStreamingSinkITCase > [ERROR] testPart(org.apache.flink.orc.OrcFsStreamingSinkITCase) Time > elapsed: 20.052 s <<< ERROR! 
> org.junit.runners.model.TestTimedOutException: test timed out after 20 seconds > at java.lang.Thread.sleep(Native Method) > at > org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.sleepBeforeRetry(CollectResultFetcher.java:231) > at > org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.next(CollectResultFetcher.java:119) > at > org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:103) > at > org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:77) > at > org.apache.flink.table.planner.sinks.SelectTableSinkBase$RowIteratorWrapper.hasNext(SelectTableSinkBase.java:115) > at > org.apache.flink.table.api.internal.TableResultImpl$CloseableRowIteratorWrapper.hasNext(TableResultImpl.java:355) > at java.util.Iterator.forEachRemaining(Iterator.java:115) > at > org.apache.flink.util.CollectionUtil.iteratorToList(CollectionUtil.java:114) > at > org.apache.flink.table.planner.runtime.stream.FsStreamingSinkITCaseBase.check(FsStreamingSinkITCaseBase.scala:133) > at > org.apache.flink.table.planner.runtime.stream.FsStreamingSinkITCaseBase.test(FsStreamingSinkITCaseBase.scala:120) > at > org.apache.flink.table.planner.runtime.stream.FsStreamingSinkITCaseBase.testPart(FsStreamingSinkITCaseBase.scala:84) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-20065) UnalignedCheckpointCompatibilityITCase.test failed with AskTimeoutException
[ https://issues.apache.org/jira/browse/FLINK-20065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise reassigned FLINK-20065: --- Assignee: Arvid Heise > UnalignedCheckpointCompatibilityITCase.test failed with AskTimeoutException > --- > > Key: FLINK-20065 > URL: https://issues.apache.org/jira/browse/FLINK-20065 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.12.0, 1.11.3 >Reporter: Dian Fu >Assignee: Arvid Heise >Priority: Blocker > Labels: test-stability > Fix For: 1.12.0, 1.11.3 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=9362=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=45cc9205-bdb7-5b54-63cd-89fdc0983323 > {code} > 2020-11-09T22:19:47.2714024Z [ERROR] test[type: SAVEPOINT, startAligned: > true](org.apache.flink.test.checkpointing.UnalignedCheckpointCompatibilityITCase) > Time elapsed: 1.293 s <<< ERROR! > 2020-11-09T22:19:47.2715260Z java.util.concurrent.ExecutionException: > java.util.concurrent.TimeoutException: Invocation of public default > java.util.concurrent.CompletableFuture > org.apache.flink.runtime.webmonitor.RestfulGateway.stopWithSavepoint(org.apache.flink.api.common.JobID,java.lang.String,boolean,org.apache.flink.api.common.time.Time) > timed out. 
> 2020-11-09T22:19:47.2716743Z at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > 2020-11-09T22:19:47.2718213Z at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) > 2020-11-09T22:19:47.2719166Z at > org.apache.flink.test.checkpointing.UnalignedCheckpointCompatibilityITCase.runAndTakeSavepoint(UnalignedCheckpointCompatibilityITCase.java:113) > 2020-11-09T22:19:47.2720278Z at > org.apache.flink.test.checkpointing.UnalignedCheckpointCompatibilityITCase.test(UnalignedCheckpointCompatibilityITCase.java:97) > 2020-11-09T22:19:47.2721126Z at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2020-11-09T22:19:47.2721771Z at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2020-11-09T22:19:47.2722773Z at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2020-11-09T22:19:47.2723479Z at > java.lang.reflect.Method.invoke(Method.java:498) > 2020-11-09T22:19:47.2724187Z at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > 2020-11-09T22:19:47.2725026Z at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2020-11-09T22:19:47.2725817Z at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > 2020-11-09T22:19:47.2726595Z at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2020-11-09T22:19:47.2727515Z at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > 2020-11-09T22:19:47.2728192Z at > org.junit.rules.RunRules.evaluate(RunRules.java:20) > 2020-11-09T22:19:47.2744089Z at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > 2020-11-09T22:19:47.2744907Z at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > 2020-11-09T22:19:47.2745573Z at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > 
2020-11-09T22:19:47.2746037Z at > org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > 2020-11-09T22:19:47.2746445Z at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > 2020-11-09T22:19:47.2746868Z at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > 2020-11-09T22:19:47.2747443Z at > org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > 2020-11-09T22:19:47.2747876Z at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > 2020-11-09T22:19:47.2748297Z at > org.junit.runners.ParentRunner.run(ParentRunner.java:363) > 2020-11-09T22:19:47.2748694Z at > org.junit.runners.Suite.runChild(Suite.java:128) > 2020-11-09T22:19:47.2749054Z at > org.junit.runners.Suite.runChild(Suite.java:27) > 2020-11-09T22:19:47.2749414Z at > org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > 2020-11-09T22:19:47.2749819Z at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > 2020-11-09T22:19:47.2750373Z at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > 2020-11-09T22:19:47.2750923Z at > org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > 2020-11-09T22:19:47.2751555Z at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > 2020-11-09T22:19:47.2752148Z at > org.junit.runners.ParentRunner.run(ParentRunner.java:363) > 2020-11-09T22:19:47.2752938Z at >
[jira] [Created] (FLINK-20083) OrcFsStreamingSinkITCase times out
Robert Metzger created FLINK-20083: -- Summary: OrcFsStreamingSinkITCase times out Key: FLINK-20083 URL: https://issues.apache.org/jira/browse/FLINK-20083 Project: Flink Issue Type: Improvement Components: Connectors / FileSystem, Formats (JSON, Avro, Parquet, ORC, SequenceFile) Affects Versions: 1.12.0 Reporter: Robert Metzger https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8661=logs=66592496-52df-56bb-d03e-37509e1d9d0f=ae0269db-6796-5583-2e5f-d84757d711aa {code} [ERROR] OrcFsStreamingSinkITCase>FsStreamingSinkITCaseBase.testPart:84->FsStreamingSinkITCaseBase.test:120->FsStreamingSinkITCaseBase.check:133 » TestTimedOut [ERROR] Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 26.871 s <<< FAILURE! - in org.apache.flink.orc.OrcFsStreamingSinkITCase [ERROR] testPart(org.apache.flink.orc.OrcFsStreamingSinkITCase) Time elapsed: 20.052 s <<< ERROR! org.junit.runners.model.TestTimedOutException: test timed out after 20 seconds at java.lang.Thread.sleep(Native Method) at org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.sleepBeforeRetry(CollectResultFetcher.java:231) at org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.next(CollectResultFetcher.java:119) at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:103) at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:77) at org.apache.flink.table.planner.sinks.SelectTableSinkBase$RowIteratorWrapper.hasNext(SelectTableSinkBase.java:115) at org.apache.flink.table.api.internal.TableResultImpl$CloseableRowIteratorWrapper.hasNext(TableResultImpl.java:355) at java.util.Iterator.forEachRemaining(Iterator.java:115) at org.apache.flink.util.CollectionUtil.iteratorToList(CollectionUtil.java:114) at org.apache.flink.table.planner.runtime.stream.FsStreamingSinkITCaseBase.check(FsStreamingSinkITCaseBase.scala:133) at 
org.apache.flink.table.planner.runtime.stream.FsStreamingSinkITCaseBase.test(FsStreamingSinkITCaseBase.scala:120) at org.apache.flink.table.planner.runtime.stream.FsStreamingSinkITCaseBase.testPart(FsStreamingSinkITCaseBase.scala:84) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20083) OrcFsStreamingSinkITCase times out
[ https://issues.apache.org/jira/browse/FLINK-20083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-20083: --- Priority: Critical (was: Major) > OrcFsStreamingSinkITCase times out > -- > > Key: FLINK-20083 > URL: https://issues.apache.org/jira/browse/FLINK-20083 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem, Formats (JSON, Avro, Parquet, > ORC, SequenceFile) >Affects Versions: 1.12.0 >Reporter: Robert Metzger >Priority: Critical > Labels: test-stability > > https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8661=logs=66592496-52df-56bb-d03e-37509e1d9d0f=ae0269db-6796-5583-2e5f-d84757d711aa > {code} > [ERROR] > OrcFsStreamingSinkITCase>FsStreamingSinkITCaseBase.testPart:84->FsStreamingSinkITCaseBase.test:120->FsStreamingSinkITCaseBase.check:133 > » TestTimedOut > [ERROR] Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 26.871 s <<< FAILURE! - in org.apache.flink.orc.OrcFsStreamingSinkITCase > [ERROR] testPart(org.apache.flink.orc.OrcFsStreamingSinkITCase) Time > elapsed: 20.052 s <<< ERROR! 
> org.junit.runners.model.TestTimedOutException: test timed out after 20 seconds > at java.lang.Thread.sleep(Native Method) > at > org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.sleepBeforeRetry(CollectResultFetcher.java:231) > at > org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.next(CollectResultFetcher.java:119) > at > org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:103) > at > org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:77) > at > org.apache.flink.table.planner.sinks.SelectTableSinkBase$RowIteratorWrapper.hasNext(SelectTableSinkBase.java:115) > at > org.apache.flink.table.api.internal.TableResultImpl$CloseableRowIteratorWrapper.hasNext(TableResultImpl.java:355) > at java.util.Iterator.forEachRemaining(Iterator.java:115) > at > org.apache.flink.util.CollectionUtil.iteratorToList(CollectionUtil.java:114) > at > org.apache.flink.table.planner.runtime.stream.FsStreamingSinkITCaseBase.check(FsStreamingSinkITCaseBase.scala:133) > at > org.apache.flink.table.planner.runtime.stream.FsStreamingSinkITCaseBase.test(FsStreamingSinkITCaseBase.scala:120) > at > org.apache.flink.table.planner.runtime.stream.FsStreamingSinkITCaseBase.testPart(FsStreamingSinkITCaseBase.scala:84) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20083) OrcFsStreamingSinkITCase times out
[ https://issues.apache.org/jira/browse/FLINK-20083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-20083: --- Issue Type: Bug (was: Improvement) > OrcFsStreamingSinkITCase times out > -- > > Key: FLINK-20083 > URL: https://issues.apache.org/jira/browse/FLINK-20083 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem, Formats (JSON, Avro, Parquet, > ORC, SequenceFile) >Affects Versions: 1.12.0 >Reporter: Robert Metzger >Priority: Major > Labels: test-stability > > https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8661=logs=66592496-52df-56bb-d03e-37509e1d9d0f=ae0269db-6796-5583-2e5f-d84757d711aa > {code} > [ERROR] > OrcFsStreamingSinkITCase>FsStreamingSinkITCaseBase.testPart:84->FsStreamingSinkITCaseBase.test:120->FsStreamingSinkITCaseBase.check:133 > » TestTimedOut > [ERROR] Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 26.871 s <<< FAILURE! - in org.apache.flink.orc.OrcFsStreamingSinkITCase > [ERROR] testPart(org.apache.flink.orc.OrcFsStreamingSinkITCase) Time > elapsed: 20.052 s <<< ERROR! 
> org.junit.runners.model.TestTimedOutException: test timed out after 20 seconds > at java.lang.Thread.sleep(Native Method) > at > org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.sleepBeforeRetry(CollectResultFetcher.java:231) > at > org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.next(CollectResultFetcher.java:119) > at > org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:103) > at > org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:77) > at > org.apache.flink.table.planner.sinks.SelectTableSinkBase$RowIteratorWrapper.hasNext(SelectTableSinkBase.java:115) > at > org.apache.flink.table.api.internal.TableResultImpl$CloseableRowIteratorWrapper.hasNext(TableResultImpl.java:355) > at java.util.Iterator.forEachRemaining(Iterator.java:115) > at > org.apache.flink.util.CollectionUtil.iteratorToList(CollectionUtil.java:114) > at > org.apache.flink.table.planner.runtime.stream.FsStreamingSinkITCaseBase.check(FsStreamingSinkITCaseBase.scala:133) > at > org.apache.flink.table.planner.runtime.stream.FsStreamingSinkITCaseBase.test(FsStreamingSinkITCaseBase.scala:120) > at > org.apache.flink.table.planner.runtime.stream.FsStreamingSinkITCaseBase.testPart(FsStreamingSinkITCaseBase.scala:84) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] AHeise commented on a change in pull request #14024: [FLINK-20079][task] Fix StreamTask initialization order
AHeise commented on a change in pull request #14024: URL: https://github.com/apache/flink/pull/14024#discussion_r521147841 ## File path: flink-tests/src/test/java/org/apache/flink/test/checkpointing/UnalignedCheckpointITCase.java ## @@ -531,8 +534,10 @@ public void invoke(Long value, Context context) throws Exception { state.numOutput++; if (state.completedCheckpoints < minCheckpoints) { - // induce heavy backpressure until enough checkpoints have been written - Thread.sleep(0, 100_000); + // induce backpressure until enough checkpoints have been written + if (random.nextInt(1000) == 42) { Review comment: Why do we need to reduce the backpressure here? I'm worried that any solution on top of it will only work with a certain minimal flow. Note that in https://github.com/apache/flink/pull/13845/commits/ef4f78d9b7bedb10a8c7be3e062032d2e7a0c77c#diff-c0d448d637f04cc203de3592d7fab0ba2412f2d75c59dcf6e44cc4a3c18e5851R568 I added a modification to the test that avoids backpressuring during recovery (by applying backpressure only after the first successful checkpoint). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
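For intuition on the throttling in the diff above: the condition `random.nextInt(1000) == 42` fires on roughly 0.1% of records, so only about 1 in 1000 invocations would sleep, rather than every one. A small self-contained sketch (the class and method names here are ours, not the test's):

```java
import java.util.Random;

public class ThrottleSketch {
    // Counts how often the diff's condition `random.nextInt(1000) == 42` fires,
    // i.e. how many records would trigger a sleep. Expected rate: ~1 in 1000.
    static int countSleeps(int records, long seed) {
        Random random = new Random(seed);
        int sleeps = 0;
        for (int i = 0; i < records; i++) {
            if (random.nextInt(1000) == 42) {
                sleeps++; // the real test would call Thread.sleep(...) here
            }
        }
        return sleeps;
    }
}
```

This makes the backpressure much lighter than the original unconditional `Thread.sleep(0, 100_000)` per record, which is exactly what the reviewer is questioning.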
[jira] [Commented] (FLINK-20046) StreamTableAggregateTests.test_map_view_iterate is instable
[ https://issues.apache.org/jira/browse/FLINK-20046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229786#comment-17229786 ] Robert Metzger commented on FLINK-20046: https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8661=logs=584fa981-f71a-5840-1c49-f800c954fe4b=532bf1f8-8c75-59c3-eaad-8c773769bc3a > StreamTableAggregateTests.test_map_view_iterate is instable > --- > > Key: FLINK-20046 > URL: https://issues.apache.org/jira/browse/FLINK-20046 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.12.0 >Reporter: Dian Fu >Assignee: Wei Zhong >Priority: Critical > Labels: test-stability > Fix For: 1.12.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=9279=logs=821b528f-1eed-5598-a3b4-7f748b13f261=4fad9527-b9a5-5015-1b70-8356e5c91490 > {code} > 2020-11-07T22:50:57.4180758Z ___ > StreamTableAggregateTests.test_map_view_iterate > 2020-11-07T22:50:57.4181301Z > 2020-11-07T22:50:57.4181965Z self = > testMethod=test_map_view_iterate> > 2020-11-07T22:50:57.4182348Z > 2020-11-07T22:50:57.4182535Z def test_map_view_iterate(self): > 2020-11-07T22:50:57.4182812Z test_iterate = > udaf(TestIterateAggregateFunction()) > 2020-11-07T22:50:57.4183320Z > self.t_env.get_config().set_idle_state_retention(datetime.timedelta(days=1)) > 2020-11-07T22:50:57.4183763Z > self.t_env.get_config().get_configuration().set_string( > 2020-11-07T22:50:57.4297555Z "python.fn-execution.bundle.size", > "2") > 2020-11-07T22:50:57.4297922Z # trigger the cache eviction in a bundle. 
> 2020-11-07T22:50:57.4308028Z > self.t_env.get_config().get_configuration().set_string( > 2020-11-07T22:50:57.4308653Z "python.state.cache-size", "2") > 2020-11-07T22:50:57.4308945Z > self.t_env.get_config().get_configuration().set_string( > 2020-11-07T22:50:57.4309382Z "python.map-state.read-cache-size", > "2") > 2020-11-07T22:50:57.4309676Z > self.t_env.get_config().get_configuration().set_string( > 2020-11-07T22:50:57.4310428Z "python.map-state.write-cache-size", > "2") > 2020-11-07T22:50:57.4310701Z > self.t_env.get_config().get_configuration().set_string( > 2020-11-07T22:50:57.4311130Z > "python.map-state.iterate-response-batch-size", "2") > 2020-11-07T22:50:57.4311361Z t = self.t_env.from_elements( > 2020-11-07T22:50:57.4311691Z [(1, 'Hi_', 'hi'), > 2020-11-07T22:50:57.4312004Z (1, 'Hi', 'hi'), > 2020-11-07T22:50:57.4312316Z (2, 'hello', 'hello'), > 2020-11-07T22:50:57.4312639Z (3, 'Hi_', 'hi'), > 2020-11-07T22:50:57.4312975Z (3, 'Hi', 'hi'), > 2020-11-07T22:50:57.4313285Z (4, 'hello', 'hello'), > 2020-11-07T22:50:57.4313609Z (5, 'Hi2_', 'hi'), > 2020-11-07T22:50:57.4313908Z (5, 'Hi2', 'hi'), > 2020-11-07T22:50:57.4314238Z (6, 'hello2', 'hello'), > 2020-11-07T22:50:57.4314558Z (7, 'Hi', 'hi'), > 2020-11-07T22:50:57.4315053Z (8, 'hello', 'hello'), > 2020-11-07T22:50:57.4315396Z (9, 'Hi2', 'hi'), > 2020-11-07T22:50:57.4315773Z (13, 'Hi3', 'hi')], ['a', 'b', 'c']) > 2020-11-07T22:50:57.4316023Z > self.t_env.create_temporary_view("source", t) > 2020-11-07T22:50:57.4316299Z table_with_retract_message = > self.t_env.sql_query( > 2020-11-07T22:50:57.4316615Z "select LAST_VALUE(b) as b, > LAST_VALUE(c) as c from source group by a") > 2020-11-07T22:50:57.4316919Z result = > table_with_retract_message.group_by(t.c) \ > 2020-11-07T22:50:57.4317197Z > .select(test_iterate(t.b).alias("a"), t.c) \ > 2020-11-07T22:50:57.4317619Z .select(col("a").get(0).alias("a"), > 2020-11-07T22:50:57.4318111Z col("a").get(1).alias("b"), > 2020-11-07T22:50:57.4318357Z 
col("a").get(2).alias("c"), > 2020-11-07T22:50:57.4318586Z col("a").get(3).alias("d"), > 2020-11-07T22:50:57.4318814Z t.c.alias("e")) > 2020-11-07T22:50:57.4319023Z assert_frame_equal( > 2020-11-07T22:50:57.4319208Z > result.to_pandas(), > 2020-11-07T22:50:57.4319408Z pd.DataFrame([ > 2020-11-07T22:50:57.4319872Z ["hello,hello2", "1,3", > 'hello:3,hello2:1', 2, "hello"], > 2020-11-07T22:50:57.4320398Z ["Hi,Hi2,Hi3", "1,2,3", > "Hi:3,Hi2:2,Hi3:1", 3, "hi"]], > 2020-11-07T22:50:57.4321047Z columns=['a', 'b', 'c', 'd', > 'e'])) > 2020-11-07T22:50:57.4321198Z > 2020-11-07T22:50:57.4321385Z pyflink/table/tests/test_aggregate.py:468: >
[jira] [Commented] (FLINK-20004) UpperLimitExceptionParameter description is misleading
[ https://issues.apache.org/jira/browse/FLINK-20004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229784#comment-17229784 ] Nicholas Jiang commented on FLINK-20004: [~f.pompermaier], you could refer to MessageQueryParameter. In MessageQueryParameter, maxExceptions is a comma-separated value. cc [~chesnay] > UpperLimitExceptionParameter description is misleading > -- > > Key: FLINK-20004 > URL: https://issues.apache.org/jira/browse/FLINK-20004 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.11.2 >Reporter: Flavio Pompermaier >Priority: Trivial > > The maxExceptions query parameter of the /jobs/:jobid/exceptions REST API is an > integer parameter, not a list of comma separated values; this is probably a > cut-and-paste error
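The distinction being debated is how the raw query-parameter string is interpreted. A minimal, hypothetical sketch (invented names, not Flink's actual MessageQueryParameter implementation) of a comma-separated parameter versus a plain integer one:

```java
import java.util.Arrays;
import java.util.List;

public class QueryParamSketch {

    // A comma-separated parameter: "3,5,7" is parsed into three separate values.
    static List<String> parseCommaSeparated(String raw) {
        return Arrays.asList(raw.split(","));
    }

    // An integer parameter such as maxExceptions: one value, no list semantics.
    static int parseInteger(String raw) {
        return Integer.parseInt(raw);
    }

    public static void main(String[] args) {
        System.out.println(parseCommaSeparated("3,5,7")); // three values
        System.out.println(parseInteger("20"));           // one upper limit
    }
}
```

The documentation bug is that the endpoint describes its integer-style parameter with the wording of the comma-separated kind.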
[jira] [Commented] (FLINK-19300) Timer loss after restoring from savepoint
[ https://issues.apache.org/jira/browse/FLINK-19300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229782#comment-17229782 ] david weinstein commented on FLINK-19300: - We're using Flink version 1.8. > Timer loss after restoring from savepoint > - > > Key: FLINK-19300 > URL: https://issues.apache.org/jira/browse/FLINK-19300 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: Xiang Gao >Priority: Critical > > While using heap-based timers, we are seeing occasional timer loss after > restoring program from savepoint, especially when using a remote savepoint > storage (s3). > After some investigation, the issue seems to be related to [this line in > deserialization|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/io/PostVersionedIOReadableWritable.java#L65]. > When trying to check the VERSIONED_IDENTIFIER, the input stream may not > guarantee filling the byte array, causing timers to be dropped for the > affected key group. > Should keep reading until expected number of bytes are actually read or if > end of the stream has been reached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
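The fix the issue body proposes, keep reading until the expected number of bytes has arrived or the stream ends, is the standard read-fully loop. A stand-alone sketch (hypothetical helper, not the actual Flink patch), with a wrapper stream simulating the short reads a remote store such as S3 can produce:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ReadFullyDemo {

    // A single InputStream.read(byte[], int, int) call may legally return fewer
    // bytes than requested; relying on one call to fill the buffer is the bug
    // described above. This loop keeps reading until `len` bytes have arrived
    // or the stream ends, and returns the number of bytes actually read.
    static int readFully(InputStream in, byte[] buf, int off, int len) throws IOException {
        int total = 0;
        while (total < len) {
            int n = in.read(buf, off + total, len - total);
            if (n == -1) {
                break; // end of stream before len bytes
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "VERSIONED_IDENTIFIER".getBytes(StandardCharsets.UTF_8);
        // Simulate a remote stream that delivers at most 3 bytes per read call.
        InputStream trickle = new ByteArrayInputStream(data) {
            @Override
            public int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 3));
            }
        };
        byte[] buf = new byte[data.length];
        int n = readFully(trickle, buf, 0, buf.length);
        System.out.println(n + " bytes: " + new String(buf, StandardCharsets.UTF_8));
    }
}
```

With a naive single `read`, the identifier check would see a partially filled buffer, mismatch, and silently drop the timers for that key group.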
[jira] [Commented] (FLINK-19300) Timer loss after restoring from savepoint
[ https://issues.apache.org/jira/browse/FLINK-19300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229780#comment-17229780 ] Robert Metzger commented on FLINK-19300: Thanks a lot for reporting this. Which Flink version are you using? > Timer loss after restoring from savepoint > - > > Key: FLINK-19300 > URL: https://issues.apache.org/jira/browse/FLINK-19300 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: Xiang Gao >Priority: Critical > > While using heap-based timers, we are seeing occasional timer loss after > restoring program from savepoint, especially when using a remote savepoint > storage (s3). > After some investigation, the issue seems to be related to [this line in > deserialization|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/io/PostVersionedIOReadableWritable.java#L65]. > When trying to check the VERSIONED_IDENTIFIER, the input stream may not > guarantee filling the byte array, causing timers to be dropped for the > affected key group. > Should keep reading until expected number of bytes are actually read or if > end of the stream has been reached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-19964) Gelly ITCase stuck on Azure in HITSITCase.testPrintWithRMatGraph
[ https://issues.apache.org/jira/browse/FLINK-19964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229777#comment-17229777 ] Arvid Heise commented on FLINK-19964: - Yes sorry about that, but I realized that exactly that question was unanswered ;). Merged the fix into master as 18ffebb3dbecc21d2d33c436628176c4971cebbd. Backport not applicable. > Gelly ITCase stuck on Azure in HITSITCase.testPrintWithRMatGraph > > > Key: FLINK-19964 > URL: https://issues.apache.org/jira/browse/FLINK-19964 > Project: Flink > Issue Type: Bug > Components: Library / Graph Processing (Gelly), Runtime / Network, > Tests >Affects Versions: 1.12.0 >Reporter: Chesnay Schepler >Assignee: Arvid Heise >Priority: Blocker > Labels: pull-request-available, test-stability > Fix For: 1.12.0 > > > The HITSITCase has gotten stuck on Azure. Chances are that something in the > scheduling or network has broken it. > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=8919=logs=c5f0071e-1851-543e-9a45-9ac140befc32=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (FLINK-19964) Gelly ITCase stuck on Azure in HITSITCase.testPrintWithRMatGraph
[ https://issues.apache.org/jira/browse/FLINK-19964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-19964. - Resolution: Fixed > Gelly ITCase stuck on Azure in HITSITCase.testPrintWithRMatGraph > > > Key: FLINK-19964 > URL: https://issues.apache.org/jira/browse/FLINK-19964 > Project: Flink > Issue Type: Bug > Components: Library / Graph Processing (Gelly), Runtime / Network, > Tests >Affects Versions: 1.12.0 >Reporter: Chesnay Schepler >Assignee: Arvid Heise >Priority: Blocker > Labels: pull-request-available, test-stability > Fix For: 1.12.0 > > > The HITSITCase has gotten stuck on Azure. Chances are that something in the > scheduling or network has broken it. > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=8919=logs=c5f0071e-1851-543e-9a45-9ac140befc32=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] AHeise merged pull request #14023: [FLINK-19964][network] Avoid proactively pulling segments in destroyed LocalBufferPool.
AHeise merged pull request #14023: URL: https://github.com/apache/flink/pull/14023
[GitHub] [flink] flinkbot edited a comment on pull request #14027: [FLINK-20074][table-planner-blink] Fix can't generate plan when joining on changelog source without updates
flinkbot edited a comment on pull request #14027: URL: https://github.com/apache/flink/pull/14027#issuecomment-725222729 ## CI report: * 6d29e26c756f853e68d2930a2fc9cdbf41174f2f Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9442) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14028: [FLINK-20020][client] Make UnsuccessfulExecutionException part of the JobClient.getJobExecutionResult() contract
flinkbot edited a comment on pull request #14028: URL: https://github.com/apache/flink/pull/14028#issuecomment-725222918 ## CI report: * 0e105679613efd2a1ccc80b678445bbcc23042cb Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9443)
[jira] [Created] (FLINK-20082) Cannot start Flink application due to "cannot assign instance of java.lang.invoke.SerializedLambda to type scala.Function1
ChangZhuo Chen (陳昌倬) created FLINK-20082: Summary: Cannot start Flink application due to "cannot assign instance of java.lang.invoke.SerializedLambda to type scala.Function1 Key: FLINK-20082 URL: https://issues.apache.org/jira/browse/FLINK-20082 Project: Flink Issue Type: Bug Components: API / DataStream, API / Scala Affects Versions: 1.11.2 Environment: * Flink 1.11.2 * Java 11 * Scala 2.12.11 Reporter: ChangZhuo Chen (陳昌倬) Attachments: image-20201110-060934.png Hi, * Our Flink application (Java 11 + Scala 2.12) has the following problem when executing it. It cannot be run in a Flink cluster. * The problem is similar to https://issues.apache.org/jira/browse/SPARK-25047, so maybe the same fix should be implemented in Flink? !image-20201110-060934.png!
[GitHub] [flink] curcur closed pull request #13677: Single task scheduler
curcur closed pull request #13677: URL: https://github.com/apache/flink/pull/13677
[jira] [Commented] (FLINK-19044) Web UI reports incorrect timeline for the FINISHED state of a subtask
[ https://issues.apache.org/jira/browse/FLINK-19044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229766#comment-17229766 ] Robert Metzger commented on FLINK-19044: Is this a problem of the REST API, or is the UI mixing up the data? > Web UI reports incorrect timeline for the FINISHED state of a subtask > - > > Key: FLINK-19044 > URL: https://issues.apache.org/jira/browse/FLINK-19044 > Project: Flink > Issue Type: Bug > Components: Runtime / Web Frontend >Affects Versions: 1.12.0 >Reporter: Caizhi Weng >Priority: Major > Fix For: 1.12.0 > > Attachments: timeline.jpg > > > The timeline for the FINISHED state of a subtask is incorrect. The starting > time of the FINISHED state is larger than its ending time. See the image > below: > !timeline.jpg! > I discover this bug when running the TPCDS benchmark, but I think it can be > reproduced by arbitrary tasks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] becketqin commented on pull request #14022: [FLINK-20068] Enhance the topic creation guarantee to ensure all the …
becketqin commented on pull request #14022: URL: https://github.com/apache/flink/pull/14022#issuecomment-725224684 @PatrickRen @Sxnan Thanks for helping verify the fix. I'll merge the PR now.
[GitHub] [flink] flinkbot commented on pull request #14028: [FLINK-20020][client] Make UnsuccessfulExecutionException part of the JobClient.getJobExecutionResult() contract
flinkbot commented on pull request #14028: URL: https://github.com/apache/flink/pull/14028#issuecomment-725222918 ## CI report: * 0e105679613efd2a1ccc80b678445bbcc23042cb UNKNOWN
[jira] [Commented] (FLINK-20068) KafkaSubscriberTest.testTopicPatternSubscriber failed with unexpected results
[ https://issues.apache.org/jira/browse/FLINK-20068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229763#comment-17229763 ] Jiangjie Qin commented on FLINK-20068: -- This is a legacy issue caused by asynchronous topic creation in Kafka. The problem is that we have 3 Kafka brokers. When a topic is created, the {{CreateTopicsRequest}} will be sent to one of the brokers and block waiting until that broker receives a metadata cache update. However, this does not guarantee that the other two brokers have also received the metadata cache update. When a subsequent TopicMetadataRequest goes to the other two brokers, it is still possible that the topic being created is not returned in the topic metadata. The patch fixes this by letting the {{KafkaTestEnvironmentImpl}} look into each of the brokers and wait until all the brokers have got the newly created topic in their metadata cache before it returns from the {{createTestTopic()}} call. This should help fix a few other intermittent test failures. > KafkaSubscriberTest.testTopicPatternSubscriber failed with unexpected results > - > > Key: FLINK-20068 > URL: https://issues.apache.org/jira/browse/FLINK-20068 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.12.0 >Reporter: Dian Fu >Priority: Blocker > Labels: pull-request-available, test-stability > Fix For: 1.12.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=9365=logs=c5f0071e-1851-543e-9a45-9ac140befc32=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5 > {code} > 2020-11-10T00:14:22.7658242Z [ERROR] > testTopicPatternSubscriber(org.apache.flink.connector.kafka.source.enumerator.subscriber.KafkaSubscriberTest) > Time elapsed: 0.012 s <<< FAILURE!
> 2020-11-10T00:14:22.7659838Z java.lang.AssertionError: > expected:<[pattern-topic-5, pattern-topic-4, pattern-topic-7, > pattern-topic-6, pattern-topic-9, pattern-topic-8, pattern-topic-1, > pattern-topic-0, pattern-topic-3]> but was:<[]> > 2020-11-10T00:14:22.7660740Z at org.junit.Assert.fail(Assert.java:88) > 2020-11-10T00:14:22.7661245Z at > org.junit.Assert.failNotEquals(Assert.java:834) > 2020-11-10T00:14:22.7661788Z at > org.junit.Assert.assertEquals(Assert.java:118) > 2020-11-10T00:14:22.7662312Z at > org.junit.Assert.assertEquals(Assert.java:144) > 2020-11-10T00:14:22.7663051Z at > org.apache.flink.connector.kafka.source.enumerator.subscriber.KafkaSubscriberTest.testTopicPatternSubscriber(KafkaSubscriberTest.java:94) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
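The waiting strategy described in the comment above generalizes to a simple pattern: poll every node until a predicate holds on each of them, bounded by a shared deadline. A stand-alone sketch of that pattern (generic names invented for illustration; the real fix lives in `KafkaTestEnvironmentImpl` and checks each broker's metadata cache for the new topic):

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class WaitOnAllNodes {

    // Returns true once `check` passes on every node, polling each one in turn;
    // returns false if the shared deadline expires first.
    static <T> boolean waitOnAll(List<T> nodes, Predicate<T> check, long timeoutMs, long pollMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        for (T node : nodes) {
            while (!check.test(node)) {
                if (System.currentTimeMillis() >= deadline) {
                    return false;
                }
                Thread.sleep(pollMs);
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated brokers: each one "sees" the topic only on the 3rd poll,
        // mimicking the delayed metadata-cache propagation described above.
        int[][] brokers = {{0}, {0}, {0}};
        boolean ok = waitOnAll(
                Arrays.asList(brokers),
                b -> ++b[0] >= 3,
                5_000, 1);
        System.out.println("all brokers see the topic: " + ok);
    }
}
```

Waiting on only one broker (what `CreateTopicsRequest` effectively did) leaves a window where a metadata request routed to another broker still misses the topic, which is the flakiness the test hit.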
[GitHub] [flink] flinkbot commented on pull request #14027: [FLINK-20074][table-planner-blink] Fix can't generate plan when joining on changelog source without updates
flinkbot commented on pull request #14027: URL: https://github.com/apache/flink/pull/14027#issuecomment-725222729 ## CI report: * 6d29e26c756f853e68d2930a2fc9cdbf41174f2f UNKNOWN
[jira] [Updated] (FLINK-20074) ChangelogSourceITCase.testRegularJoin fail
[ https://issues.apache.org/jira/browse/FLINK-20074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-20074: --- Labels: pull-request-available test-stability (was: test-stability) > ChangelogSourceITCase.testRegularJoin fail > -- > > Key: FLINK-20074 > URL: https://issues.apache.org/jira/browse/FLINK-20074 > Project: Flink > Issue Type: Bug >Affects Versions: 1.12.0 >Reporter: Jingsong Lee >Assignee: Jark Wu >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.12.0 > > > INSTANCE: > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=9379=logs=ae4f8708-9994-57d3-c2d7-b892156e7812=e25d5e7e-2a9c-5589-4940-0b638d75a414 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (FLINK-13733) FlinkKafkaInternalProducerITCase.testHappyPath fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-13733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiangjie Qin resolved FLINK-13733. -- Fix Version/s: 1.11.3 Resolution: Fixed > FlinkKafkaInternalProducerITCase.testHappyPath fails on Travis > -- > > Key: FLINK-13733 > URL: https://issues.apache.org/jira/browse/FLINK-13733 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka, Tests >Affects Versions: 1.9.0, 1.10.0, 1.11.0, 1.12.0 >Reporter: Till Rohrmann >Assignee: Jiangjie Qin >Priority: Critical > Labels: pull-request-available, test-stability > Fix For: 1.12.0, 1.11.3 > > Attachments: 20200421.13.tar.gz > > > The {{FlinkKafkaInternalProducerITCase.testHappyPath}} fails on Travis with > {code} > Test > testHappyPath(org.apache.flink.streaming.connectors.kafka.FlinkKafkaInternalProducerITCase) > failed with: > java.util.NoSuchElementException > at > org.apache.kafka.common.utils.AbstractIterator.next(AbstractIterator.java:52) > at > org.apache.flink.shaded.guava18.com.google.common.collect.Iterators.getOnlyElement(Iterators.java:302) > at > org.apache.flink.shaded.guava18.com.google.common.collect.Iterables.getOnlyElement(Iterables.java:289) > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaInternalProducerITCase.assertRecord(FlinkKafkaInternalProducerITCase.java:169) > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaInternalProducerITCase.testHappyPath(FlinkKafkaInternalProducerITCase.java:70) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code} > https://api.travis-ci.org/v3/job/571870358/log.txt -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on pull request #14028: [FLINK-20020][client] Make UnsuccessfulExecutionException part of the JobClient.getJobExecutionResult() contract
flinkbot commented on pull request #14028: URL: https://github.com/apache/flink/pull/14028#issuecomment-725219422 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 0e105679613efd2a1ccc80b678445bbcc23042cb (Wed Nov 11 06:01:14 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-13733) FlinkKafkaInternalProducerITCase.testHappyPath fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-13733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229761#comment-17229761 ] Jiangjie Qin commented on FLINK-13733: -- The patch has been merged to master. cb2d137adc9ea1e46a67513c1e0f2469bb05bff4 I am closing the ticket. Hopefully we don't see the instability again. > FlinkKafkaInternalProducerITCase.testHappyPath fails on Travis > -- > > Key: FLINK-13733 > URL: https://issues.apache.org/jira/browse/FLINK-13733 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka, Tests >Affects Versions: 1.9.0, 1.10.0, 1.11.0, 1.12.0 >Reporter: Till Rohrmann >Assignee: Jiangjie Qin >Priority: Critical > Labels: pull-request-available, test-stability > Fix For: 1.12.0 > > Attachments: 20200421.13.tar.gz > > > The {{FlinkKafkaInternalProducerITCase.testHappyPath}} fails on Travis with > {code} > Test > testHappyPath(org.apache.flink.streaming.connectors.kafka.FlinkKafkaInternalProducerITCase) > failed with: > java.util.NoSuchElementException > at > org.apache.kafka.common.utils.AbstractIterator.next(AbstractIterator.java:52) > at > org.apache.flink.shaded.guava18.com.google.common.collect.Iterators.getOnlyElement(Iterators.java:302) > at > org.apache.flink.shaded.guava18.com.google.common.collect.Iterables.getOnlyElement(Iterables.java:289) > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaInternalProducerITCase.assertRecord(FlinkKafkaInternalProducerITCase.java:169) > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaInternalProducerITCase.testHappyPath(FlinkKafkaInternalProducerITCase.java:70) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at >
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code} > https://api.travis-ci.org/v3/job/571870358/log.txt -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20020) Make UnsuccessfulExecutionException part of the JobClient.getJobExecutionResult() contract.
[ https://issues.apache.org/jira/browse/FLINK-20020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-20020: --- Labels: pull-request-available (was: ) > Make UnsuccessfulExecutionException part of the > JobClient.getJobExecutionResult() contract. > --- > > Key: FLINK-20020 > URL: https://issues.apache.org/jira/browse/FLINK-20020 > Project: Flink > Issue Type: Improvement > Components: Client / Job Submission >Affects Versions: 1.12.0 >Reporter: Kostas Kloudas >Assignee: Nicholas Jiang >Priority: Major > Labels: pull-request-available > > Currently, different implementations of the {{JobClient}} throw different > exceptions. The {{ClusterClientJobClientAdapter}} wraps the exception from > the {{JobResult.toJobExecutionResult()}} into a > {{ProgramInvocationException}}, the {{MiniClusterJobClient}} simply wraps it > in a {{CompletionException}} and the {{EmbeddedJobClient}} wraps it into an > {{UnsuccessfulExecutionException}}. > With this issue I would like to propose making the exception uniform and part > of the contract and as a candidate I would propose the behaviour of the > {{EmbeddedJobClient}} which throws an {{UnsuccessfulExecutionException}}. The > reason is that this exception also includes the status of the application. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] SteNicholas opened a new pull request #14028: [FLINK-20020][client] Make UnsuccessfulExecutionException part of the JobClient.getJobExecutionResult() contract
SteNicholas opened a new pull request #14028: URL: https://github.com/apache/flink/pull/14028 ## What is the purpose of the change *Currently, different implementations of the `JobClient` throw different exceptions. The `ClusterClientJobClientAdapter` wraps the exception from the `JobResult.toJobExecutionResult()` into a `ProgramInvocationException`, the `MiniClusterJobClient` simply wraps it in a `CompletionException` and the `EmbeddedJobClient` wraps it into an `UnsuccessfulExecutionException`. `JobResult.toJobExecutionResult()` should make the exception uniform and part of the contract through `JobExecutionResultException` which exception includes the status of the application.* ## Brief change log - *Add `JobExecutionResultException` to make the exception uniform and part of the contract for `JobResult.toJobExecutionResult()`.* - *`ClusterClientJobClientAdapter`, `MiniClusterJobClient` and `EmbeddedJobClient` uniform the exception to `JobExecutionResultException`.* ## Verifying this change - *Modify `ApplicationDispatcherBootstrapTest` and `PerJobMiniClusterFactoryTest.java` to update the `UnsuccessfulExecutionException` check to the `JobExecutionResultException`.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? 
(**not applicable** / docs / JavaDocs / not documented) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #14027: [FLINK-20074][table-planner-blink] Fix can't generate plan when joining on changelog source without updates
flinkbot commented on pull request #14027: URL: https://github.com/apache/flink/pull/14027#issuecomment-725218189 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 6d29e26c756f853e68d2930a2fc9cdbf41174f2f (Wed Nov 11 05:57:45 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] becketqin closed pull request #14021: [FLINK-13733][connector/kafka] Make FlinkKafkaInternalProducerITCase more robust
becketqin closed pull request #14021: URL: https://github.com/apache/flink/pull/14021 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] becketqin commented on pull request #14021: [FLINK-13733][connector/kafka] Make FlinkKafkaInternalProducerITCase more robust
becketqin commented on pull request #14021: URL: https://github.com/apache/flink/pull/14021#issuecomment-725218181 Merged to master. cb2d137adc9ea1e46a67513c1e0f2469bb05bff4 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wuchong opened a new pull request #14027: [FLINK-20074][table-planner-blink] Fix can't generate plan when joini…
wuchong opened a new pull request #14027: URL: https://github.com/apache/flink/pull/14027 …ng on changelog source without updates ## What is the purpose of the change This PR fixes the exception thrown when getting the plan of a JOIN on a changelog source that doesn't contain updates. ``` [ERROR] testRegularJoin[Source=NO_UPDATE, StateBackend=MiniBatch=OFF](org.apache.flink.table.planner.runtime.stream.sql.ChangelogSourceITCase) Time elapsed: 0.047 s <<< ERROR! 2020-11-10T06:39:13.0856712Z org.apache.flink.table.api.TableException: 2020-11-10T06:39:13.0857161Z UpdateKindTrait [ONLY_UPDATE_AFTER] conflicts with ModifyKindSetTrait [I,D]. This is a bug in planner, please file an issue. 2020-11-10T06:39:13.0858003Z Current node is Join(joinType=[InnerJoin], where=[(currency = currency0)], select=[amount, currency, currency0, rate], leftInputSpec=[NoUniqueKey], rightInputSpec=[NoUniqueKey]). ``` ## Brief change log - Fix `FlinkChangelogModeInferenceProgram`. ## Verifying this change - Add `testJoinOnNoUpdateSource` test which can reproduce this problem. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] becketqin commented on pull request #14021: [FLINK-13733][connector/kafka] Make FlinkKafkaInternalProducerITCase more robust
becketqin commented on pull request #14021: URL: https://github.com/apache/flink/pull/14021#issuecomment-725216164 @PatrickRen Thanks a lot for the thorough tests. The patch LGTM. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-20081) ExecutorNotifier should run handler in the main thread when receive an exception from the callable.
[ https://issues.apache.org/jira/browse/FLINK-20081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiangjie Qin updated FLINK-20081: - Fix Version/s: 1.12.0 > ExecutorNotifier should run handler in the main thread when receive an > exception from the callable. > --- > > Key: FLINK-20081 > URL: https://issues.apache.org/jira/browse/FLINK-20081 > Project: Flink > Issue Type: Bug > Components: Connectors / Common >Reporter: Jiangjie Qin >Priority: Major > Fix For: 1.12.0 > > > Currently the {{ExecutorNotifier}} runs the {{handler}} in the worker thread > if there is an exception thrown from the {{callable}}. This breaks the > threading model and prevents an exception from bubbling up to fail the job. > Another issue is that right now, when an exception bubbles up from the > {{SourceCoordinator}}, the UncaughtExceptionHandler will call > System.exit(-17) and kill the JM. This is too much. Instead, we should just > fail the job to trigger a failover. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20081) ExecutorNotifier should run handler in the main thread when receive an exception from the callable.
[ https://issues.apache.org/jira/browse/FLINK-20081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiangjie Qin updated FLINK-20081: - Affects Version/s: 1.11.2 > ExecutorNotifier should run handler in the main thread when receive an > exception from the callable. > --- > > Key: FLINK-20081 > URL: https://issues.apache.org/jira/browse/FLINK-20081 > Project: Flink > Issue Type: Bug > Components: Connectors / Common >Affects Versions: 1.11.2 >Reporter: Jiangjie Qin >Priority: Major > Fix For: 1.12.0 > > > Currently the {{ExecutorNotifier}} runs the {{handler}} in the worker thread > if there is an exception thrown from the {{callable}}. This breaks the > threading model and prevents an exception from bubbling up to fail the job. > Another issue is that right now, when an exception bubbles up from the > {{SourceCoordinator}}, the UncaughtExceptionHandler will call > System.exit(-17) and kill the JM. This is too much. Instead, we should just > fail the job to trigger a failover. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20081) ExecutorNotifier should run handler in the main thread when receive an exception from the callable.
[ https://issues.apache.org/jira/browse/FLINK-20081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiangjie Qin updated FLINK-20081: - Fix Version/s: 1.11.3 > ExecutorNotifier should run handler in the main thread when receive an > exception from the callable. > --- > > Key: FLINK-20081 > URL: https://issues.apache.org/jira/browse/FLINK-20081 > Project: Flink > Issue Type: Bug > Components: Connectors / Common >Affects Versions: 1.11.2 >Reporter: Jiangjie Qin >Priority: Major > Fix For: 1.12.0, 1.11.3 > > > Currently the {{ExecutorNotifier}} runs the {{handler}} in the worker thread > if there is an exception thrown from the {{callable}}. This breaks the > threading model and prevents an exception from bubbling up to fail the job. > Another issue is that right now, when an exception bubbles up from the > {{SourceCoordinator}}, the UncaughtExceptionHandler will call > System.exit(-17) and kill the JM. This is too much. Instead, we should just > fail the job to trigger a failover. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20081) ExecutorNotifier should run handler in the main thread when receive an exception from the callable.
Jiangjie Qin created FLINK-20081: Summary: ExecutorNotifier should run handler in the main thread when receive an exception from the callable. Key: FLINK-20081 URL: https://issues.apache.org/jira/browse/FLINK-20081 Project: Flink Issue Type: Bug Components: Connectors / Common Reporter: Jiangjie Qin Currently the {{ExecutorNotifier}} runs the {{handler}} in the worker thread if there is an exception thrown from the {{callable}}. This breaks the threading model and prevents an exception from bubbling up to fail the job. Another issue is that right now, when an exception bubbles up from the {{SourceCoordinator}}, the UncaughtExceptionHandler will call System.exit(-17) and kill the JM. This is too much. Instead, we should just fail the job to trigger a failover. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-20081) ExecutorNotifier should run handler in the main thread when receive an exception from the callable.
[ https://issues.apache.org/jira/browse/FLINK-20081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiangjie Qin reassigned FLINK-20081: Assignee: Jiangjie Qin > ExecutorNotifier should run handler in the main thread when receive an > exception from the callable. > --- > > Key: FLINK-20081 > URL: https://issues.apache.org/jira/browse/FLINK-20081 > Project: Flink > Issue Type: Bug > Components: Connectors / Common >Affects Versions: 1.11.2 >Reporter: Jiangjie Qin >Assignee: Jiangjie Qin >Priority: Major > Fix For: 1.12.0, 1.11.3 > > > Currently the {{ExecutorNotifier}} runs the {{handler}} in the worker thread > if there is an exception thrown from the {{callable}}. This breaks the > threading model and prevents an exception from bubbling up to fail the job. > Another issue is that right now, when an exception bubbles up from the > {{SourceCoordinator}}, the UncaughtExceptionHandler will call > System.exit(-17) and kill the JM. This is too much. Instead, we should just > fail the job to trigger a failover. -- This message was sent by Atlassian Jira (v8.3.4#803005)
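The threading fix FLINK-20081 describes can be sketched with plain JDK executors. This is a hypothetical minimal model, not the actual `ExecutorNotifier` API: the worker executor runs the callable, but both the result and any thrown exception are always handed back to the single-threaded "main" executor, so a failure surfaces on the coordinator thread (where it can fail the job) instead of dying on the worker thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.BiConsumer;

// Hypothetical sketch of the pattern described in FLINK-20081.
class Notifier {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final ExecutorService main = Executors.newSingleThreadExecutor();

    <T> void notifyReadyAsync(Callable<T> callable, BiConsumer<T, Throwable> handler) {
        worker.execute(() -> {
            try {
                T result = callable.call();
                main.execute(() -> handler.accept(result, null));
            } catch (Throwable t) {
                // Run the handler in the main thread, not the worker thread,
                // so the exception can bubble up through the coordinator.
                main.execute(() -> handler.accept(null, t));
            }
        });
    }
}
```

With this shape, an exception from the callable reaches the handler on the main executor rather than being swallowed by the worker's uncaught-exception handling.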
[jira] [Commented] (FLINK-20065) UnalignedCheckpointCompatibilityITCase.test failed with AskTimeoutException
[ https://issues.apache.org/jira/browse/FLINK-20065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229747#comment-17229747 ] Dian Fu commented on FLINK-20065: - https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=9431=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=45cc9205-bdb7-5b54-63cd-89fdc0983323 > UnalignedCheckpointCompatibilityITCase.test failed with AskTimeoutException > --- > > Key: FLINK-20065 > URL: https://issues.apache.org/jira/browse/FLINK-20065 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.12.0, 1.11.3 >Reporter: Dian Fu >Priority: Blocker > Labels: test-stability > Fix For: 1.12.0, 1.11.3 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=9362=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=45cc9205-bdb7-5b54-63cd-89fdc0983323 > {code} > 2020-11-09T22:19:47.2714024Z [ERROR] test[type: SAVEPOINT, startAligned: > true](org.apache.flink.test.checkpointing.UnalignedCheckpointCompatibilityITCase) > Time elapsed: 1.293 s <<< ERROR! > 2020-11-09T22:19:47.2715260Z java.util.concurrent.ExecutionException: > java.util.concurrent.TimeoutException: Invocation of public default > java.util.concurrent.CompletableFuture > org.apache.flink.runtime.webmonitor.RestfulGateway.stopWithSavepoint(org.apache.flink.api.common.JobID,java.lang.String,boolean,org.apache.flink.api.common.time.Time) > timed out. 
> 2020-11-09T22:19:47.2716743Z at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > 2020-11-09T22:19:47.2718213Z at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) > 2020-11-09T22:19:47.2719166Z at > org.apache.flink.test.checkpointing.UnalignedCheckpointCompatibilityITCase.runAndTakeSavepoint(UnalignedCheckpointCompatibilityITCase.java:113) > 2020-11-09T22:19:47.2720278Z at > org.apache.flink.test.checkpointing.UnalignedCheckpointCompatibilityITCase.test(UnalignedCheckpointCompatibilityITCase.java:97) > 2020-11-09T22:19:47.2721126Z at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2020-11-09T22:19:47.2721771Z at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2020-11-09T22:19:47.2722773Z at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2020-11-09T22:19:47.2723479Z at > java.lang.reflect.Method.invoke(Method.java:498) > 2020-11-09T22:19:47.2724187Z at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > 2020-11-09T22:19:47.2725026Z at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2020-11-09T22:19:47.2725817Z at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > 2020-11-09T22:19:47.2726595Z at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2020-11-09T22:19:47.2727515Z at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > 2020-11-09T22:19:47.2728192Z at > org.junit.rules.RunRules.evaluate(RunRules.java:20) > 2020-11-09T22:19:47.2744089Z at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > 2020-11-09T22:19:47.2744907Z at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > 2020-11-09T22:19:47.2745573Z at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > 
2020-11-09T22:19:47.2746037Z at > org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > 2020-11-09T22:19:47.2746445Z at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > 2020-11-09T22:19:47.2746868Z at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > 2020-11-09T22:19:47.2747443Z at > org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > 2020-11-09T22:19:47.2747876Z at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > 2020-11-09T22:19:47.2748297Z at > org.junit.runners.ParentRunner.run(ParentRunner.java:363) > 2020-11-09T22:19:47.2748694Z at > org.junit.runners.Suite.runChild(Suite.java:128) > 2020-11-09T22:19:47.2749054Z at > org.junit.runners.Suite.runChild(Suite.java:27) > 2020-11-09T22:19:47.2749414Z at > org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > 2020-11-09T22:19:47.2749819Z at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > 2020-11-09T22:19:47.2750373Z at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > 2020-11-09T22:19:47.2750923Z at > org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > 2020-11-09T22:19:47.2751555Z at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) >
[jira] [Commented] (FLINK-17096) Minibatch Group Agg support state ttl
[ https://issues.apache.org/jira/browse/FLINK-17096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229745#comment-17229745 ] Dian Fu commented on FLINK-17096: - [~jark] Thanks a lot~ > Minibatch Group Agg support state ttl > - > > Key: FLINK-17096 > URL: https://issues.apache.org/jira/browse/FLINK-17096 > Project: Flink > Issue Type: New Feature > Components: Table SQL / Runtime >Affects Versions: 1.9.0, 1.10.0, 1.11.0 >Reporter: dalongliu >Assignee: dalongliu >Priority: Critical > Labels: pull-request-available > Fix For: 1.12.0 > > > At the moment, MiniBatch Group Agg (including Local/Global) doesn't support State > TTL; for streaming jobs this will lead to OOM over long runs, so we need > to make state data expire after the TTL. The solution is to use the incremental > cleanup feature, see FLINK-16581 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-19630) Sink data in [ORC] format into Hive By using Legacy Table API caused unexpected Exception
[ https://issues.apache.org/jira/browse/FLINK-19630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229744#comment-17229744 ] Dian Fu commented on FLINK-19630: - [~neighborhood] Thanks a lot for the feedback. [~lzljs3620320] Agree with you ~. Adding some documentation is a good idea. > Sink data in [ORC] format into Hive By using Legacy Table API caused > unexpected Exception > -- > > Key: FLINK-19630 > URL: https://issues.apache.org/jira/browse/FLINK-19630 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive, Table SQL / Ecosystem >Affects Versions: 1.11.2 >Reporter: Lsw_aka_laplace >Priority: Critical > Fix For: 1.12.0 > > Attachments: image-2020-10-14-11-36-48-086.png, > image-2020-10-14-11-41-53-379.png, image-2020-10-14-11-42-57-353.png, > image-2020-10-14-11-48-51-310.png > > > *ENV:* > *Flink version 1.11.2* > *Hive exec version: 2.0.1* > *Hive file storing type :ORC* > *SQL or Datastream: SQL API* > *Kafka Connector : custom Kafka connector which is based on Legacy API > (TableSource/`org.apache.flink.types.Row`)* > *Hive Connector : totally follows the Flink-Hive-connector (we only made some > encapsulation upon it)* > *Using StreamingFileCommitter:YES* > > > *Description:* > try to execute the following SQL: > """ > insert into hive_table (select * from kafka_table) > """ > HIVE Table SQL seems like: > """ > CREATE TABLE `hive_table`( > // some fields > PARTITIONED BY ( > `dt` string, > `hour` string) > STORED AS orc > TBLPROPERTIES ( > 'orc.compress'='SNAPPY', > 'type'='HIVE', > 'sink.partition-commit.trigger'='process-time', > 'sink.partition-commit.delay' = '1 h', > 'sink.partition-commit.policy.kind' = 'metastore,success-file', > ) > """ > When this job starts to process snapshot, here comes the weird exception: > !image-2020-10-14-11-36-48-086.png|width=882,height=395! 
> As we can see from the message: the owner thread should be the [Legacy Source > Thread], but actually the streamTaskThread which represents the whole first > stage is found. > So I checked the Thread dump at once. > !image-2020-10-14-11-41-53-379.png|width=801,height=244! > The > legacy Source Thread > > !image-2020-10-14-11-42-57-353.png|width=846,height=226! > The StreamTask > Thread > > According to the thread dump info and the Exception Message, I searched > and read certain source code and then *{color:#ffab00}DID A TEST{color}* > > {color:#172b4d} Since the Kafka connector is customized, I tried to make the > KafkaSource a separate stage by changing the TaskChainStrategy to Never. The > task topology is as follows:{color} > {color:#172b4d}!image-2020-10-14-11-48-51-310.png|width=753,height=208!{color} > > {color:#505f79}*Fortunately, it did work! No Exception is thrown and > Checkpoint could be snapshot successfully!*{color} > > > So, from my perspective, there must be something wrong when HiveWritingTask > and LegacySourceTask are chained together. The Legacy source task is a separate > thread, which may be the cause of the exception mentioned above. > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
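The "owner thread" failure in the report above boils down to lock-ownership semantics: a lock acquired on the legacy source thread cannot be released or validated from the chained StreamTask thread. The following is a generic JDK illustration of that rule (it is not Flink's actual checkpoint-lock code; the thread names are only labels for the scenario):

```java
import java.util.concurrent.locks.ReentrantLock;

public class OwnerCheck {
    public static void main(String[] args) throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        // The "Legacy Source Thread" acquires the lock and exits while still holding it.
        Thread sourceThread = new Thread(lock::lock, "Legacy Source Thread");
        sourceThread.start();
        sourceThread.join();
        // The current ("StreamTask") thread is not the owner, so releasing fails.
        try {
            lock.unlock();
        } catch (IllegalMonitorStateException e) {
            System.out.println("unlock refused: current thread is not the owner");
        }
    }
}
```

Breaking the chain (as the reporter did by forcing a separate stage) keeps lock acquisition and release on the same thread, which is why the test succeeded.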
[jira] [Assigned] (FLINK-20074) ChangelogSourceITCase.testRegularJoin fail
[ https://issues.apache.org/jira/browse/FLINK-20074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-20074: --- Assignee: Jark Wu > ChangelogSourceITCase.testRegularJoin fail > -- > > Key: FLINK-20074 > URL: https://issues.apache.org/jira/browse/FLINK-20074 > Project: Flink > Issue Type: Bug >Affects Versions: 1.12.0 >Reporter: Jingsong Lee >Assignee: Jark Wu >Priority: Major > Labels: test-stability > Fix For: 1.12.0 > > > INSTANCE: > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=9379=logs=ae4f8708-9994-57d3-c2d7-b892156e7812=e25d5e7e-2a9c-5589-4940-0b638d75a414 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-18139) Unaligned checkpoints checks wrong channels for inflight data.
[ https://issues.apache.org/jira/browse/FLINK-18139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229743#comment-17229743 ] Dian Fu commented on FLINK-18139: - [~AHeise] Could we close this ticket as the PR is already merged? > Unaligned checkpoints checks wrong channels for inflight data. > -- > > Key: FLINK-18139 > URL: https://issues.apache.org/jira/browse/FLINK-18139 > Project: Flink > Issue Type: Improvement > Components: Runtime / Checkpointing >Affects Versions: 1.11.0, 1.12.0 >Reporter: Arvid Heise >Assignee: Arvid Heise >Priority: Major > Labels: pull-request-available > > CheckpointBarrierUnaligner#hasInflightData is not called with input gate > contextual information, such that only the same first few channels are > checked during initial snapshotting of inflight data. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-18456) CompressionFactoryITCase.testWriteCompressedFile fails with "expected:<1> but was:<2>"
[ https://issues.apache.org/jira/browse/FLINK-18456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu closed FLINK-18456. --- Resolution: Cannot Reproduce Closing this ticket for now as it has not occurred for almost half a year. > CompressionFactoryITCase.testWriteCompressedFile fails with "expected:<1> but > was:<2>" > -- > > Key: FLINK-18456 > URL: https://issues.apache.org/jira/browse/FLINK-18456 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem, Tests >Affects Versions: 1.12.0 >Reporter: Dian Fu >Priority: Major > Labels: test-stability > > Instance on the master: > [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=4123=logs=fc5181b0-e452-5c8f-68de-1097947f6483=62110053-334f-5295-a0ab-80dd7e2babbf] > {code:java} > 2020-06-30T07:59:56.3844746Z [INFO] Running > org.apache.flink.formats.compress.CompressionFactoryITCase > 2020-06-30T08:00:00.3083101Z [ERROR] Tests run: 1, Failures: 1, Errors: 0, > Skipped: 0, Time elapsed: 3.921 s <<< FAILURE! - in > org.apache.flink.formats.compress.CompressionFactoryITCase > 2020-06-30T08:00:00.3084778Z [ERROR] > testWriteCompressedFile(org.apache.flink.formats.compress.CompressionFactoryITCase) > Time elapsed: 1.191 s <<< FAILURE! 
> 2020-06-30T08:00:00.3085932Z java.lang.AssertionError: expected:<1> but > was:<2> > 2020-06-30T08:00:00.3086694Z at org.junit.Assert.fail(Assert.java:88) > 2020-06-30T08:00:00.3087435Z at > org.junit.Assert.failNotEquals(Assert.java:834) > 2020-06-30T08:00:00.3088250Z at > org.junit.Assert.assertEquals(Assert.java:645) > 2020-06-30T08:00:00.3089022Z at > org.junit.Assert.assertEquals(Assert.java:631) > 2020-06-30T08:00:00.3090188Z at > org.apache.flink.formats.compress.CompressionFactoryITCase.validateResults(CompressionFactoryITCase.java:106) > 2020-06-30T08:00:00.3091575Z at > org.apache.flink.formats.compress.CompressionFactoryITCase.testWriteCompressedFile(CompressionFactoryITCase.java:90) > 2020-06-30T08:00:00.3092751Z at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2020-06-30T08:00:00.3093687Z at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2020-06-30T08:00:00.3094801Z at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2020-06-30T08:00:00.3095784Z at > java.lang.reflect.Method.invoke(Method.java:498) > 2020-06-30T08:00:00.3096737Z at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > 2020-06-30T08:00:00.3097863Z at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2020-06-30T08:00:00.3099212Z at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > 2020-06-30T08:00:00.3100380Z at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2020-06-30T08:00:00.3101557Z at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > 2020-06-30T08:00:00.3102957Z at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > 2020-06-30T08:00:00.3104026Z at > java.util.concurrent.FutureTask.run(FutureTask.java:266) > 2020-06-30T08:00:00.3104859Z at 
java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-18698) org.apache.flink.sql.parser.utils.ParserResource compile error
[ https://issues.apache.org/jira/browse/FLINK-18698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu closed FLINK-18698. --- Resolution: Invalid Closing this ticket as it does not seem to be a bug. > org.apache.flink.sql.parser.utils.ParserResource compile error > -- > > Key: FLINK-18698 > URL: https://issues.apache.org/jira/browse/FLINK-18698 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.12.0 >Reporter: 毛宗良 >Priority: Major > Attachments: image-2020-07-24-11-42-09-880.png > > > org.apache.flink.sql.parser.utils.ParserResource in flink-sql-parser imports > org.apache.flink.sql.parser.impl.ParseException, which could not be found. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-17096) Minibatch Group Agg support state ttl
[ https://issues.apache.org/jira/browse/FLINK-17096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229738#comment-17229738 ] Jark Wu commented on FLINK-17096: - Yes [~dian.fu]. I will review and merge this pull request in the next few days. > Minibatch Group Agg support state ttl > - > > Key: FLINK-17096 > URL: https://issues.apache.org/jira/browse/FLINK-17096 > Project: Flink > Issue Type: New Feature > Components: Table SQL / Runtime >Affects Versions: 1.9.0, 1.10.0, 1.11.0 >Reporter: dalongliu >Assignee: dalongliu >Priority: Critical > Labels: pull-request-available > Fix For: 1.12.0 > > > At the moment, MiniBatch Group Agg (including Local/Global) doesn't support State > TTL; for streaming jobs this will lead to OOM over long runs, so we need > to make state data expire after the TTL. The solution is to use the incremental > cleanup feature, see FLINK-16581 -- This message was sent by Atlassian Jira (v8.3.4#803005)
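The incremental cleanup feature referenced above (FLINK-16581) is exposed through Flink's `StateTtlConfig`. A minimal configuration sketch, assuming the Flink 1.9+ state TTL API (the descriptor name is hypothetical):

```java
import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.time.Time;

public class TtlExample {
    public static ValueStateDescriptor<Long> ttlDescriptor() {
        StateTtlConfig ttlConfig = StateTtlConfig
                .newBuilder(Time.hours(1))       // state expires 1 hour after last update
                .cleanupIncrementally(10, false) // lazily check 10 entries per state access
                .build();
        ValueStateDescriptor<Long> descriptor =
                new ValueStateDescriptor<>("minibatch-agg", Long.class); // hypothetical name
        descriptor.enableTimeToLive(ttlConfig);
        return descriptor;
    }
}
```

With incremental cleanup enabled, expired entries are removed gradually as state is accessed, rather than accumulating until the job runs out of memory.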
[jira] [Updated] (FLINK-19044) Web UI reports incorrect timeline for the FINISHED state of a subtask
[ https://issues.apache.org/jira/browse/FLINK-19044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-19044: Fix Version/s: 1.12.0 > Web UI reports incorrect timeline for the FINISHED state of a subtask > - > > Key: FLINK-19044 > URL: https://issues.apache.org/jira/browse/FLINK-19044 > Project: Flink > Issue Type: Bug > Components: Runtime / Web Frontend >Affects Versions: 1.12.0 >Reporter: Caizhi Weng >Priority: Major > Fix For: 1.12.0 > > Attachments: timeline.jpg > > > The timeline for the FINISHED state of a subtask is incorrect. The starting > time of the FINISHED state is larger than its ending time. See the image > below: > !timeline.jpg! > I discover this bug when running the TPCDS benchmark, but I think it can be > reproduced by arbitrary tasks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-19630) Sink data in [ORC] format into Hive By using Legacy Table API caused unexpected Exception
[ https://issues.apache.org/jira/browse/FLINK-19630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229736#comment-17229736 ] Jingsong Lee commented on FLINK-19630: -- [~lirui] I think we can document this. > Sink data in [ORC] format into Hive By using Legacy Table API caused > unexpected Exception > -- > > Key: FLINK-19630 > URL: https://issues.apache.org/jira/browse/FLINK-19630 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive, Table SQL / Ecosystem >Affects Versions: 1.11.2 >Reporter: Lsw_aka_laplace >Priority: Critical > Fix For: 1.12.0 > > Attachments: image-2020-10-14-11-36-48-086.png, > image-2020-10-14-11-41-53-379.png, image-2020-10-14-11-42-57-353.png, > image-2020-10-14-11-48-51-310.png > > > *ENV:* > *Flink version 1.11.2* > *Hive exec version: 2.0.1* > *Hive file storing type :ORC* > *SQL or Datastream: SQL API* > *Kafka Connector : custom Kafka connector which is based on Legacy API > (TableSource/`org.apache.flink.types.Row`)* > *Hive Connector : totally follows the Flink-Hive-connector (we only made some > encapsulation upon it)* > *Using StreamingFileCommitter:YES* > > > *Description:* > try to execute the following SQL: > """ > insert into hive_table (select * from kafka_table) > """ > HIVE Table SQL seems like: > """ > CREATE TABLE `hive_table`( > // some fields > PARTITIONED BY ( > `dt` string, > `hour` string) > STORED AS orc > TBLPROPERTIES ( > 'orc.compress'='SNAPPY', > 'type'='HIVE', > 'sink.partition-commit.trigger'='process-time', > 'sink.partition-commit.delay' = '1 h', > 'sink.partition-commit.policy.kind' = 'metastore,success-file', > ) > """ > When this job starts to process snapshot, here comes the weird exception: > !image-2020-10-14-11-36-48-086.png|width=882,height=395! > As we can see from the message:Owner thread shall be the [Legacy Source > Thread], but actually the streamTaskThread which represents the whole first > stage is found. > So I checked the Thread dump at once. 
> !image-2020-10-14-11-41-53-379.png|width=801,height=244! > The > legacy Source Thread > > !image-2020-10-14-11-42-57-353.png|width=846,height=226! > The StreamTask > Thread > > According to the thread dump info and the Exception Message, I searched > and read certain source code and then *{color:#ffab00}DID A TEST{color}* > > {color:#172b4d} Since the Kafka connector is customized, I tried to make the > KafkaSource a separate stage by changing the TaskChainStrategy to Never. The > task topology is as follows:{color} > {color:#172b4d}!image-2020-10-14-11-48-51-310.png|width=753,height=208!{color} > > {color:#505f79}*Fortunately, it did work! No Exception is thrown and > Checkpoint could be snapshot successfully!*{color} > > > So, from my perspective, there must be something wrong when HiveWritingTask > and LegacySourceTask are chained together. The Legacy source task is a separate > thread, which may be the cause of the exception mentioned above. > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19253) SourceReaderTestBase.testAddSplitToExistingFetcher hangs
[ https://issues.apache.org/jira/browse/FLINK-19253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dian Fu updated FLINK-19253:
----------------------------
    Fix Version/s: 1.12.0

> SourceReaderTestBase.testAddSplitToExistingFetcher hangs
> ---------------------------------------------------------
>
>                 Key: FLINK-19253
>                 URL: https://issues.apache.org/jira/browse/FLINK-19253
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Common
>    Affects Versions: 1.12.0
>            Reporter: Dian Fu
>            Assignee: Xuannan Su
>            Priority: Major
>              Labels: pull-request-available, test-stability
>             Fix For: 1.12.0
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=6521=logs=fc5181b0-e452-5c8f-68de-1097947f6483=62110053-334f-5295-a0ab-80dd7e2babbf
> {code}
> 2020-09-15T10:51:35.5236837Z "SourceFetcher" #39 prio=5 os_prio=0 tid=0x7f70d0a57000 nid=0x858 in Object.wait() [0x7f6fd81f]
> 2020-09-15T10:51:35.5237447Z    java.lang.Thread.State: WAITING (on object monitor)
> 2020-09-15T10:51:35.5237962Z         at java.lang.Object.wait(Native Method)
> 2020-09-15T10:51:35.5238886Z         - waiting on <0xc27f5be8> (a java.util.ArrayDeque)
> 2020-09-15T10:51:35.5239380Z         at java.lang.Object.wait(Object.java:502)
> 2020-09-15T10:51:35.5240401Z         at org.apache.flink.connector.base.source.reader.mocks.TestingSplitReader.fetch(TestingSplitReader.java:52)
> 2020-09-15T10:51:35.5241471Z         - locked <0xc27f5be8> (a java.util.ArrayDeque)
> 2020-09-15T10:51:35.5242180Z         at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:58)
> 2020-09-15T10:51:35.5243245Z         at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:128)
> 2020-09-15T10:51:35.5244263Z         at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:95)
> 2020-09-15T10:51:35.5245128Z         at org.apache.flink.util.ThrowableCatchingRunnable.run(ThrowableCatchingRunnable.java:42)
> 2020-09-15T10:51:35.5245973Z         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 2020-09-15T10:51:35.5247081Z         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 2020-09-15T10:51:35.5247816Z         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 2020-09-15T10:51:35.5248809Z         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 2020-09-15T10:51:35.5249463Z         at java.lang.Thread.run(Thread.java:748)
> 2020-09-15T10:51:35.5249827Z
> 2020-09-15T10:51:35.5250383Z "SourceFetcher" #37 prio=5 os_prio=0 tid=0x7f70d0a4b000 nid=0x856 in Object.wait() [0x7f6f80cfa000]
> 2020-09-15T10:51:35.5251124Z    java.lang.Thread.State: WAITING (on object monitor)
> 2020-09-15T10:51:35.5251636Z         at java.lang.Object.wait(Native Method)
> 2020-09-15T10:51:35.5252767Z         - waiting on <0xc298d0b8> (a java.util.ArrayDeque)
> 2020-09-15T10:51:35.5253336Z         at java.lang.Object.wait(Object.java:502)
> 2020-09-15T10:51:35.5254184Z         at org.apache.flink.connector.base.source.reader.mocks.TestingSplitReader.fetch(TestingSplitReader.java:52)
> 2020-09-15T10:51:35.5255220Z         - locked <0xc298d0b8> (a java.util.ArrayDeque)
> 2020-09-15T10:51:35.5255678Z         at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:58)
> 2020-09-15T10:51:35.5256235Z         at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:128)
> 2020-09-15T10:51:35.5256803Z         at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:95)
> 2020-09-15T10:51:35.5257351Z         at org.apache.flink.util.ThrowableCatchingRunnable.run(ThrowableCatchingRunnable.java:42)
> 2020-09-15T10:51:35.5257838Z         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 2020-09-15T10:51:35.5258284Z         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 2020-09-15T10:51:35.5258856Z         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 2020-09-15T10:51:35.5259350Z         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 2020-09-15T10:51:35.5260011Z         at java.lang.Thread.run(Thread.java:748)
> 2020-09-15T10:51:35.5260211Z
> 2020-09-15T10:51:35.5260574Z "process reaper" #24 daemon prio=10 os_prio=0 tid=0x7f6f70042000 nid=0x844 waiting on condition [0x7f6fd832a000]
> 2020-09-15T10:51:35.5261036Z    java.lang.Thread.State: TIMED_WAITING (parking)
> 2020-09-15T10:51:35.5261342Z         at sun.misc.Unsafe.park(Native Method)
> 2020-09-15T10:51:35.5261972Z         - parking to wait for <0x815d0810> (a java.util.concurrent.SynchronousQueue$TransferStack)
> 2020-09-15T10:51:35.5262456Z         at
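For context on the dump above: both SourceFetcher threads are parked in Object.wait() inside TestingSplitReader.fetch, a classic wait/notify handoff on a shared queue that is never signalled. The same pattern can be reproduced in a few lines of Python (a toy illustration, not Flink code), here written with a timeout and a recheck so it cannot hang the way the test did:

```python
import threading
from collections import deque

queue = deque()                 # plays the role of the ArrayDeque the fetchers wait on
cond = threading.Condition()
fetched = []

def fetch(timeout=5.0):
    """Block until a split is queued, like TestingSplitReader.fetch waiting on its deque."""
    with cond:
        if not queue:
            # Without a timeout (and without anyone calling notify), this wait is
            # exactly where the hanging test threads sit forever.
            cond.wait(timeout=timeout)
        if queue:
            fetched.append(queue.popleft())

def add_split(split):
    """Enqueue a split and wake the fetcher; forgetting the notify reproduces the hang."""
    with cond:
        queue.append(split)
        cond.notify()

fetcher = threading.Thread(target=fetch)
fetcher.start()
add_split("split-0")
fetcher.join()
# fetched is now ["split-0"]
```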
[jira] [Updated] (FLINK-19982) AggregateReduceGroupingITCase.testSingleAggOnTable_SortAgg fails with "RuntimeException: Job restarted"
[ https://issues.apache.org/jira/browse/FLINK-19982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dian Fu updated FLINK-19982:
----------------------------
    Fix Version/s: 1.12.0

> AggregateReduceGroupingITCase.testSingleAggOnTable_SortAgg fails with "RuntimeException: Job restarted"
> --------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-19982
>                 URL: https://issues.apache.org/jira/browse/FLINK-19982
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Runtime
>    Affects Versions: 1.12.0
>            Reporter: Robert Metzger
>            Priority: Major
>              Labels: test-stability
>             Fix For: 1.12.0
>
> https://dev.azure.com/georgeryan1322/Flink/_build/results?buildId=336=logs=a1590513-d0ea-59c3-3c7b-aad756c48f25=5129dea2-618b-5c74-1b8f-9ec63a37a8a6
> {code}
> [ERROR] Tests run: 16, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 59.688 s <<< FAILURE! - in org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase
> [ERROR] testSingleAggOnTable_SortAgg(org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase)  Time elapsed: 2.789 s  <<< ERROR!
> java.lang.RuntimeException: Job restarted
>         at org.apache.flink.streaming.api.operators.collect.UncheckpointedCollectResultBuffer.sinkRestarted(UncheckpointedCollectResultBuffer.java:41)
>         at org.apache.flink.streaming.api.operators.collect.AbstractCollectResultBuffer.dealWithResponse(AbstractCollectResultBuffer.java:87)
>         at org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.next(CollectResultFetcher.java:127)
>         at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:103)
>         at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:77)
>         at org.apache.flink.table.planner.sinks.SelectTableSinkBase$RowIteratorWrapper.hasNext(SelectTableSinkBase.java:115)
>         at org.apache.flink.table.api.internal.TableResultImpl$CloseableRowIteratorWrapper.hasNext(TableResultImpl.java:355)
>         at java.util.Iterator.forEachRemaining(Iterator.java:115)
>         at org.apache.flink.util.CollectionUtil.iteratorToList(CollectionUtil.java:114)
>         at org.apache.flink.table.planner.runtime.utils.BatchTestBase.executeQuery(BatchTestBase.scala:298)
>         at org.apache.flink.table.planner.runtime.utils.BatchTestBase.check(BatchTestBase.scala:138)
>         at org.apache.flink.table.planner.runtime.utils.BatchTestBase.checkResult(BatchTestBase.scala:104)
>         at org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase.testSingleAggOnTable(AggregateReduceGroupingITCase.scala:153)
>         at org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase.testSingleAggOnTable_SortAgg(AggregateReduceGroupingITCase.scala:122)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> {code}
> In the logs, I find occurrences of this:
> {code}
> 16:37:49,262 [main] WARN  org.apache.flink.streaming.api.operators.collect.CollectResultFetcher [] - An exception occurs when fetching query results
> java.util.concurrent.ExecutionException: org.apache.flink.runtime.dispatcher.UnavailableDispatcherOperationException: Unable to get JobMasterGateway for initializing job. The requested operation is not available while the JobManager is initializing.
>         at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) ~[?:1.8.0_242]
>         at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) ~[?:1.8.0_242]
>         at org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.sendRequest(CollectResultFetcher.java:163) ~[flink-streaming-java_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
>         at org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.next(CollectResultFetcher.java:134) [flink-streaming-java_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
>         at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:103) [flink-streaming-java_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
>         at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:77) [flink-streaming-java_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
>         at org.apache.flink.table.planner.sinks.SelectTableSinkBase$RowIteratorWrapper.hasNext(SelectTableSinkBase.java:115) [classes/:?]
>         at org.apache.flink.table.api.internal.TableResultImpl$CloseableRowIteratorWrapper.hasNext(TableResultImpl.java:355) [flink-table-api-java-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
>         at
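For readers unfamiliar with the "Job restarted" error in the trace above: it is raised by UncheckpointedCollectResultBuffer.sinkRestarted on the client side, because without checkpointing the results sent before a restart cannot be replayed, so the collect buffer fails fast when the sink comes back under a new version. A stripped-down Python model of that check (class and method names are hypothetical simplifications, not the actual Flink implementation):

```python
class UncheckpointedBuffer:
    """Toy client-side result buffer: remembers the sink's version and fails
    fast on a version change, since uncheckpointed results cannot be reproduced."""

    def __init__(self):
        self.version = None
        self.rows = []

    def deal_with_response(self, version, rows):
        if self.version is not None and version != self.version:
            # mirrors the sinkRestarted() fail-fast path in the stack trace above
            raise RuntimeError("Job restarted")
        self.version = version
        self.rows.extend(rows)


buf = UncheckpointedBuffer()
buf.deal_with_response("v1", [1, 2])   # normal responses accumulate rows
try:
    buf.deal_with_response("v2", [3])  # sink restarted: new version detected
    restarted = False
except RuntimeError:
    restarted = True
```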
[jira] [Updated] (FLINK-19989) Add collect operation in Python DataStream API
[ https://issues.apache.org/jira/browse/FLINK-19989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dian Fu updated FLINK-19989:
----------------------------
    Fix Version/s: 1.13.0

> Add collect operation in Python DataStream API
> ----------------------------------------------
>
>                 Key: FLINK-19989
>                 URL: https://issues.apache.org/jira/browse/FLINK-19989
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / Python
>            Reporter: Dian Fu
>            Assignee: Nicholas Jiang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.13.0
>
> DataStream.executeAndCollect() has already been supported in FLINK-19508. We should also support it in the Python DataStream API.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
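For context, the Java executeAndCollect() that this ticket mirrors is backed by a client-side iterator (CollectResultIterator in the stack traces elsewhere in this digest) that polls the running job for result batches until an end-of-stream marker arrives. A simplified, Flink-free Python sketch of that polling loop (all names here are hypothetical stand-ins, not the PyFlink API):

```python
from typing import Iterator, List, Optional


class FakeResultFetcher:
    """Stands in for CollectResultFetcher: hands out result batches,
    then None once the job has produced all its results."""

    def __init__(self, batches: List[List[int]]):
        self._batches = list(batches)

    def next_batch(self) -> Optional[List[int]]:
        return self._batches.pop(0) if self._batches else None


def collect(fetcher: FakeResultFetcher) -> Iterator[int]:
    """Drain the fetcher batch by batch, yielding individual records lazily."""
    while True:
        batch = fetcher.next_batch()
        if batch is None:  # end-of-stream marker from the sink
            return
        yield from batch


results = list(collect(FakeResultFetcher([[1, 2], [3], [4, 5]])))
# results == [1, 2, 3, 4, 5]
```

The point of the iterator design is that results stream back to the client incrementally instead of being materialized on the cluster first.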
[jira] [Updated] (FLINK-20030) Barrier announcement causes outdated RemoteInputChannel#numBuffersOvertaken
[ https://issues.apache.org/jira/browse/FLINK-20030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dian Fu updated FLINK-20030:
----------------------------
    Fix Version/s: 1.12.0

> Barrier announcement causes outdated RemoteInputChannel#numBuffersOvertaken
> ----------------------------------------------------------------------------
>
>                 Key: FLINK-20030
>                 URL: https://issues.apache.org/jira/browse/FLINK-20030
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing, Runtime / Network
>    Affects Versions: 1.12.0
>            Reporter: Arvid Heise
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.12.0
>
> {{numBuffersOvertaken}} is set when the announcement is enqueued, but it can take quite a while until the checkpoint is actually started, with quite a few non-priority buffers being drained in the meantime.
> We should move away from {{numBuffersOvertaken}} and use sequence numbers.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
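The staleness problem described in the ticket can be illustrated without any Flink code: a buffer count snapshotted when the announcement is enqueued goes stale as buffers drain before the checkpoint starts, whereas a count derived from sequence numbers at the moment it is needed stays correct. A toy Python model (an illustration of the idea only; the names and bookkeeping are hypothetical, not Flink's actual channel logic):

```python
class ToyChannel:
    """Toy input channel: buffers get increasing sequence numbers as they
    arrive and are drained in order; a barrier announcement arrives early."""

    def __init__(self):
        self.received = 0            # sequence number of the last buffer received
        self.consumed = 0            # sequence number of the last buffer drained
        self.snapshot_overtaken = None

    def receive(self, n=1):
        self.received += n

    def consume(self, n=1):
        self.consumed += n

    def announce(self):
        # Snapshot approach: the count is frozen when the announcement is
        # enqueued -- it goes stale if buffers drain before the checkpoint starts.
        self.snapshot_overtaken = self.received - self.consumed

    def overtaken_now(self):
        # Sequence-number approach: computed on demand when the checkpoint
        # actually starts, so it reflects the buffers still in the channel.
        return self.received - self.consumed


ch = ToyChannel()
ch.receive(10)          # ten buffers queued in the channel
ch.announce()           # announcement snapshots "10 buffers overtaken"
ch.consume(7)           # seven of them drain before the checkpoint starts

stale = ch.snapshot_overtaken   # 10 -- the outdated value
actual = ch.overtaken_now()     # 3  -- what sequence numbers would yield
```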
[jira] [Commented] (FLINK-19630) Sink data in [ORC] format into Hive By using Legacy Table API caused unexpected Exception
[ https://issues.apache.org/jira/browse/FLINK-19630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229732#comment-17229732 ]

Lsw_aka_laplace commented on FLINK-19630:
-----------------------------------------

[~dian.fu] Hi, actually I have solved this problem on my side. But I'm not sure whether Flink needs to do anything about it, because a local patch on hive-exec may not be a good solution for everyone. I'm +1 for closing this issue, but you'd better ask [~lzljs3620320] and [~lirui] about this. Thanks!

> Sink data in [ORC] format into Hive By using Legacy Table API caused unexpected Exception
> ------------------------------------------------------------------------------------------
>
>                 Key: FLINK-19630
>                 URL: https://issues.apache.org/jira/browse/FLINK-19630
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Hive, Table SQL / Ecosystem
>    Affects Versions: 1.11.2
>            Reporter: Lsw_aka_laplace
>            Priority: Critical
>             Fix For: 1.12.0
>
>         Attachments: image-2020-10-14-11-36-48-086.png, image-2020-10-14-11-41-53-379.png, image-2020-10-14-11-42-57-353.png, image-2020-10-14-11-48-51-310.png
>
> *ENV:*
> *Flink version: 1.11.2*
> *Hive exec version: 2.0.1*
> *Hive file storage type: ORC*
> *SQL or DataStream: SQL API*
> *Kafka connector: custom Kafka connector based on the legacy API (TableSource/`org.apache.flink.types.Row`)*
> *Hive connector: follows the Flink-Hive connector exactly (we only made some encapsulation on top of it)*
> *Using StreamingFileCommitter: YES*
>
> *Description:*
> Try to execute the following SQL:
> """
> insert into hive_table (select * from kafka_table)
> """
> The Hive table DDL looks like:
> """
> CREATE TABLE `hive_table`(
>   // some fields
> PARTITIONED BY (
>   `dt` string,
>   `hour` string)
> STORED AS orc
> TBLPROPERTIES (
>   'orc.compress'='SNAPPY',
>   'type'='HIVE',
>   'sink.partition-commit.trigger'='process-time',
>   'sink.partition-commit.delay' = '1 h',
>   'sink.partition-commit.policy.kind' = 'metastore,success-file',
> )
> """
> When this job starts to take a snapshot, here comes the weird exception:
> !image-2020-10-14-11-36-48-086.png|width=882,height=395!
> As we can see from the message: the owner thread should be the [Legacy Source Thread], but instead the stream task thread, which represents the whole first stage, is found.
> So I checked the thread dump at once.
> !image-2020-10-14-11-41-53-379.png|width=801,height=244!
> The Legacy Source Thread
> !image-2020-10-14-11-42-57-353.png|width=846,height=226!
> The StreamTask Thread
> According to the thread dump info and the exception message, I searched and read the relevant source code and then *{color:#ffab00}DID A TEST{color}*:
> {color:#172b4d}Since the Kafka connector is custom, I tried to make the Kafka source a separate stage by changing its chaining strategy to NEVER. The task topology is as follows:{color}
> {color:#172b4d}!image-2020-10-14-11-48-51-310.png|width=753,height=208!{color}
> {color:#505f79}*Fortunately, it worked! No exception is thrown and the checkpoint snapshot completes successfully!*{color}
> So, from my perspective, something goes wrong when the Hive writing task and the legacy source task are chained together: the legacy source runs in a separate thread, which may be the cause of the exception mentioned above.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
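The failure mode described in this report, a resource that insists on being used from the thread that created it while operator chaining puts creation and use on different threads, can be sketched in a few lines of Python. This is only an illustration of such an owner-thread guard, not Flink or ORC code:

```python
import threading


class OwnerThreadGuard:
    """Toy owner-thread check: the resource may only be used from the
    thread that created it, similar in spirit to the check that fires above."""

    def __init__(self):
        self._owner = threading.current_thread()

    def use(self):
        if threading.current_thread() is not self._owner:
            raise RuntimeError(
                f"owned by {self._owner.name}, "
                f"used from {threading.current_thread().name}"
            )
        return "ok"


guard = OwnerThreadGuard()
same_thread_result = guard.use()   # same thread that created the guard: fine

errors = []

def cross_thread_use():
    # A different thread (cf. the chained source thread vs. the stream task
    # thread in the report) trips the ownership check.
    try:
        guard.use()
    except RuntimeError as e:
        errors.append(type(e).__name__)

t = threading.Thread(target=cross_thread_use)
t.start()
t.join()
```

With chaining disabled, creation and use stay on one thread, which matches the reporter's observation that separating the stages made the exception disappear.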
[jira] [Closed] (FLINK-20066) BatchPandasUDAFITTests.test_group_aggregate_with_aux_group unstable
[ https://issues.apache.org/jira/browse/FLINK-20066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dian Fu closed FLINK-20066.
---------------------------
    Resolution: Fixed

This should have been addressed in FLINK-19842

> BatchPandasUDAFITTests.test_group_aggregate_with_aux_group unstable
> --------------------------------------------------------------------
>
>                 Key: FLINK-20066
>                 URL: https://issues.apache.org/jira/browse/FLINK-20066
>             Project: Flink
>          Issue Type: Bug
>          Components: API / Python
>    Affects Versions: 1.12.0
>            Reporter: Dian Fu
>            Priority: Critical
>              Labels: test-stability
>             Fix For: 1.12.0
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=9361=logs=bdd9ea51-4de2-506a-d4d9-f3930e4d2355=98717c4f-b888-5636-bb1e-db7aca25755e
> {code}
> 2020-11-09T23:41:41.8853547Z === FAILURES ===
> 2020-11-09T23:41:41.8854000Z __ BatchPandasUDAFITTests.test_group_aggregate_with_aux_group __
> 2020-11-09T23:41:41.8854324Z
> 2020-11-09T23:41:41.8854647Z self = testMethod=test_group_aggregate_with_aux_group>
> 2020-11-09T23:41:41.8854956Z
> 2020-11-09T23:41:41.8855205Z     def test_group_aggregate_with_aux_group(self):
> 2020-11-09T23:41:41.8855521Z         t = self.t_env.from_elements(
> 2020-11-09T23:41:41.8858372Z             [(1, 2, 3), (3, 2, 3), (2, 1, 3), (1, 5, 4), (1, 8, 6), (2, 3, 4)],
> 2020-11-09T23:41:41.8858807Z             DataTypes.ROW(
> 2020-11-09T23:41:41.8859091Z                 [DataTypes.FIELD("a", DataTypes.TINYINT()),
> 2020-11-09T23:41:41.8859453Z                  DataTypes.FIELD("b", DataTypes.SMALLINT()),
> 2020-11-09T23:41:41.8859783Z                  DataTypes.FIELD("c", DataTypes.INT())]))
> 2020-11-09T23:41:41.8860012Z
> 2020-11-09T23:41:41.8860255Z         table_sink = source_sink_utils.TestAppendSink(
> 2020-11-09T23:41:41.8863051Z             ['a', 'b', 'c', 'd'],
> 2020-11-09T23:41:41.8863523Z             [DataTypes.TINYINT(), DataTypes.INT(), DataTypes.FLOAT(), DataTypes.INT()])
> 2020-11-09T23:41:41.8864048Z         self.t_env.register_table_sink("Results", table_sink)
> 2020-11-09T23:41:41.8864715Z         self.t_env.get_config().get_configuration().set_string('python.metric.enabled', 'true')
> 2020-11-09T23:41:41.8865161Z         self.t_env.register_function("max_add", udaf(MaxAdd(),
> 2020-11-09T23:41:41.8865573Z                                                      result_type=DataTypes.INT(),
> 2020-11-09T23:41:41.8865999Z                                                      func_type="pandas"))
> 2020-11-09T23:41:41.8866426Z         self.t_env.create_temporary_system_function("mean_udaf", mean_udaf)
> 2020-11-09T23:41:41.8866759Z         t.group_by("a") \
> 2020-11-09T23:41:41.8867052Z             .select("a, a + 1 as b, a + 2 as c") \
> 2020-11-09T23:41:41.8867352Z             .group_by("a, b") \
> 2020-11-09T23:41:41.8867660Z             .select("a, b, mean_udaf(b), max_add(b, c, 1)") \
> 2020-11-09T23:41:41.8868026Z             .execute_insert("Results") \
> 2020-11-09T23:41:41.8868293Z             .wait()
> 2020-11-09T23:41:41.8868446Z
> 2020-11-09T23:41:41.8868704Z pyflink/table/tests/test_pandas_udaf.py:95:
> 2020-11-09T23:41:41.8869077Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> 2020-11-09T23:41:41.8869464Z pyflink/table/table_result.py:76: in wait
> 2020-11-09T23:41:41.8870150Z     get_method(self._j_table_result, "await")()
> 2020-11-09T23:41:41.8870850Z .tox/py37-cython/lib/python3.7/site-packages/py4j/java_gateway.py:1286: in __call__
> 2020-11-09T23:41:41.8871415Z     answer, self.gateway_client, self.target_id, self.name)
> 2020-11-09T23:41:41.8871768Z pyflink/util/exceptions.py:147: in deco
> 2020-11-09T23:41:41.8872032Z     return f(*a, **kw)
> 2020-11-09T23:41:41.8872378Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> 2020-11-09T23:41:41.8872629Z
> 2020-11-09T23:41:41.8872978Z answer = 'xro24238'
> 2020-11-09T23:41:41.8873296Z gateway_client =
> 2020-11-09T23:41:41.8873792Z target_id = 'o24237', name = 'await'
> 2020-11-09T23:41:41.8874097Z
> 2020-11-09T23:41:41.8874433Z     def get_return_value(answer, gateway_client, target_id=None, name=None):
> 2020-11-09T23:41:41.8874893Z         """Converts an answer received from the Java gateway into a Python object.
> 2020-11-09T23:41:41.8875212Z
> 2020-11-09T23:41:41.8875515Z         For example, string representation of integers are converted to Python
> 2020-11-09T23:41:41.8875922Z         integer, string representation of objects are converted to JavaObject
> 2020-11-09T23:41:41.8876255Z         instances, etc.
> 2020-11-09T23:41:41.8876584Z
> 2020-11-09T23:41:41.8876873Z
[GitHub] [flink] flinkbot edited a comment on pull request #14021: [FLINK-13733][connector/kafka] Make FlinkKafkaInternalProducerITCase more robust
flinkbot edited a comment on pull request #14021:
URL: https://github.com/apache/flink/pull/14021#issuecomment-724820662

## CI report:

* 685a334e4b6fa71ee9a95c74aaf3b7ba2557d05e Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9417)
* 6b1819154e14ebc0c9525536fba5757b136080bc Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9439)

Bot commands

The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org